GGUF of checkpoint 4320 of [ausboss/llama2-13b-supercot-loras2](https://huggingface.co/ausboss/llama2-13b-supercot-loras2), merged with the base Llama 2 13B model. It is currently quantised only to Q5_K_M, as this is the smallest size with accuracy comparable to 8-bit (which is almost lossless). I have an fp16 GGUF and will probably quantise it to 8-bit and 4-bit GGUF soon.

Ausboss' original model card, with the LoRA training info, is reproduced below. See his model page for further information.

### Training procedure

The following bitsandbytes quantization config was used during training:

    quant_method: bitsandbytes
    load_in_8bit: False
    load_in_4bit: True
    llm_int8_threshold: 6.0
    llm_int8_skip_modules: None
    llm_int8_enable_fp32_cpu_offload: False
    llm_int8_has_fp16_weight: False
    bnb_4bit_quant_type: nf4
    bnb_4bit_use_double_quant: True
    bnb_4bit_compute_dtype: bfloat16
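
For anyone who wants to reproduce this setup, the settings above map onto the keyword arguments of `BitsAndBytesConfig` in `transformers`. The following is a minimal sketch, not the original training script: the names `BNB_KWARGS` and `build_bnb_config` are made up for illustration, and it assumes `torch`, `transformers`, and `bitsandbytes` are installed when the helper is called.

```python
# Settings copied from the list above, expressed as keyword arguments.
# `quant_method: bitsandbytes` is implied by the config class and is not passed.
BNB_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def build_bnb_config(kwargs=None):
    """Build a BitsAndBytesConfig from the settings above (hypothetical helper).

    Imports are done lazily so the settings dict can be inspected
    without torch or transformers installed.
    """
    import torch
    from transformers import BitsAndBytesConfig

    resolved = dict(BNB_KWARGS if kwargs is None else kwargs)
    # Turn the dtype name into the actual torch dtype object.
    resolved["bnb_4bit_compute_dtype"] = getattr(
        torch, resolved["bnb_4bit_compute_dtype"]
    )
    return BitsAndBytesConfig(**resolved)
```

The returned object would typically be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained` when loading the base model for 4-bit LoRA training.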

### Framework versions

    PEFT 0.6.0.dev0