GGUF of checkpoint 4320 from [ausboss/llama2-13b-supercot-loras2](https://huggingface.co/ausboss/llama2-13b-supercot-loras2) merged with base Llama 2 13B.

It is currently only quantised to Q5_K_M, as this is the smallest size with accuracy comparable to 8-bit (almost lossless). I have an fp16 GGUF and will probably quantise to 8-bit and 4-bit GGUF soon.

Ausboss' original model card, with the LoRA training info, follows. See his model page for further information.

### Training procedure

The following bitsandbytes quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16

### Framework versions

- PEFT 0.6.0.dev0
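
For anyone wanting to reproduce the LoRA training setup, the bitsandbytes settings listed under Training procedure could be expressed roughly as below. This is a minimal sketch, not Ausboss' actual training script: the names `BNB_TRAINING_CONFIG` and `build_bnb_config` are made up for illustration, and it assumes a `transformers` release that provides `BitsAndBytesConfig` plus an installed `torch`. None of this is needed to run the GGUF file itself.

```python
# Training-time quantization settings copied from the list in this card.
# "quant_method" is left out: it is not a constructor argument, the
# config class records it on its own.
BNB_TRAINING_CONFIG = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def build_bnb_config(settings=None):
    """Build a transformers BitsAndBytesConfig from the settings above.

    Hypothetical helper for illustration. torch and transformers are
    imported lazily so the plain dict can be inspected without them.
    """
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(BNB_TRAINING_CONFIG if settings is None else settings)
    # The card lists the compute dtype by name; map it to the torch dtype.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

The resulting object would typically be passed as `quantization_config=build_bnb_config()` when loading the base model with `AutoModelForCausalLM.from_pretrained` before attaching a LoRA adapter.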