---
base_model: Qwen/Qwen3-8B
library_name: peft
tags:
- qwen3
- raft
- qlora
- unsloth
- rag
- hr-policy
- legal-qa
- enterprise
- securex
datasets:
- rungalileo/ragbench
language:
- en
license: apache-2.0
---

# SecureX-HR: Qwen3-8B RAFT fine-tuned on TechQA

A QLoRA fine-tune of `Qwen/Qwen3-8B` trained with RAFT ([arXiv:2403.10131](https://arxiv.org/abs/2403.10131)) for the SecureX AI enterprise technical RAG pipeline.

## Training details

- **Method:** QLoRA 4-bit NF4 + RAFT + Unsloth
- **Dataset:** rungalileo/ragbench (cuad subset)
- **Hardware:** Kaggle T4 x2 (Unsloth single GPU mode)
- **LoRA rank:** 16 | **Alpha:** 32
- **Samples:** 475 | **Epochs:** 3
- **Thinking mode:** OFF at inference

## Usage

```python
from unsloth import FastLanguageModel
import torch

# Load the adapter together with its 4-bit quantized base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Niraj-P-Chaudhari/securex-techqa-qwen3-8b-raft",
    max_seq_length = 2048,
    dtype = None,          # let Unsloth pick the dtype for the GPU
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)

msgs = [
    {"role": "system", "content": "You are SecureX-HR..."},
    {"role": "user", "content": "Question: ...\n\nRetrieved Documents:\n..."},
]

# add_generation_prompt=True appends the assistant header so the model
# answers instead of continuing the user turn; thinking mode stays off.
inputs = tokenizer.apply_chat_template(
    msgs,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    enable_thinking=False,
).to("cuda")

# temperature only takes effect when sampling is enabled.
out = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.1)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```

## Citation

```bibtex
@misc{zhang2024raft,
  title={RAFT: Adapting Language Model to Domain Specific RAG},
  author={Tianjun Zhang and Shishir G. Patil and Naman Jain and Sheng Shen and Matei Zaharia and Ion Stoica and Joseph E. Gonzalez},
  year={2024},
  eprint={2403.10131},
  archivePrefix={arXiv}
}
```
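
## Prompt formatting sketch

The usage example passes the question and the retrieved passages in a single user turn shaped like `Question: ...` followed by `Retrieved Documents:`. The helper below is a minimal sketch of how such a turn could be assembled. The function names `build_user_message` and `build_messages` and the `[Document i]` labels are assumptions for illustration; this card does not specify how individual documents were delimited during training, so adjust the layout to match your own retrieval output.

```python
from typing import Dict, List, Sequence


def build_user_message(question: str, documents: Sequence[str]) -> str:
    """Join a question and retrieved passages into one user turn.

    The "[Document i]" labels are a hypothetical convention, not the
    documented training format.
    """
    numbered = "\n\n".join(
        f"[Document {i}]\n{doc.strip()}"
        for i, doc in enumerate(documents, start=1)
    )
    return f"Question: {question.strip()}\n\nRetrieved Documents:\n{numbered}"


def build_messages(
    system_prompt: str, question: str, documents: Sequence[str]
) -> List[Dict[str, str]]:
    """Return a chat message list suitable for tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": build_user_message(question, documents)},
    ]
```

The returned list can be passed as `msgs` to `tokenizer.apply_chat_template` in the usage example, with the system prompt supplied by the caller.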