SecureX-HR: Qwen3-8B RAFT fine-tuned on TechQA

A QLoRA fine-tune of Qwen/Qwen3-8B, trained with RAFT (Retrieval Augmented Fine-Tuning, arXiv:2403.10131) for the SecureX AI enterprise technical RAG pipeline.

Training details

Method: QLoRA 4-bit NF4 + RAFT + Unsloth
Dataset: rungalileo/ragbench (techqa subset)
Hardware: Kaggle T4 x2 (Unsloth single GPU mode)
LoRA rank: 16 | Alpha: 32
Samples: 475 | Epochs: 3
Thinking mode: OFF at inference
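
This card does not describe how the 475 RAFT samples were assembled. As a rough sketch of the recipe in the RAFT paper, each question is paired with distractor documents, and the oracle (answer-bearing) document is kept in the context only for a fraction of the samples. The function name, the field names, and the values `num_distractors=3` and `p_oracle=0.8` are illustrative assumptions, not the settings used for this model. The paper also trains on chain-of-thought answers that quote the source, which is omitted here.

```python
import random


def build_raft_example(question, oracle_doc, distractor_pool, answer,
                       num_distractors=3, p_oracle=0.8, rng=None):
    """Assemble one RAFT-style training record (illustrative sketch only).

    num_distractors and p_oracle are assumed values, not this model's settings.
    """
    rng = rng or random.Random()

    # Draw distractors without modifying the caller's pool.
    documents = rng.sample(distractor_pool, num_distractors)

    # Keep the oracle document with probability p_oracle; otherwise the
    # context holds distractors only.
    has_oracle = rng.random() < p_oracle
    if has_oracle:
        documents.append(oracle_doc)

    # Shuffle so the oracle's position carries no signal.
    rng.shuffle(documents)

    return {
        "question": question,
        "documents": documents,
        "answer": answer,
        "has_oracle": has_oracle,
    }
```

Passing a seeded `random.Random` as `rng` makes the sampling reproducible across runs.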

Usage

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name   = "Niraj-P-Chaudhari/securex-techqa-qwen3-8b-raft",
    max_seq_length = 2048,
    dtype          = None,
    load_in_4bit   = True,
)
FastLanguageModel.for_inference(model)

msgs = [
    {"role": "system", "content": "You are SecureX-HR..."},
    {"role": "user",   "content": "Question: ...\n\nRetrieved Documents:\n..."},
]
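inputs = tokenizer.apply_chat_template(
    msgs,
    tokenize=True,
    add_generation_prompt=True,  # open the assistant turn so the model answers
    enable_thinking=False,       # Qwen3 chat-template flag: thinking mode off
    return_dict=True,            # also returns the attention mask
    return_tensors="pt",
).to("cuda")
# temperature only applies when sampling is enabled
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
# Decode only the newly generated tokens, not the prompt
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))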
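
The user message above follows a "Question: ... / Retrieved Documents: ..." layout, but the card elides how individual documents are written out. The helper below is one possible way to build `msgs` from retriever output. The function name `build_messages` and the `[1]`, `[2]` numbering of documents are assumptions made for illustration, and the system prompt is passed in by the caller because the card truncates it.

```python
def build_messages(question, documents, system_prompt):
    """Build chat messages in the card's prompt layout (hypothetical helper).

    The "[i] text" document numbering is an assumption; the card does not
    show how retrieved documents were formatted during training.
    """
    doc_block = "\n\n".join(
        f"[{i}] {doc.strip()}" for i, doc in enumerate(documents, start=1)
    )
    user_content = (
        f"Question: {question.strip()}\n\nRetrieved Documents:\n{doc_block}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `msgs`. If the training data used a different document layout, matching that layout at inference time is likely to matter more than the particular numbering chosen here.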

Citation

@misc{zhang2024raft,
  title={RAFT: Adapting Language Model to Domain Specific RAG},
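  author={Tianjun Zhang and Shishir G. Patil and Naman Jain and Sheng Shen and Matei Zaharia and Ion Stoica and Joseph E. Gonzalez},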
  year={2024},
  eprint={2403.10131},
  archivePrefix={arXiv}
}
