SecureX-HR: Qwen3-8B RAFT fine-tuned on TechQA

This model is a QLoRA fine-tune of Qwen/Qwen3-8B trained with RAFT (Retrieval Augmented Fine-Tuning, arXiv:2403.10131) for the SecureX AI enterprise Technical RAG pipeline.
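
As a rough illustration of the RAFT idea, the sketch below builds one training example: the question is paired with sampled distractor documents, and the oracle (answer-bearing) document is included only some of the time, so the model learns to answer with or without it. The function name `build_raft_example`, the field names, and the values `num_distractors=3` and `p_oracle=0.8` are assumptions made for illustration; this card does not state the settings used for this model.

```python
import random


def build_raft_example(question, oracle_doc, distractor_pool, answer,
                       num_distractors=3, p_oracle=0.8, rng=None):
    """Assemble one RAFT-style training example (hypothetical helper).

    With probability p_oracle the oracle document is mixed in with the
    distractors; otherwise the context contains distractors only.
    """
    rng = rng or random.Random()
    # Sample distractors without replacement from the pool.
    documents = rng.sample(distractor_pool, num_distractors)
    has_oracle = rng.random() < p_oracle
    if has_oracle:
        documents.append(oracle_doc)
    # Shuffle so the oracle position carries no signal.
    rng.shuffle(documents)
    return {
        "question": question,
        "documents": documents,
        "answer": answer,
        "has_oracle": has_oracle,
    }


pool = ["distractor A", "distractor B", "distractor C", "distractor D"]
example = build_raft_example(
    question="How do I reset the admin password?",
    oracle_doc="oracle passage",
    distractor_pool=pool,
    answer="Use the reset tool described in the oracle passage.",
    rng=random.Random(0),
)
print(example["has_oracle"], len(example["documents"]))
```

Passing a seeded `random.Random` keeps dataset construction reproducible across runs.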

Training details

  • Method: QLoRA 4-bit NF4 + RAFT + Unsloth
  • Dataset: rungalileo/ragbench (techqa subset)
  • Hardware: Kaggle T4 x2 (Unsloth single GPU mode)
  • LoRA rank: 16 | Alpha: 32
  • Samples: 475 | Epochs: 3
  • Thinking mode: OFF at inference
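
To give a feel for what rank 16 and alpha 32 mean in practice, the sketch below computes the LoRA scaling factor (alpha / rank) and the number of trainable adapter parameters for a single linear layer. The 4096 x 4096 layer size is purely illustrative, and `lora_stats` is a hypothetical helper; the card does not say which modules were adapted.

```python
def lora_stats(d_in, d_out, rank=16, alpha=32):
    """Parameter counts for one LoRA-adapted linear layer (illustrative)."""
    full_params = d_in * d_out            # frozen base weight W
    lora_params = rank * (d_in + d_out)   # A is (rank x d_in), B is (d_out x rank)
    return {
        "scaling": alpha / rank,          # factor applied to the B @ A update
        "lora_params": lora_params,
        "full_params": full_params,
        "fraction": lora_params / full_params,
    }


# Layer size is an assumption chosen only to make the arithmetic concrete.
stats = lora_stats(d_in=4096, d_out=4096)
print(stats)
```

For this example layer the adapter adds 131,072 trainable parameters, under 1% of the 16,777,216 frozen weights, which is a large part of why a fine-tune like this fits on a T4.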

Usage

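from unsloth import FastLanguageModel
import torch

# Load the fine-tuned model in 4-bit; dtype=None lets Unsloth pick the dtype.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "Niraj-P-Chaudhari/securex-techqa-qwen3-8b-raft",
    max_seq_length = 2048,
    dtype          = None,
    load_in_4bit   = True,
)
FastLanguageModel.for_inference(model)  # switch to inference mode

msgs = [
    {"role": "system", "content": "You are SecureX-HR..."},
    {"role": "user",   "content": "Question: ...\n\nRetrieved Documents:\n..."},
]

# add_generation_prompt=True appends the assistant turn header so the model
# starts a reply; enable_thinking=False matches the "thinking OFF" setting.
inputs = tokenizer.apply_chat_template(
    msgs,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_dict=True,
    return_tensors="pt",
).to("cuda")

# do_sample=True is set explicitly so that temperature takes effect.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))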
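
The user message above follows a "Question / Retrieved Documents" layout, but the card elides how the individual documents are written out. The helper below is one possible way to assemble that message from a retriever's output; the name `build_user_prompt` and the `[Document N]` labels are assumptions, not a format confirmed by this card.

```python
def build_user_prompt(question, documents):
    """Format a question and retrieved passages into one user message.

    The "[Document N]" labelling is an assumed convention for illustration.
    """
    doc_block = "\n\n".join(
        f"[Document {i}]\n{doc.strip()}"
        for i, doc in enumerate(documents, start=1)
    )
    return f"Question: {question.strip()}\n\nRetrieved Documents:\n{doc_block}"


prompt = build_user_prompt(
    "How do I rotate the API key?",
    ["Keys are rotated from the admin console.", "Backups run nightly."],
)
print(prompt)
```

The returned string can be used as the `content` of the user message in `msgs`.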

Citation

@misc{zhang2024raft,
  title={RAFT: Adapting Language Model to Domain Specific RAG},
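  author={Tianjun Zhang and Shishir G. Patil and Naman Jain and Sheng Shen and Matei Zaharia and Ion Stoica and Joseph E. Gonzalez},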
  year={2024},
  eprint={2403.10131},
  archivePrefix={arXiv}
}