DIRAC — LLaMA-3.2-3B arXiv LoRA

DIRAC (Domain-specific Intelligent Research Assistant with Context) is a LoRA fine-tuned version of meta-llama/Llama-3.2-3B-Instruct, trained on a curated corpus of arXiv ML/AI papers to function as a domain-specific research assistant.

It is designed to be used with a FAISS-based RAG pipeline: retrieved paper chunks are injected into the prompt context, and the model generates grounded, citation-aware answers.

Model Details

Model Description

Model type: Causal Language Model (decoder-only), LoRA adapter
Base model: meta-llama/Llama-3.2-3B-Instruct
Language: English
License: Llama 3.2 Community License
Fine-tuned for: Retrieval-Augmented Generation (RAG) over arXiv ML/AI papers

Model Sources

Repository: Navyasri12355/llama-3.2-3b-arxiv-lora
Project code: DIRAC GitHub

Uses

Direct Use

This adapter can be used with the PEFT library to answer research questions about ML/AI papers when paired with retrieved context chunks from a FAISS vector store.

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

base    = "meta-llama/Llama-3.2-3B-Instruct"
adapter = "Navyasri12355/llama-3.2-3b-arxiv-lora"

# Load the base model in 4-bit NF4 (matching the QLoRA training setup) with fp16 compute.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(base)
model     = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")

# Attach the LoRA adapter weights on top of the quantized base model.
model = PeftModel.from_pretrained(model, adapter)
model.eval()  # inference mode: disables dropout

Downstream Use

Intended for use inside the DIRAC research assistant pipeline, where retrieved arXiv paper chunks are injected into the prompt before generation:

You are a research assistant specializing in machine learning.
Use the following retrieved paper excerpts to answer the question.
If the answer is not in the context, say so.

CONTEXT:
[Source 1: LoRA: Low-Rank Adaptation of LLMs (2021)]
We propose LoRA, which freezes the pre-trained model weights and injects trainable
rank decomposition matrices into each layer of the Transformer architecture...

Question: What are the key advantages of LoRA over full fine-tuning?
Answer:
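
The prompt layout above can be assembled programmatically. The following is a minimal sketch, not the DIRAC project's actual code: the helper name build_prompt and the chunk fields title, year, and text are hypothetical names chosen for illustration.

```python
# Hypothetical helper: formats retrieved chunks into the prompt layout shown above.
SYSTEM = (
    "You are a research assistant specializing in machine learning.\n"
    "Use the following retrieved paper excerpts to answer the question.\n"
    "If the answer is not in the context, say so."
)

def build_prompt(question, chunks):
    """chunks: list of dicts with the assumed keys 'title', 'year', 'text'."""
    sources = []
    for i, chunk in enumerate(chunks, start=1):
        header = f"[Source {i}: {chunk['title']} ({chunk['year']})]"
        sources.append(f"{header}\n{chunk['text']}")
    context = "\n\n".join(sources)
    return f"{SYSTEM}\n\nCONTEXT:\n{context}\n\nQuestion: {question}\nAnswer:"

example = build_prompt(
    "What are the key advantages of LoRA over full fine-tuning?",
    [{"title": "LoRA: Low-Rank Adaptation of LLMs", "year": 2021,
      "text": "We propose LoRA, which freezes the pre-trained model weights..."}],
)
print(example)
```

Numbering the sources in the header gives the model a stable label to cite in its answer.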

Out-of-Scope Use

General-purpose chatbot or instruction following outside the ML/AI domain
Medical, legal, or safety-critical decision making
Factual recall without a retrieval context (the model is optimized for RAG, not memorization)

Bias, Risks, and Limitations

The training corpus is limited to arXiv ML/AI papers (cs.LG, cs.AI, 2022–2026); performance on other domains will degrade significantly.
The model may hallucinate paper titles or results when no relevant context is retrieved.
As with all LLM-based systems, outputs should be verified against the cited sources before use in academic writing.

Recommendations

Always use this model with a retrieval step (RAG). Do not rely on generated answers as ground truth without cross-referencing the cited arXiv papers.

Training Details

Training Data

A curated corpus of 4,000 arXiv ML/AI papers (categories: cs.LG, cs.AI) published between 2022 and 2026. Each paper was split into overlapping 512-token chunks, and the chunks were paired with synthetically generated question-answer pairs for supervised fine-tuning. A held-out evaluation set of 50 papers (eval/holdout_50.jsonl) was excluded from training.
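
As an illustration of the chunking step, here is a minimal sketch that splits a token sequence into overlapping 512-token windows. The overlap size is not stated in this card, so the 64-token value below is an assumption, as is the function name chunk_tokens.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=64):
    """Split a token list into windows of chunk_size that share `overlap` tokens.

    overlap=64 is an assumed value; the card only says the chunks overlap.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    stride = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks

tokens = list(range(1200))   # stand-in for real token IDs
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.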

Training Procedure

Fine-tuned using 🤗 trl (SFTTrainer) + peft on a single NVIDIA T4 GPU (Google Colab) with QLoRA (4-bit NF4 quantization via bitsandbytes).

Training Hyperparameters

Hyperparameter	Value
Training regime	fp16 mixed precision
Optimizer	AdamW (paged)
Learning rate	2e-4
LR schedule	Cosine with warmup
Warmup ratio	0.03
Batch size (effective)	16
Gradient accumulation steps	4
Epochs	3
Max sequence length	2048 tokens
LoRA rank (r)	16
LoRA alpha	32
LoRA dropout	0.05
LoRA target modules	q_proj, k_proj, v_proj, o_proj
Quantization	4-bit NF4 (QLoRA)
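
To make the schedule concrete, the sketch below computes the learning rate at a given step for a linear warmup followed by cosine decay, using the peak rate and warmup ratio from the table. The total step count of 1,000 is a made-up figure for illustration, and the formula is the common linear-warmup/cosine-decay form, which may differ in detail from what the trainer actually applied. The per-device batch size is derived from the table under the assumption of a single GPU.

```python
import math

PEAK_LR = 2e-4         # from the table
WARMUP_RATIO = 0.03    # from the table
EFFECTIVE_BATCH = 16   # from the table
GRAD_ACCUM_STEPS = 4   # from the table

# Derived, assuming a single GPU: sequences per forward/backward pass.
per_device_batch = EFFECTIVE_BATCH // GRAD_ACCUM_STEPS

def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 1000  # hypothetical; the card does not report the step count
for step in (0, 15, 30, 515, 1000):
    print(step, lr_at(step, TOTAL_STEPS))
```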

Evaluation

Testing Data

Held-out set of 50 arXiv papers (eval/holdout_50.jsonl) not seen during training. Evaluation queries were generated from paper abstracts and full-text sections.

Metrics

ROUGE-L is computed against reference answers derived from paper abstracts.

Model	ROUGE-L
Base LLaMA-3.2-3B-Instruct (no fine-tune)	0.19
DIRAC (this model) + RAG	0.38

With RAG, the fine-tuned model scores 0.38 ROUGE-L against 0.19 for the untuned base, a +100% relative improvement (the score doubles).
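
For readers unfamiliar with the metric, the sketch below shows one way ROUGE-L can be computed: the longest common subsequence (LCS) between candidate and reference tokens, turned into an F1 score. It is a simplified illustration, not the evaluation script behind the table. Lowercased whitespace tokenization and the balanced F1 are assumptions, and the function names are hypothetical. The last lines reproduce the relative-improvement arithmetic from the reported scores.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(candidate, reference):
    """ROUGE-L as the F1 of LCS-based precision and recall (assumed variant)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

score = rouge_l_f1(
    "LoRA freezes the base weights",
    "LoRA freezes the pre-trained model weights",
)
print(round(score, 4))

# Relative improvement from the reported scores: (new - old) / old.
relative_gain = (0.38 - 0.19) / 0.19
print(f"{relative_gain:+.0%}")
```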

Technical Specifications

Model Architecture

Component	Detail
Base model	LLaMA-3.2-3B-Instruct (3.21B parameters)
Adapter type	LoRA (Low-Rank Adaptation)
Trainable parameters	~8.4M (≈0.26% of total)
Embedding model (RAG)	`sentence-transformers/all-MiniLM-L6-v2`
Vector store	FAISS (L2 index)
Retrieval top-K	5 chunks per query
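
The retrieval settings in the table amount to a nearest-neighbour search under L2 distance. The sketch below is a pure-Python stand-in, not FAISS itself: it brute-forces squared L2 distances over toy 2-D vectors and keeps the 5 closest. In the real pipeline the vectors would come from the embedding model and the search would be done by the FAISS index. Whether that index is an exact (flat) one is not stated in this card, and the function name top_k_l2 is hypothetical.

```python
import heapq

def top_k_l2(query, vectors, k=5):
    """Return (squared_distance, index) pairs for the k nearest vectors."""
    scored = []
    for idx, vec in enumerate(vectors):
        dist = sum((q - v) ** 2 for q, v in zip(query, vec))
        scored.append((dist, idx))
    return heapq.nsmallest(k, scored)  # sorted from nearest to farthest

# Toy stand-ins for chunk embeddings (real ones are high-dimensional).
chunk_vectors = [
    [0.0, 0.0], [1.0, 0.0], [0.0, 2.0],
    [3.0, 3.0], [0.5, 0.5], [5.0, 1.0],
]
hits = top_k_l2([0.4, 0.4], chunk_vectors, k=5)
print([idx for _, idx in hits])
```

The returned indices would then be mapped back to chunk text and paper metadata before being placed in the prompt context.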

Compute Infrastructure

Hardware: NVIDIA T4 (16 GB VRAM) — Google Colab
Training time: ~3 hours
Framework: PyTorch 2.2, Transformers 4.40, PEFT 0.10, TRL 0.8

Citation

If you use this model, please cite:

@misc{pulipati2026dirac,
  author       = {Navyasri Pulipati},
  title        = {DIRAC: Domain-specific Intelligent Research Assistant with Context},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/Navyasri12355/llama-3.2-3b-arxiv-lora}},
  note         = {LoRA fine-tuned LLaMA-3.2-3B for arXiv ML/AI research Q\&A}
}

Model Card Contact

HuggingFace
