- Qwen3-4B Indian Law
- Model Overview
- Training Dataset
- Dataset Composition
- 1. Indian Legal Supervised Fine-Tuning Dataset
- 2. Indian Law Instruction Dataset
- 3. Constitution of India QA Dataset
- 4. Indian Penal Code (IPC) Dataset
- 5. Code of Criminal Procedure (CrPC) Dataset
- 6. IndicLegalQA
- 7. Bharatiya Nyaya Sanhita (BNS)
- 8. Bharatiya Sakshya Adhiniyam (BSA)
- 9. Indian Court Judgments Corpus
- Dataset Balancing
- Training Configuration
- Example Usage
- Intended Use
- Limitations
- Acknowledgements
- Citation
Qwen3-4B Indian Law
A domain-adapted legal assistant fine-tuned from Qwen3-4B on a large corpus of Indian legal texts, statutory provisions, constitutional law, criminal law, evidence law, procedural law, and court judgments.
The model is designed to assist with:
- Indian legal question answering
- Statutory interpretation
- Constitution-related queries
- Criminal law and procedure
- Legal reasoning
- Case law understanding
- Legal research assistance
- Judgment summarization
- Legal education and training
Model Overview
| Item | Value |
|---|---|
| Base Model | unsloth/Qwen3-4B |
| Fine-Tuning Method | LoRA + QLoRA |
| Framework | Unsloth |
| Context Length | 4096 |
| Precision | BF16 |
| LoRA Rank | 32 |
| LoRA Alpha | 32 |
| Optimizer | AdamW 8-bit |
| Learning Rate | 2e-4 |
| Scheduler | Cosine |
| Epochs | 2 |
| Effective Batch Size | 32 |
| Domain | Indian Legal Knowledge |
Training Dataset
The training corpus was created by combining multiple publicly available Indian legal datasets together with a large judgment corpus.
The objective was to expose the model to:
- Legal question answering
- Statutory provisions
- Constitutional law
- Criminal law
- Procedural law
- Evidence law
- Court judgments
- Legal summarization
- Legal reasoning
Dataset Composition
1. Indian Legal Supervised Fine-Tuning Dataset
Source: Prarabdha/indian-legal-supervised-fine-tuning-data
Characteristics:
- Large-scale legal instruction dataset
- Context → Question → Answer format
- Derived from Indian court judgments
- Designed for legal reasoning and legal QA
Original Size: 6,055,371 samples
To prevent over-representation and memorization, a subset was selected during dataset balancing.
Contribution: ≈ 250,000 samples
Example:
Context: Delhi Development Authority v. Kanwar Kumar Mehta
Question: Was the High Court justified in calculating interest on escalation charges?
Answer: Yes. The High Court's decision was held justified on equitable grounds.
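The card does not say how the subset was chosen. As a minimal sketch, assuming uniform random sampling without replacement, the reduction from the original size to the reported contribution could look like the following. The function name `pick_subset_indices` and the seed value are hypothetical and are not taken from the training code.

```python
import random

ORIGINAL_SIZE = 6_055_371  # original dataset size reported above
TARGET_SIZE = 250_000      # approximate contribution reported above


def pick_subset_indices(total: int, keep: int, seed: int = 42) -> list[int]:
    """Return `keep` distinct row indices drawn uniformly from range(total)."""
    rng = random.Random(seed)  # fixed seed (hypothetical value) for reproducibility
    return sorted(rng.sample(range(total), keep))


indices = pick_subset_indices(ORIGINAL_SIZE, TARGET_SIZE)
print(f"kept {len(indices):,} of {ORIGINAL_SIZE:,} rows")
```

Under these assumptions the subset keeps roughly 4% of the original rows. A stratified scheme, for example sampling per court or per topic, would be an equally plausible reading of "balancing".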
2. Indian Law Instruction Dataset
Source: viber1/indian-law-dataset
Characteristics:
- Legal instruction-response pairs
- Covers Indian legal concepts
- General legal knowledge
- Legal terminology
Samples: 24,607
Example:
Question: What is the difference between a petition and a plaint?
Answer: A petition is a formal request seeking relief, whereas a plaint is the written statement initiating a civil suit.
3. Constitution of India QA Dataset
Custom processed dataset containing question-answer pairs generated from constitutional provisions.
Coverage:
- Fundamental Rights
- Directive Principles
- Union and State relations
- Parliament
- Judiciary
- Constitutional amendments
Samples: 4,082
Example:
Question: What is India according to the Constitution?
Answer: India, that is Bharat, shall be a Union of States.
4. Indian Penal Code (IPC) Dataset
Custom processed IPC question-answer corpus.
Coverage:
- Definitions
- Offences
- Punishments
- Criminal liability
- General exceptions
Samples: 2,267
Example:
Question: What is the title and extent of operation of the Indian Penal Code?
Answer: The title is the Indian Penal Code and it extends to offences committed within India and certain offences committed outside India.
5. Code of Criminal Procedure (CrPC) Dataset
Custom processed question-answer dataset generated from CrPC provisions.
Coverage:
- Investigation
- Arrest
- Bail
- Trial procedures
- Appeals
- Criminal courts
Samples: 8,194
Example:
Question: What is the short title and commencement of the CrPC?
Answer: The Code of Criminal Procedure, 1973.
6. IndicLegalQA
Legal question-answer dataset derived from Indian Supreme Court judgments.
Coverage:
- Case law
- Judicial reasoning
- Legal interpretation
Samples: 10,002
Example:
Question: Who was the respondent in Union of India v. Maj. Gen. Manomoy Ganguly?
Answer: Maj. Gen. Manomoy Ganguly.
7. Bharatiya Nyaya Sanhita (BNS)
Structured dataset generated from the Bharatiya Nyaya Sanhita, 2023.
Coverage:
- Criminal offences
- Punishments
- Definitions
- Modern criminal law provisions
Source Structure: Chapter, Section, Section Name, Description
8. Bharatiya Sakshya Adhiniyam (BSA)
Structured dataset generated from the Bharatiya Sakshya Adhiniyam, 2023.
Coverage:
- Evidence law
- Documentary evidence
- Digital evidence
- Witness testimony
Source Structure: Chapter, Section, Section Name, Description
9. Indian Court Judgments Corpus
Largest component of the training data before balancing.
Sources include:
- Supreme Court judgments
- High Court judgments
- CourtNIC archives
- JUDIS archives
Documents processed: 16,726 judgment files
Coverage:
- Constitutional law
- Civil law
- Criminal law
- Taxation
- Property law
- Administrative law
- Service law
Training samples were automatically converted into a Context → Question → Answer instruction format.
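The Context → Question → Answer format can be illustrated with a short sketch. The prompt layout and the helper name `to_chat_messages` below are assumptions made for illustration; the card does not document the exact template used during conversion. The sample text is the example quoted earlier in this card.

```python
def to_chat_messages(context: str, question: str, answer: str) -> list[dict]:
    """Turn one Context -> Question -> Answer triple into chat-style messages."""
    # Hypothetical layout: context and question share the user turn.
    user_text = f"Context:\n{context.strip()}\n\nQuestion:\n{question.strip()}"
    return [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": answer.strip()},
    ]


sample = to_chat_messages(
    context="Delhi Development Authority v. Kanwar Kumar Mehta",
    question="Was the High Court justified in calculating interest on escalation charges?",
    answer="Yes. The High Court's decision was held justified on equitable grounds.",
)
for message in sample:
    print(message["role"], "->", message["content"])
```

Keeping the answer in a separate assistant turn lets a trainer compute the loss on the answer only, if that option is used.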
Dataset Balancing
The original corpus was heavily dominated by judgment-derived samples.
Without balancing: 451,756 samples, with a judgment-heavy distribution.
To improve generalization across statutory and constitutional law, a balancing procedure was applied.
Final balanced dataset: 304,930 samples
Approximate distribution:
| Category | Samples |
|---|---|
| General Legal QA | 190,744 |
| Court Judgments | 66,368 |
| Constitution | 32,346 |
| CrPC | 8,719 |
| IPC | 6,698 |
| BNS | 50 |
| BSA | 5 |
After balancing, the Court Judgments category accounts for about 22% of the dataset (66,368 of 304,930 samples), which reduces bias toward judgment memorization while preserving broad legal coverage.
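As a quick check of the distribution table, the sketch below sums the per-category counts and prints each category's share. The counts are copied from the table; the variable names are illustrative.

```python
# Per-category sample counts copied from the distribution table above.
balanced_counts = {
    "General Legal QA": 190_744,
    "Court Judgments": 66_368,
    "Constitution": 32_346,
    "CrPC": 8_719,
    "IPC": 6_698,
    "BNS": 50,
    "BSA": 5,
}

total = sum(balanced_counts.values())
shares = {name: count / total for name, count in balanced_counts.items()}

for name, share in shares.items():
    print(f"{name:<18}{balanced_counts[name]:>9,}{share:>9.2%}")
print(f"{'Total':<18}{total:>9,}")
```

The counts add up to the 304,930 samples reported for the balanced dataset. BNS and BSA together contribute 55 samples, far fewer than the IPC and CrPC categories.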
Training Configuration
The model was fine-tuned using LoRA adapters on top of Qwen3-4B.
LoRA Configuration
r=32
lora_alpha=32
lora_dropout=0.0
Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
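The settings above map directly onto a PEFT-style adapter configuration. The following is a sketch, not the author's training script: the card names Unsloth as the framework, and the `task_type` value is an assumption. The `peft` import is optional so that the snippet also runs where the library is not installed.

```python
# Values copied from the LoRA Configuration listed above.
lora_settings = {
    "r": 32,
    "lora_alpha": 32,
    "lora_dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# Standard LoRA scales the adapter update by lora_alpha / r.
scaling = lora_settings["lora_alpha"] / lora_settings["r"]
print(f"LoRA scaling factor: {scaling}")

try:
    # Optional: build the equivalent PEFT config if `peft` is installed.
    from peft import LoraConfig

    # task_type is an assumption; the card does not state it.
    lora_config = LoraConfig(task_type="CAUSAL_LM", **lora_settings)
except ImportError:
    lora_config = None  # the settings dict above still documents the setup
```

With rank and alpha both set to 32, the scaling factor is 1.0, so the adapter update is applied without extra amplification or damping.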
Optimization
Learning Rate: 2e-4
Weight Decay: 0.01
Warmup Ratio: 0.03
Scheduler: Cosine
Optimizer: AdamW 8-bit
Training
Epochs: 2
Max Sequence Length: 4096
Batch Size: 8
Gradient Accumulation: 4
Effective Batch Size: 32
Precision: BF16
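The relationship between these values can be worked through in a few lines. This is a back-of-the-envelope sketch that assumes single-device training on the full 304,930-sample balanced dataset with no held-out split and no sequence packing; none of these assumptions is confirmed by the card, and exact step counts depend on how the trainer rounds partial batches.

```python
import math

num_samples = 304_930   # final balanced dataset size reported above
batch_size = 8          # "Batch Size"
grad_accum = 4          # "Gradient Accumulation"
epochs = 2
warmup_ratio = 0.03
num_devices = 1         # assumption: single-GPU training

# Samples consumed per optimizer update.
effective_batch = batch_size * grad_accum * num_devices

# Approximate number of optimizer updates, rounding partial batches up.
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the effective batch size matches the stated 32, each epoch takes roughly 9,530 optimizer steps, and the warmup covers about the first 570 of roughly 19,060 total steps.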
Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "goasty/Qwen3-4B-Indian-Law"

# Load the tokenizer and model; dtype and device placement are chosen automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

prompt = """
What is Article 21 of the Constitution of India?
"""

# Tokenize the prompt and generate up to 256 new tokens.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
)

# The decoded text contains the prompt followed by the generated answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
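Because Qwen3 is a chat model, prompting through the tokenizer's chat template may give cleaner answers than a raw string. The variant below is a sketch that assumes the fine-tuned tokenizer ships a chat template, as Qwen3 tokenizers normally do; the helper names `build_messages` and `ask` are hypothetical.

```python
def build_messages(question: str) -> list[dict]:
    """Wrap a single user question in the chat-message format."""
    return [{"role": "user", "content": question.strip()}]


def ask(question: str, max_new_tokens: int = 256) -> str:
    """Generate an answer using the tokenizer's chat template."""
    # Imported lazily so that defining these helpers does not load the model.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "goasty/Qwen3-4B-Indian-Law"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `ask("What is Article 21 of the Constitution of India?")` downloads the weights on first use and reloads the model on every call. For repeated queries, load the tokenizer and model once and reuse them.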
Intended Use
Suitable for:
- Legal research assistance
- Educational purposes
- Law students
- Legal document analysis
- Statutory interpretation
- Legal Q&A systems
- Retrieval-Augmented Generation (RAG)
Limitations
- Not a substitute for licensed legal counsel.
- May generate legally incorrect or outdated interpretations.
- Should not be relied upon for litigation or legal advice without expert review.
- Training data contains historical judgments and statutes which may have been amended or overruled.
Acknowledgements
This work builds upon:
- Qwen Team
- Unsloth
- Hugging Face Datasets Community
- Indian Legal Open Data Contributors
- Supreme Court and High Court public legal records
Citation
```bibtex
@misc{qwen3_indian_law,
  title={Qwen3-4B Indian Law},
  author={Aditya},
  year={2026},
  note={Fine-tuned Qwen3-4B model for Indian legal reasoning and question answering}
}
```