Instructions for using bharatgenai/AyurParam with libraries, inference servers, and local apps.

Libraries

How to use bharatgenai/AyurParam with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="bharatgenai/AyurParam", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("bharatgenai/AyurParam", trust_remote_code=True, dtype="auto")


vLLM

How to use bharatgenai/AyurParam with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "bharatgenai/AyurParam"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bharatgenai/AyurParam",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
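
The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the vLLM server above is running on `localhost:8000`, and the helper names `build_chat_request` and `ask` are illustrative, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="bharatgenai/AyurParam"):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt and return the first completion's text.

    Requires a running server; nothing is sent until this is called.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

For the SGLang server described later, passing `base_url="http://localhost:30000"` to `ask` should be the only change needed.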

Use Docker

docker model run hf.co/bharatgenai/AyurParam

SGLang

How to use bharatgenai/AyurParam with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "bharatgenai/AyurParam" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bharatgenai/AyurParam",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "bharatgenai/AyurParam" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bharatgenai/AyurParam",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use bharatgenai/AyurParam with Docker Model Runner:
```
docker model run hf.co/bharatgenai/AyurParam
```

AyurParam / README.md

kundeshwar20

Update README.md

932e57e verified 4 months ago


	---
	language:
	- hi
	- en
	base_model:
	- bharatgenai/Param-1-2.9B-Instruct
	pipeline_tag: text-generation
	tags:
	- Ayurvedic
	library_name: transformers
	license: apache-2.0
	---
	<div align="center">
	<img src="https://huggingface.co/bharatgenai/Param-1-2.9B-Instruct/resolve/main/BharatGen%20Logo%20(1).png" width="60%" alt="BharatGen" />
	</div>
	<hr>
	<div align="center">
	<a href="#" style="margin: 4px; pointer-events: none; cursor: default;">
	<img alt="Paper" src="https://img.shields.io/badge/Paper-Coming%20Soon-lightgrey?style=flat" />
	</a>
	<a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" style="margin: 4px;">
	<img alt="License" src="https://img.shields.io/badge/License-CC--BY--4.0-blue.svg" />
	</a>
	<a href="#" target="_blank" style="margin: 4px;">
	<img alt="Blog" src="https://img.shields.io/badge/Blog-Read%20More-brightgreen?style=flat" />
	</a>
	</div>

	# AyurParam
	BharatGen introduces AyurParam, a domain-specialized large language model fine-tuned from Param-1-2.9B-Instruct on a high-quality Ayurveda dataset. It is designed to handle Ayurvedic queries, classical text interpretation, clinical guidance, and wellness knowledge. Ayurveda offers vast traditional medical wisdom, yet most language models lack domain-specific understanding. AyurParam bridges this gap by combining Param-1’s bilingual strengths with a curated Ayurvedic knowledge base, enabling contextually rich and culturally grounded responses.

	## 🏗 Model Architecture
	AyurParam inherits the architecture of Param-1-2.9B-Instruct:
	* Hidden size: 2048
	* Intermediate size: 7168
	* Attention heads: 16
	* Hidden layers: 32
	* Key-value heads: 8
	* Max position embeddings: 2048
	* Activation: SiLU
	* Positional Embeddings: Rotary (RoPE, theta=10000)
	* Attention Mechanism: Grouped-query attention
	* Precision: bf16-mixed
	* Base model: Param-1-2.9B-Instruct
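
To make the grouped-query attention numbers concrete, the following sketch derives the per-head dimension, the number of query heads sharing each key-value head, and a rough KV-cache size at full context. It assumes a hidden size of 2048 (a value that divides evenly across the 16 heads), a batch of one sequence, and 2 bytes per cached value for bf16; these are back-of-the-envelope figures, not measurements.

```python
# Values from the architecture list; HIDDEN_SIZE = 2048 is an assumption.
HIDDEN_SIZE = 2048
NUM_HEADS = 16
NUM_KV_HEADS = 8
NUM_LAYERS = 32
MAX_POSITIONS = 2048
BYTES_PER_VALUE = 2  # bf16

head_dim = HIDDEN_SIZE // NUM_HEADS          # width of each attention head
queries_per_kv = NUM_HEADS // NUM_KV_HEADS   # query heads sharing one KV head


def kv_cache_bytes(num_kv_heads, seq_len=MAX_POSITIONS):
    """Bytes needed to cache keys and values for one sequence."""
    keys_and_values = 2
    return (keys_and_values * NUM_LAYERS * num_kv_heads
            * head_dim * seq_len * BYTES_PER_VALUE)


gqa_cache_mib = kv_cache_bytes(NUM_KV_HEADS) / 2**20
mha_cache_mib = kv_cache_bytes(NUM_HEADS) / 2**20  # if every head had its own KV

print(f"head_dim={head_dim}, queries per KV head={queries_per_kv}")
print(f"KV cache at {MAX_POSITIONS} tokens: {gqa_cache_mib:.0f} MiB "
      f"(vs {mha_cache_mib:.0f} MiB without grouping)")
```

Under these assumptions, sharing each key-value head between two query heads halves the cache relative to standard multi-head attention.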

	## 📚 AyurParam Dataset Preparation
	AyurParam’s dataset was meticulously curated to capture the depth of Ayurvedic wisdom, ensure bilingual accessibility (English + Hindi), and support diverse clinical and academic applications. The preparation process focused on authenticity, quality, and relevance.
	### 🔎 Data Sources
	#### Total Books Collected: ~1000
	* ~0.15M Pages, ~54.5M words
	* 600 from open-source archives (digitized classical texts)
	* 400 from internet sources covering specialized Ayurvedic domains
	#### Domains Covered (examples):
	* Kaaychikitsa (कायचिकित्सा)
	* Panchkarma (पंचकर्म)
	* Shalya Tantra (शल्यतंत्र)
	* Shalakya Tantra (शालाक्यतंत्र)
	* Research Methodology
	* Ashtang Hruday (अष्टांगहृदय)
	* Kriya Shaarir (क्रिया शारीर)
	* Padarth Vigyan (पदार्थ विज्ञान)
	* Rachana Shaarir (रचना शारीर)
	* Charak Samhita (चरक संहिता)
	* Dravyaguna (द्रव्यगुण)
	* Rasa Shastra & Bhaishajya Kalpana (रसशास्त्र एवम भैषज्यकल्पना)
	* Rog Nidan (रोगनिदान)
	* AgadTantra (अगदतंत्र)
	* Balrog (बालरोग)
	* Strirog & Prasuti Tantra (स्त्रीरोग एवम प्रसूति तंत्र)
	* Swasthvrutta (स्वस्थवृत्त)
	* Sanskrit grammar, commentaries, and supporting texts
	* etc

	### 🧩 Data Processing Pipeline
	#### 1. Source Gathering
	* Collected and digitized ~1000 Ayurvedic books across classical, clinical, and academic domains.
	* Preserved Sanskrit terminology with transliteration and contextual explanation.
	#### 2. Question–Answer Generation
	* Method: By-page Q&A generation using an open-source LLM.
	* Focus: Only Ayurveda-related, context-grounded questions.
	* Review: Domain expert validation for accuracy and clarity.
	#### 3. Taxonomy
	* Dosha, Dhatu, Mala, Srotas, Nidana, Chikitsa, etc.
	#### 4. Final Dataset Construction
	* Q&A Types:
	  * General Q&A – direct knowledge-based
	  * Thinking Q&A – reasoning and application-oriented
	  * Objective Q&A – fact-check, MCQ, structured answers
	* Languages: English + Hindi
	* Training Samples: ~4.8 Million (all combined)
	* Includes single-turn and multi-turn conversations


	## 🏋️ Training Setup
	* Base model: Param-1-2.9B-Instruct
	* Training framework: Hugging Face + TRL (SFT) + torchrun multi-node setup
	* Prompt template: Custom-designed for Ayurvedic inference
	* Scheduler: Linear with warmup
	* Epochs: 3
	* Total training samples: ~4.8M
	* Test samples: ~800k
	* Base learning rate: 5e-6
	* Minimum learning rate: 0
	* Additional tokens: ```<user>, <assistant>, <context>, <system_prompt>, <actual_response>, </actual_response>```
	* Vocab size: 256k + 4
	* Global batch size: 1024
	* Micro batch size: 4
	* Gradient accumulation steps: 32
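
The batch settings above imply a data-parallel width, and together with the sample count they give an approximate step budget. The sketch below works through that arithmetic and adds a plain linear warmup-and-decay schedule. It assumes the usual relation global batch = micro batch × accumulation steps × data-parallel ranks, treats "~4.8M" as exactly 4,800,000 unpacked samples, and takes `warmup_steps` as a hypothetical parameter because the warmup length is not stated.

```python
import math

GLOBAL_BATCH = 1024
MICRO_BATCH = 4
GRAD_ACCUM = 32
TRAIN_SAMPLES = 4_800_000  # "~4.8M", so the step counts are approximate
EPOCHS = 3
BASE_LR = 5e-6
MIN_LR = 0.0

# Ranks needed so that micro batch x accumulation x ranks equals the global batch.
data_parallel_ranks = GLOBAL_BATCH // (MICRO_BATCH * GRAD_ACCUM)
steps_per_epoch = math.ceil(TRAIN_SAMPLES / GLOBAL_BATCH)
total_steps = steps_per_epoch * EPOCHS


def linear_lr(step, warmup_steps):
    """Linear warmup to BASE_LR, then linear decay to MIN_LR at total_steps."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return MIN_LR + (BASE_LR - MIN_LR) * max(0.0, remaining)


print(data_parallel_ranks, steps_per_epoch, total_steps)
```

With these inputs the configuration corresponds to 8 data-parallel ranks and roughly 14,000 optimizer steps over 3 epochs.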


	## 🚀 Inference Example
	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "bharatgenai/AyurParam"
	tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=False)
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    trust_remote_code=True,  # allow the repository's custom model code to run
	    # bf16 on GPU, float32 on CPU
	    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
	    device_map="auto",
	)

	# Example Ayurvedic query
	user_input = "What is the Samprapti (pathogenesis) of Amavata according to Ayurveda?"

	# Prompt styles
	# 1. Generic QA: <user> ... <assistant>
	# 2. Context-based QA: <context> ... <user> ... <assistant>
	# 3. Multi-turn conversation (supports up to 5 turns): <user> ... <assistant> ... <user> ... <assistant>

	prompt = f"<user> {user_input} <assistant>"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	# Sample a response without tracking gradients
	with torch.no_grad():
	    output = model.generate(
	        **inputs,
	        max_new_tokens=300,
	        do_sample=True,
	        top_k=50,
	        top_p=0.95,
	        temperature=0.6,
	        eos_token_id=tokenizer.eos_token_id,
	        use_cache=False,
	    )

	# The decoded text contains the prompt followed by the generated answer
	print(tokenizer.decode(output[0], skip_special_tokens=True))
	```
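
The three prompt styles listed in the comments above can be assembled with a small helper. This is a sketch, not the authors' template code: `build_prompt` is a hypothetical name, the single-space separators are inferred from the one-line example, and a "turn" is assumed to mean one user message plus the assistant reply that follows it.

```python
MAX_TURNS = 5  # the comments above state support for up to 5 turns


def build_prompt(turns, context=None):
    """Assemble a prompt string from conversation turns.

    turns: list of (user_message, assistant_reply) pairs. Every pair except
    the last needs a reply; the last reply must be None so the prompt ends
    with <assistant> and the model generates the answer.
    """
    if not turns:
        raise ValueError("at least one turn is required")
    if len(turns) > MAX_TURNS:
        raise ValueError(f"at most {MAX_TURNS} turns are supported")
    if turns[-1][1] is not None:
        raise ValueError("the final turn must not have a reply yet")
    if any(reply is None for _, reply in turns[:-1]):
        raise ValueError("earlier turns must include the assistant reply")

    parts = []
    if context is not None:
        parts.append(f"<context> {context}")
    for user_msg, reply in turns:
        parts.append(f"<user> {user_msg} <assistant>")
        if reply is not None:
            parts.append(reply)
    return " ".join(parts)


# 1. Generic QA
print(build_prompt([("What is Amavata?", None)]))
# 2. Context-based QA
print(build_prompt([("Summarize the passage.", None)], context="(passage text)"))
```

The returned string can be passed to the tokenizer in place of `prompt` in the inference example.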


	## 📊 Benchmark Results: AyurParam vs Baselines
	- [BhashaBench-Ayur benchmark](https://huggingface.co/datasets/bharatgenai/BhashaBench-Ayur)
	---

	## 1. Overall Performance

	### Similar Range Models
	| Model | bba | bba_English | bba_Hindi |
	|-------|-----|-------------|-----------|
	| AyurParam-2.9B-Instruct | 39.97 | 41.12 | 38.04 |
	| Llama-3.2-3B-Instruct | 33.20 | 35.31 | 29.67 |
	| Qwen2.5-3B-Instruct | 32.68 | 35.22 | 28.46 |
	| granite-3.1-2b | 31.10 | 33.39 | 27.30 |
	| gemma-2-2b-it | 28.40 | 29.38 | 26.79 |
	| Llama-3.2-1B-Instruct | 26.41 | 26.77 | 25.82 |


	### Larger Models
	| Model | bba | bba_English | bba_Hindi |
	|-------|-----|-------------|-----------|
	| AyurParam-2.9B-Instruct | 39.97 | 41.12 | 38.04 |
	| gemma-2-27b-it | 37.99 | 40.45 | 33.89 |
	| Pangea-7B | 37.41 | 40.69 | 31.93 |
	| gpt-oss-20b | 36.34 | 38.30 | 33.09 |
	| Indic-gemma-7B-Navarasa-2.0 | 35.13 | 37.12 | 31.83 |
	| Llama-3.1-8B-Instruct | 34.76 | 36.86 | 31.26 |
	| Nemotron-4-Mini-Hindi-4B-Instruct | 33.54 | 33.38 | 33.82 |
	| aya-23-8B | 31.97 | 33.84 | 28.87 |

	---

	## 2. Question Difficulty

	### Similar Range Models
	| Difficulty | AyurParam-2.9B-Instruct | Llama-3.2-3B | Qwen2.5-3B | granite-3.1-2b | gemma-2-2b-it | Llama-3.2-1B |
	|------------|-------------------------|--------------|------------|----------------|---------------|--------------|
	| Easy | 43.93 | 36.42 | 35.55 | 33.90 | 29.96 | 27.44 |
	| Medium | 35.95 | 29.66 | 29.57 | 28.06 | 26.83 | 25.23 |
	| Hard | 31.21 | 28.51 | 28.23 | 26.81 | 24.96 | 25.39 |

	### Larger Models
	| Difficulty | AyurParam-2.9B-Instruct | gemma-2-27b-it | Pangea-7B | gpt-oss-20b | Llama-3.1-8B | Indic-gemma-7B | Nemotron-4-Mini-Hindi-4B | aya-23-8B |
	|------------|-------------------------|----------------|-----------|-------------|--------------|----------------|--------------------------|-----------|
	| Easy | 43.93 | 43.47 | 41.45 | 42.03 | 39.43 | 38.54 | 36.08 | 35.51 |
	| Medium | 35.95 | 31.90 | 32.94 | 30.27 | 29.36 | 31.72 | 30.80 | 28.29 |
	| Hard | 31.21 | 30.78 | 31.77 | 26.67 | 30.50 | 27.23 | 29.50 | 25.11 |

	---

	## 3. Question Type

	### Similar Range Models
	| Type | Llama-3.2-1B | Qwen2.5-3B | Llama-3.2-3B | AyurParam-2.9B-Instruct | granite-3.1-2b | gemma-2-2b-it |
	|------|--------------|------------|--------------|-------------------------|----------------|---------------|
	| Assertion/Reasoning | 59.26 | 51.85 | 40.74 | 44.44 | 33.33 | 33.33 |
	| Fill in the blanks | 26.97 | 29.21 | 34.83 | 29.78 | 21.35 | 32.02 |
	| MCQ | 26.34 | 32.70 | 33.17 | 40.12 | 31.22 | 28.33 |
	| Match the column | 26.83 | 29.27 | 29.27 | 24.39 | 29.27 | 36.59 |

	### Larger Models
	| Type | Indic-gemma-7B | Pangea-7B | gemma-2-27b-it | AyurParam-2.9B-Instruct | Nemotron-4-Mini-Hindi-4B | gpt-oss-20b | Llama-3.1-8B | aya-23-8B |
	|------|----------------|-----------|----------------|-------------------------|--------------------------|-------------|--------------|-----------|
	| Assertion/Reasoning | 59.26 | 62.96 | 55.56 | 44.44 | 37.04 | 25.93 | 29.63 | 18.52 |
	| Fill in the blanks | 35.39 | 24.16 | 35.96 | 29.78 | 30.34 | 32.02 | 26.97 | 30.90 |
	| MCQ | 35.10 | 37.53 | 37.98 | 40.12 | 33.60 | 36.39 | 34.83 | 32.05 |
	| Match the column | 31.71 | 34.15 | 39.02 | 24.39 | 24.39 | 46.34 | 46.34 | 17.07 |

	---
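	From the above results, AyurParam has the highest overall score (39.97) among both the similar-sized and the larger models listed, and it leads on the English and Hindi subsets, on Easy and Medium questions, and on MCQs. It is not ahead everywhere: Pangea-7B scores slightly higher on Hard questions (31.77 vs. 31.21), and several baselines score higher on the Assertion/Reasoning, Fill in the blanks, and Match the column question types.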
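
As a worked example of reading the overall table, the sketch below computes AyurParam's margin over the strongest similar-sized baseline on each metric. The scores are copied from the "Similar Range Models" table in section 1; the helper name `margin_over_best_baseline` is illustrative.

```python
# Scores copied from the "Similar Range Models" overall table.
overall = {
    "AyurParam-2.9B-Instruct": {"bba": 39.97, "bba_English": 41.12, "bba_Hindi": 38.04},
    "Llama-3.2-3B-Instruct": {"bba": 33.20, "bba_English": 35.31, "bba_Hindi": 29.67},
    "Qwen2.5-3B-Instruct": {"bba": 32.68, "bba_English": 35.22, "bba_Hindi": 28.46},
    "granite-3.1-2b": {"bba": 31.10, "bba_English": 33.39, "bba_Hindi": 27.30},
    "gemma-2-2b-it": {"bba": 28.40, "bba_English": 29.38, "bba_Hindi": 26.79},
    "Llama-3.2-1B-Instruct": {"bba": 26.41, "bba_English": 26.77, "bba_Hindi": 25.82},
}


def margin_over_best_baseline(scores, target, metric):
    """Return (best baseline name, target score minus that baseline's score)."""
    best_name, best_score = max(
        ((name, vals[metric]) for name, vals in scores.items() if name != target),
        key=lambda pair: pair[1],
    )
    return best_name, round(scores[target][metric] - best_score, 2)


for metric in ("bba", "bba_English", "bba_Hindi"):
    name, gap = margin_over_best_baseline(overall, "AyurParam-2.9B-Instruct", metric)
    print(f"{metric}: +{gap} points over {name}")
```

In this table the gap is widest on the Hindi subset, which is consistent with the bilingual focus described in the dataset section.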


	## Citation

	If you use AyurParam in your work, please cite our paper:
	```bibtex
	@misc{nauman2025ayurparamstateoftheartbilinguallanguage,
	title={AyurParam: A State-of-the-Art Bilingual Language Model for Ayurveda},
	author={Mohd Nauman and Sravan Gvm and Vijay Devane and Shyam Pawar and Viraj Thakur and Kundeshwar Pundalik and Piyush Sawarkar and Rohit Saluja and Maunendra Desarkar and Ganesh Ramakrishnan},
	year={2025},
	eprint={2511.02374},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2511.02374},
	}
	```

	## Contact
	For any questions or feedback, please contact:
	- Sravan Kumar (sravan.kumar@tihiitb.org)
	- Kundeshwar Pundalik (kundeshwar.pundalik@tihiitb.org)
	- Mohd Nauman (mohd.nauman@tihiitb.org)