| language: | |
| - hi | |
| - en | |
| base_model: | |
| - bharatgenai/Param-1-2.9B-Instruct | |
| pipeline_tag: text-generation | |
| tags: | |
| - Ayurvedic | |
| library_name: transformers | |
| license: apache-2.0 | |
| <div align="center"> | |
| <img src="https://huggingface.co/bharatgenai/Param-1-2.9B-Instruct/resolve/main/BharatGen%20Logo%20(1).png" width="60%" alt="BharatGen" /> | |
| </div> | |
| <hr> | |
| <div align="center"> | |
| <a href="#" style="margin: 4px; pointer-events: none; cursor: default;"> | |
| <img alt="Paper" src="https://img.shields.io/badge/Paper-Coming%20Soon-lightgrey?style=flat" /> | |
| </a> | |
| <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" style="margin: 4px;"> | |
| <img alt="License" src="https://img.shields.io/badge/License-CC--BY--4.0-blue.svg" /> | |
| </a> | |
| <a href="#" target="_blank" style="margin: 4px;"> | |
| <img alt="Blog" src="https://img.shields.io/badge/Blog-Read%20More-brightgreen?style=flat" /> | |
| </a> | |
| </div> | |
| # AyurParam | |
| BharatGen introduces AyurParam, a domain-specialized large language model fine-tuned from Param-1-2.9B-Instruct on a high-quality Ayurveda dataset. It is designed to handle Ayurvedic queries, classical text interpretation, clinical guidance, and wellness knowledge. Ayurveda offers vast traditional medical wisdom, yet most language models lack domain-specific understanding. AyurParam bridges this gap by combining Param-1’s bilingual strengths with a curated Ayurvedic knowledge base, enabling contextually rich and culturally grounded responses. | |
| ## 🏗 Model Architecture | |
| AyurParam inherits the architecture of Param-1-2.9B-Instruct: | |
| * Hidden size: 2048 | |
| * Intermediate size: 7168 | |
| * Attention heads: 16 | |
| * Hidden layers: 32 | |
| * Key-value heads: 8 | |
| * Max position embeddings: 2048 | |
| * Activation: SiLU | |
| * Positional Embeddings: Rotary (RoPE, theta=10000) | |
| * Attention Mechanism: Grouped-query attention | |
| * Precision: bf16-mixed | |
| * Base model: Param-1-2.9B-Instruct | |
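To make the grouped-query attention figures concrete, the sketch below derives how the 16 query heads share the 8 key-value heads listed above. The helper names are illustrative and are not part of the model's code, and the grouping of consecutive query heads is an assumed (though common) convention.

```python
# Illustrative only: values copied from the architecture list above.
NUM_ATTENTION_HEADS = 16
NUM_KEY_VALUE_HEADS = 8


def queries_per_kv_head(num_heads: int, num_kv_heads: int) -> int:
    """Number of query heads that share each key-value head under GQA."""
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    return num_heads // num_kv_heads


def kv_head_for_query(query_head: int, num_heads: int, num_kv_heads: int) -> int:
    """Map a query head index to the key-value head it shares.

    Assumption: consecutive query heads are grouped together.
    """
    return query_head // queries_per_kv_head(num_heads, num_kv_heads)


group_size = queries_per_kv_head(NUM_ATTENTION_HEADS, NUM_KEY_VALUE_HEADS)
mapping = [
    kv_head_for_query(q, NUM_ATTENTION_HEADS, NUM_KEY_VALUE_HEADS)
    for q in range(NUM_ATTENTION_HEADS)
]
print(group_size)
print(mapping)
```

With these settings each key-value head serves two query heads, so the key-value cache is half the size it would be with one key-value head per query head.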
| ## 📚 AyurParam Dataset Preparation | |
| AyurParam’s dataset was meticulously curated to capture the depth of Ayurvedic wisdom, ensure bilingual accessibility (English + Hindi), and support diverse clinical and academic applications. The preparation process focused on authenticity, quality, and relevance. | |
| ### 🔎 Data Sources | |
| #### Total Books Collected: ~1000 | |
| * **~0.15M** Pages, **~54.5M** words | |
| * **600** from open-source archives (digitized classical texts) | |
| * **400** from internet sources covering specialized Ayurvedic domains | |
| #### Domains Covered (examples): | |
| * Kaaychikitsa (कायचिकित्सा) | |
| * Panchkarma (पंचकर्म) | |
| * Shalya Tantra (शल्यतंत्र) | |
| * Shalakya Tantra (शालाक्यतंत्र) | |
| * Research Methodology | |
| * Ashtang Hruday (अष्टांगहृदय) | |
| * Kriya Shaarir (क्रिया शारीर) | |
| * Padarth Vigyan (पदार्थ विज्ञान) | |
| * Rachana Shaarir (रचना शारीर) | |
| * Charak Samhita (चरक संहिता) | |
| * Dravyaguna (द्रव्यगुण) | |
| * Rasa Shastra & Bhaishajya Kalpana (रसशास्त्र एवम भैषज्यकल्पना) | |
| * Rog Nidan (रोगनिदान) | |
| * AgadTantra (अगदतंत्र) | |
| * Balrog (बालरोग) | |
| * Strirog & Prasuti Tantra (स्त्रीरोग एवम प्रसूति तंत्र) | |
| * Swasthvrutta (स्वस्थवृत्त) | |
| * Sanskrit grammar, commentaries, and supporting texts | |
| * etc. | |
| ### 🧩 Data Processing Pipeline | |
| #### 1. Source Gathering | |
| * Collected and digitized ~1000 Ayurvedic books across classical, clinical, and academic domains. | |
| * Preserved Sanskrit terminology with transliteration and contextual explanation. | |
| #### 2. Question–Answer Generation | |
| * **Method**: Page-by-page Q&A generation using an open-source LLM. | |
| * **Focus**: Only Ayurveda-related, context-grounded questions. | |
| * **Review**: Domain expert validation for accuracy and clarity. | |
| #### 3. Taxonomy | |
| * Dosha, Dhatu, Mala, Srotas, Nidana, Chikitsa, etc. | |
| #### 4. Final Dataset Construction | |
| * Q&A Types: | |
| * **General Q&A** – direct knowledge-based | |
| * **Thinking Q&A** – reasoning and application-oriented | |
| * **Objective Q&A** – fact-check, MCQ, structured answers | |
| * Languages: English + Hindi | |
| * **Training Samples**: ~4.8 Million (all combined) | |
| * Includes single-turn and multi-turn conversations | |
| ## 🏋️ Training Setup | |
| * Base model: Param-1-2.9B-Instruct | |
| * Training framework: Hugging Face + TRL (SFT) + torchrun multi-node setup | |
| * Prompt template: Custom-designed for Ayurvedic inference | |
| * Scheduler: Linear with warmup | |
| * Epochs: 3 | |
| * Total training samples: ~4.8M | |
| * Test samples: ~800k | |
| * Base learning rate: 5e-6 | |
| * Minimum learning rate: 0 | |
| * Additional tokens: ```<user>, <assistant>, <context>, <system_prompt>, <actual_response>, </actual_response>``` | |
| * Vocab size: 256k + 4 | |
| * Global batch size: 1024 | |
| * Micro batch size: 4 | |
| * Gradient accumulation steps: 32 | |
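The batch and schedule settings above can be related with a little arithmetic. The sketch below assumes the usual identity global batch = micro batch × accumulation steps × data-parallel ranks, which implies 8 data-parallel ranks; the card does not state the rank count, and other forms of parallelism would change it. `WARMUP_STEPS` is a hypothetical value, since the warmup length is not given, and the step counts are approximate because the sample count is itself approximate.

```python
import math

# Values copied from the training setup above.
GLOBAL_BATCH = 1024
MICRO_BATCH = 4
GRAD_ACCUM = 32
TRAIN_SAMPLES = 4_800_000  # "~4.8M", so derived step counts are approximate
EPOCHS = 3
BASE_LR = 5e-6
MIN_LR = 0.0
WARMUP_STEPS = 500  # hypothetical: the warmup length is not stated in the card

# Assumption: global = micro * grad_accum * data_parallel_ranks.
data_parallel_ranks = GLOBAL_BATCH // (MICRO_BATCH * GRAD_ACCUM)

steps_per_epoch = math.ceil(TRAIN_SAMPLES / GLOBAL_BATCH)
total_steps = steps_per_epoch * EPOCHS


def linear_warmup_decay(step: int, total: int, warmup: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay to MIN_LR."""
    if step < warmup:
        return BASE_LR * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return MIN_LR + (BASE_LR - MIN_LR) * max(0.0, 1.0 - progress)


print(data_parallel_ranks, steps_per_epoch, total_steps)
print(linear_warmup_decay(WARMUP_STEPS, total_steps, WARMUP_STEPS))
```

Under these assumptions, one epoch is roughly 4.7k optimizer steps and the full run roughly 14k steps, with the learning rate peaking at 5e-6 and reaching 0 at the final step.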
| ## 🚀 Inference Example | |
| ```python | |
| from transformers import AutoTokenizer, AutoModelForCausalLM | |
| import torch | |
| model_name = "bharatgenai/AyurParam" | |
| tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=False) | |
| model = AutoModelForCausalLM.from_pretrained( | |
|     model_name, | |
|     trust_remote_code=True, | |
|     # bfloat16 on GPU, float32 on CPU | |
|     torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32, | |
|     device_map="auto" | |
| ) | |
| # Example Ayurvedic query | |
| user_input = "What is the Samprapti (pathogenesis) of Amavata according to Ayurveda?" | |
| # Prompt styles | |
| # 1. Generic QA: <user> ... <assistant> | |
| # 2. Context-based QA: <context> ... <user> ... <assistant> | |
| # 3. Multi-turn conversation (supports up to 5 turns): <user> ... <assistant> ... <user> ... <assistant> | |
| prompt = f"<user> {user_input} <assistant>" | |
| # Tokenize the prompt and move the tensors to the model's device | |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) | |
| # Sample a response without tracking gradients | |
| with torch.no_grad(): | |
|     output = model.generate( | |
|         **inputs, | |
|         max_new_tokens=300, | |
|         do_sample=True, | |
|         top_k=50, | |
|         top_p=0.95, | |
|         temperature=0.6, | |
|         eos_token_id=tokenizer.eos_token_id, | |
|         use_cache=False | |
|     ) | |
| print(tokenizer.decode(output[0], skip_special_tokens=True)) | |
| ``` | |
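The three prompt styles listed in the comments above can be assembled with a small helper. This is a minimal sketch: `build_prompt` is a hypothetical function, not part of the model's code, and the single-space separators and the option of combining `<context>` with earlier turns are assumptions, since the card only shows the tag order.

```python
from typing import List, Optional, Tuple

MAX_TURNS = 5  # the card states that multi-turn prompts support up to 5 turns


def build_prompt(
    user_input: str,
    context: Optional[str] = None,
    history: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Assemble a prompt string from the tags shown in the inference example.

    history holds earlier (user, assistant) pairs, oldest first.
    """
    history = history or []
    if len(history) + 1 > MAX_TURNS:
        raise ValueError(f"at most {MAX_TURNS} turns are supported")
    parts = []
    if context:
        parts.append(f"<context> {context}")
    for past_user, past_assistant in history:
        parts.append(f"<user> {past_user} <assistant> {past_assistant}")
    # The final <assistant> tag is left open for the model to complete.
    parts.append(f"<user> {user_input} <assistant>")
    return " ".join(parts)


print(build_prompt("What is Amavata?"))
print(build_prompt("Summarize the passage.", context="Amavata involves Ama and Vata."))
```

The returned string can be passed to the tokenizer in place of `prompt` in the example above.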
| ## 📊 Benchmark Results: AyurParam vs Baselines | |
| - [BhashaBench-Ayur benchmark](https://huggingface.co/datasets/bharatgenai/BhashaBench-Ayur) | |
| --- | |
| ## 1. Overall Performance | |
| ### Similar Range Models | |
| | Model | bba | bba_English | bba_Hindi | | |
| |-----------------------|-------|-------------|-----------| | |
| | **AyurParam-2.9B-Instruct** | **39.97** | **41.12** | **38.04** | | |
| | Llama-3.2-3B-Instruct | 33.20 | 35.31 | 29.67 | | |
| | Qwen2.5-3B-Instruct | 32.68 | 35.22 | 28.46 | | |
| | granite-3.1-2b | 31.10 | 33.39 | 27.30 | | |
| | gemma-2-2b-it | 28.40 | 29.38 | 26.79 | | |
| | Llama-3.2-1B-Instruct | 26.41 | 26.77 | 25.82 | | |
| ### Larger Models | |
| | Model | bba | bba_English | bba_Hindi | | |
| |-----------------------------------------|-------|-------------|-----------| | |
| | **AyurParam-2.9B-Instruct** | **39.97** | **41.12** | **38.04** | | |
| | gemma-2-27b-it | 37.99 | 40.45 | 33.89 | | |
| | Pangea-7B | 37.41 | 40.69 | 31.93 | | |
| | gpt-oss-20b | 36.34 | 38.30 | 33.09 | | |
| | Indic-gemma-7B-Navarasa-2.0 | 35.13 | 37.12 | 31.83 | | |
| | Llama-3.1-8B-Instruct | 34.76 | 36.86 | 31.26 | | |
| | Nemotron-4-Mini-Hindi-4B-Instruct | 33.54 | 33.38 | 33.82 | | |
| | aya-23-8B | 31.97 | 33.84 | 28.87 | | |
| --- | |
| ## 2. Question Difficulty | |
| ### Similar Range Models | |
| | Difficulty | **AyurParam-2.9B-Instruct** | Llama-3.2-3B | Qwen2.5-3B | granite-3.1-2b | gemma-2-2b-it | Llama-3.2-1B | | |
| |------------|-----------------------------|--------------|------------|----------------|---------------|--------------| | |
| | **Easy** | **43.93** | 36.42 | 35.55 | 33.90 | 29.96 | 27.44 | | |
| | **Medium** | **35.95** | 29.66 | 29.57 | 28.06 | 26.83 | 25.23 | | |
| | **Hard** | **31.21** | 28.51 | 28.23 | 26.81 | 24.96 | 25.39 | | |
| ### Larger Models | |
| | Difficulty | **AyurParam-2.9B-Instruct** | gemma-2-27b-it | Pangea-7B | gpt-oss-20b | Llama-3.1-8B | Indic-gemma-7B | Nemotron-4-Mini-Hindi-4B | aya-23-8B | | |
| |------------|-----------------------------|----------------|-----------|-------------|--------------|----------------|--------------------------|-----------| | |
| | **Easy** | **43.93** | 43.47 | 41.45 | 42.03 | 39.43 | 38.54 | 36.08 | 35.51 | | |
| | **Medium** | **35.95** | 31.90 | 32.94 | 30.27 | 29.36 | 31.72 | 30.80 | 28.29 | | |
| | **Hard** | **31.21** | 30.78 | 31.77 | 26.67 | 30.50 | 27.23 | 29.50 | 25.11 | |
| --- | |
| ## 3. Question Type | |
| ### Similar Range Models | |
| | Type | Llama-3.2-1B | Qwen2.5-3B | Llama-3.2-3B | **AyurParam-2.9B-Instruct** | granite-3.1-2b | gemma-2-2b-it | | |
| |----------------------|--------------|------------|--------------|------------------------------|----------------|---------------| | |
| | Assertion/Reasoning | 59.26 | 51.85 | 40.74 | **44.44** | 33.33 | 33.33 | | |
| | Fill in the blanks | 26.97 | 29.21 | 34.83 | **29.78** | 21.35 | 32.02 | | |
| | MCQ | 26.34 | 32.70 | 33.17 | **40.12** | 31.22 | 28.33 | | |
| | Match the column | 26.83 | 29.27 | 29.27 | **24.39** | 29.27 | 36.59 | | |
| ### Larger Models | |
| | Type | Indic-gemma-7B | Pangea-7B | gemma-2-27b-it | **AyurParam-2.9B-Instruct** | Nemotron-4-Mini-Hindi-4B | gpt-oss-20b | Llama-3.1-8B | aya-23-8B | | |
| |----------------------|----------------|-----------|----------------|-----------------------------|--------------------------|-------------|--------------|-----------| | |
| | Assertion/Reasoning | 59.26 | 62.96 | 55.56 | **44.44** | 37.04 | 25.93 | 29.63 | 18.52 | | |
| | Fill in the blanks | 35.39 | 24.16 | 35.96 | **29.78** | 30.34 | 32.02 | 26.97 | 30.90 | | |
| | MCQ | 35.10 | 37.53 | 37.98 | **40.12** | 33.60 | 36.39 | 34.83 | 32.05 | | |
| | Match the column | 31.71 | 34.15 | 39.02 | **24.39** | 24.39 | 46.34 | 46.34 | 17.07 | | |
| --- | |
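| From the above results, **AyurParam outperforms all similar-sized models** on the overall score and at every difficulty level, and achieves **competitive or better performance than larger models** on those same metrics. By question type, it leads on MCQ but trails several baselines on the other three types. | |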
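As a quick check on the overall numbers, the sketch below computes AyurParam's margin over the strongest baseline in each group. The scores are copied from the overall `bba` column of the tables above; `margin_over_best` is an illustrative helper, not part of any benchmark tooling.

```python
# Overall "bba" scores copied from the benchmark tables above.
AYURPARAM_BBA = 39.97

similar_range = {
    "Llama-3.2-3B-Instruct": 33.20,
    "Qwen2.5-3B-Instruct": 32.68,
    "granite-3.1-2b": 31.10,
    "gemma-2-2b-it": 28.40,
    "Llama-3.2-1B-Instruct": 26.41,
}

larger = {
    "gemma-2-27b-it": 37.99,
    "Pangea-7B": 37.41,
    "gpt-oss-20b": 36.34,
    "Indic-gemma-7B-Navarasa-2.0": 35.13,
    "Llama-3.1-8B-Instruct": 34.76,
    "Nemotron-4-Mini-Hindi-4B-Instruct": 33.54,
    "aya-23-8B": 31.97,
}


def margin_over_best(score: float, baselines: dict) -> tuple:
    """Return the best baseline's name and the score margin over it, in points."""
    best_name = max(baselines, key=baselines.get)
    return best_name, round(score - baselines[best_name], 2)


similar_best = margin_over_best(AYURPARAM_BBA, similar_range)
larger_best = margin_over_best(AYURPARAM_BBA, larger)
print(similar_best)
print(larger_best)
```

On the overall score, the margin is about 6.8 points over the best similar-range model and about 2 points over the best larger model.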
| ## Citation | |
| If you use AyurParam in your work, please cite our paper: | |
| ```bibtex | |
| @misc{nauman2025ayurparamstateoftheartbilinguallanguage, | |
| title={AyurParam: A State-of-the-Art Bilingual Language Model for Ayurveda}, | |
| author={Mohd Nauman and Sravan Gvm and Vijay Devane and Shyam Pawar and Viraj Thakur and Kundeshwar Pundalik and Piyush Sawarkar and Rohit Saluja and Maunendra Desarkar and Ganesh Ramakrishnan}, | |
| year={2025}, | |
| eprint={2511.02374}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CL}, | |
| url={https://arxiv.org/abs/2511.02374}, | |
| } | |
| ``` | |
| ## Contact | |
| For any questions or feedback, please contact: | |
| - Sravan Kumar (sravan.kumar@tihiitb.org) | |
| - Kundeshwar Pundalik (kundeshwar.pundalik@tihiitb.org) | |
| - Mohd Nauman (mohd.nauman@tihiitb.org) | |