---
language: kn
license: apache-2.0
tags:
- causal-lm
- assistant
- reasoning
- kannada
model_name: dhee-nxtgen-qwen3-kannada-v2
library_name: transformers
---

# Dhee-NxtGen-Qwen3-Kannada-v2

## Model Description

**Dhee-NxtGen-Qwen3-Kannada-v2** is a large language model for Kannada language understanding and generation. It is based on the **Qwen3** architecture and fine-tuned for **assistant-style**, **function-calling**, and **reasoning-based** conversational tasks.

Developed by **DheeYantra** in collaboration with **NxtGen Cloud Technologies Pvt. Ltd.**, the model is intended for building Kannada chatbots, reasoning systems, and task-based dialogue agents.

## Key Features

- Fluent, context-aware Kannada text generation
- Optimized for assistant-style and reasoning conversations
- Handles open-ended generation, summarization, and Q&A
- Fully compatible with 🤗 Hugging Face Transformers
- Supports **vLLM** for high-performance inference

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "dheeyantra/dhee-nxtgen-qwen3-kannada-v2"

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Example prompt in ChatML format.
# The user turn is Kannada for: "Can you schedule an appointment for me?"
prompt = """<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
ನೀವು ನನಗಾಗಿ ಒಂದು ಅಪಾಯಿಂಟ್ಮೆಂಟ್ ನಿಗದಿಪಡಿಸಬಹುದೇ?<|im_end|>
<|im_start|>assistant
"""

# Tokenize, generate up to 150 new tokens, and decode the full sequence
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Intended Uses & Limitations

### Intended Uses

- Kannada conversational chatbots and assistants
- Function-calling and structured response generation
- Story generation and summarization in Kannada
- Natural dialogue systems for Indic AI applications

### Limitations

- May generate inaccurate or biased responses in rare cases
- Performance can vary on out-of-domain or code-mixed inputs
- Primarily optimized for Kannada; output in other languages may be less fluent

## vLLM / High-Performance Serving Requirements

For high-throughput serving with **vLLM**, ensure the following environment:

- GPU with compute capability ≥ 8.0 (e.g., NVIDIA A100)
- PyTorch 2.1+ and the CUDA toolkit installed
- For V100 GPUs (sm70), vLLM GPU inference is not supported; CPU fallback is possible but slower.

Install dependencies:

```bash
pip install torch transformers vllm sentencepiece
```

Run the vLLM server (`vllm serve` takes the model name as a positional argument):

```bash
vllm serve dheeyantra/dhee-nxtgen-qwen3-kannada-v2 --host 0.0.0.0 --port 8000
```

## License

Released under the **Apache 2.0 License**.

---

*Developed by DheeYantra in collaboration with NxtGen Cloud Technologies Pvt. Ltd.*
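
Once the vLLM server described in the serving section is running, it can be queried over HTTP. The following is a minimal sketch, not an official client: it assumes the server is reachable at `http://localhost:8000` and that it exposes vLLM's OpenAI-compatible `/v1/chat/completions` route, which applies the model's chat template on the server side. The names `BASE_URL`, `build_chat_payload`, and `chat` are hypothetical helpers introduced only for this illustration.

```python
import json
import urllib.request

# Assumption: server started locally with `vllm serve` on port 8000.
BASE_URL = "http://localhost:8000/v1"
MODEL = "dheeyantra/dhee-nxtgen-qwen3-kannada-v2"


def build_chat_payload(user_text, system_text="You are a helpful assistant.", max_tokens=150):
    """Build the JSON body for an OpenAI-style chat completion request."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,
    }


def chat(user_text, base_url=BASE_URL, timeout=60):
    """POST the payload to /chat/completions and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(user_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With the server running, calling `chat("ನೀವು ನನಗಾಗಿ ಒಂದು ಅಪಾಯಿಂಟ್ಮೆಂಟ್ ನಿಗದಿಪಡಿಸಬಹುದೇ?")` should return the assistant's reply as a string. Only the standard library is used so the sketch has no dependencies beyond the server itself; any OpenAI-compatible client pointed at the same base URL should work equally well.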