---
license: apache-2.0
language:
- en
- zh
base_model:
- 01-ai/Yi-1.5-9B
tags:
- text-generation
- chat
- instruction-tuned
- reasoning
- long-context
- conversational-ai
---

## Yi-1.5-9B-Chat

Yi-1.5-9B-Chat is a conversational large language model developed by 01.AI as part of the Yi-1.5 model family. It is designed to deliver strong instruction-following, reasoning, and dialogue performance while maintaining efficient deployment requirements.

The model builds on the Yi-1.5 pretrained foundation and is further optimized for chat-style interaction. It supports structured responses, multi-turn conversation, and analytical tasks such as coding, mathematical reasoning, and knowledge-based explanation.

Yi-1.5 models are trained using high-quality large-scale datasets and refined through instruction tuning to improve response usefulness and conversational reliability.

---

## Model Overview

- **Model Name**: Yi-1.5-9B-Chat
- **Model Family**: Yi-1.5 Series
- **Base Model**: 01-ai/Yi-1.5-9B
- **Architecture**: Decoder-only Transformer
- **Parameter Count**: 9 Billion
- **Context Length Options**: 4K, 16K, 32K (variant dependent)
- **Modalities**: Text
- **Primary Languages**: English, Chinese, multilingual capability
- **Developer**: 01.AI
- **License**: Apache 2.0

---

## Design Objectives

Yi-1.5-9B-Chat is built to provide high-quality conversational performance with strong reasoning ability while remaining practical for deployment.

Key design priorities include:

- Reliable instruction-following behavior
- Strong performance in reasoning, math, and coding tasks
- Stable multi-turn dialogue handling
- Flexible long-context processing
- Efficient inference compared to larger models

---

## Quantization Details

### Q4_K_M

- Approx.
71% size reduction (4.96 GB)
- High compression for reduced memory usage
- Suitable for CPU inference and limited VRAM systems
- Faster generation for local deployments
- Slight reduction in reasoning precision for complex tasks

### Q5_K_M

- Approx. 67% size reduction (5.83 GB)
- Higher numerical precision and response stability
- Improved logical consistency and coherence
- Better performance on reasoning-heavy prompts
- Recommended when additional memory is available

---

## Training Overview

### Pretraining Foundation

Yi-1.5 models are continuously pretrained from the original Yi models using large-scale high-quality text corpora. The training data comprises hundreds of billions of tokens selected to improve language understanding, reasoning, and knowledge representation.

### Instruction Alignment

The chat variant is further refined using millions of diverse instruction and conversation examples to enhance:

- Conversational clarity
- Prompt understanding
- Structured response generation
- Task-oriented interaction

This process improves performance in coding, mathematics, reasoning, and instruction-following tasks compared to earlier Yi models.

---

## Core Capabilities

- **Instruction-following**: Executes complex prompts and structured tasks reliably.
- **Conversational interaction**: Maintains coherent multi-turn dialogue.
- **Reasoning and analytical problem solving**: Strong performance in logic, math, and technical explanation.
- **Long-context understanding**: Supports extended documents and multi-step workflows.
- **General knowledge and comprehension**: Handles diverse topics including coding and reading comprehension.

---

## Example Usage

### llama.cpp

```
./llama-cli -m SandlogicTechnologies/Yi-1.5-9B-Chat_Q4_K_M.gguf -p "Explain the concept of gradient descent."
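
# Variant sketch (assumptions, not from the original card): the Q5_K_M file name
# below is hypothetical, chosen by analogy with the Q4_K_M file above, and -n 256
# caps the number of generated tokens. The guard runs llama-cli only when both the
# binary and the model file exist at these relative paths.
MODEL="SandlogicTechnologies/Yi-1.5-9B-Chat_Q5_K_M.gguf"
if [ -x ./llama-cli ] && [ -f "$MODEL" ]; then
  ./llama-cli -m "$MODEL" -p "Explain the concept of gradient descent." -n 256
else
  echo "llama-cli or $MODEL not found; adjust the paths before running."
fi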
```

---

## Recommended Use Cases

- Conversational AI assistants
- Knowledge and question answering systems
- Technical explanation and tutoring
- Coding and analytical reasoning support
- Research experimentation with instruction-tuned models
- Local deployment of capable mid-size language models

---

## Acknowledgments

These quantized models are based on the original work by the **01-ai** development team.

Special thanks to:

- The [01-ai](https://huggingface.co/01-ai) team for developing and releasing the [Yi-1.5-9B-Chat](http://huggingface.co/01-ai/Yi-1.5-9B-Chat) model.
- **Georgi Gerganov** and the entire [`llama.cpp`](https://github.com/ggerganov/llama.cpp) open-source community for enabling efficient model quantization and inference via the GGUF format.

---

## Contact

For any inquiries or support, please contact us at support@sandlogic.com or visit our [Website](https://www.sandlogic.com/).