Instructions to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with llama-cpp-python:
    # !pip install llama-cpp-python
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="legraphista/Llama-3.2-3B-Instruct-IMat-GGUF",
        filename="Llama-3.2-3B-Instruct.BF16.gguf",
    )

    llm.create_chat_completion(
        messages=[
            {"role": "user", "content": "What is the capital of France?"}
        ]
    )

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with llama.cpp:
Install from brew
    brew install llama.cpp

    # Start a local OpenAI-compatible server with a web UI:
    llama-server -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

    # Run inference directly in the terminal:
    llama-cli -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Install from WinGet (Windows)

    winget install llama.cpp

    # Start a local OpenAI-compatible server with a web UI:
    llama-server -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

    # Run inference directly in the terminal:
    llama-cli -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Use pre-built binary

    # Download pre-built binary from:
    # https://github.com/ggerganov/llama.cpp/releases

    # Start a local OpenAI-compatible server with a web UI:
    ./llama-server -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

    # Run inference directly in the terminal:
    ./llama-cli -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Build from source code

    git clone https://github.com/ggerganov/llama.cpp.git
    cd llama.cpp
    cmake -B build
    cmake --build build -j --target llama-server llama-cli

    # Start a local OpenAI-compatible server with a web UI:
    ./build/bin/llama-server -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

    # Run inference directly in the terminal:
    ./build/bin/llama-cli -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Use Docker
docker model run hf.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S
- LM Studio
- Jan
- vLLM
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "legraphista/Llama-3.2-3B-Instruct-IMat-GGUF"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "legraphista/Llama-3.2-3B-Instruct-IMat-GGUF",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

Use Docker
docker model run hf.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S
- Ollama
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with Ollama:
ollama run hf.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S
- Unsloth Studio
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
    curl -fsSL https://unsloth.ai/install.sh | sh

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for legraphista/Llama-3.2-3B-Instruct-IMat-GGUF to start chatting

Install Unsloth Studio (Windows)

    irm https://unsloth.ai/install.ps1 | iex

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for legraphista/Llama-3.2-3B-Instruct-IMat-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

    # No setup required
    # Open https://huggingface.co/spaces/unsloth/studio in your browser
    # Search for legraphista/Llama-3.2-3B-Instruct-IMat-GGUF to start chatting

- Pi
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with Pi:
Start the llama.cpp server
    # Install llama.cpp:
    brew install llama.cpp

    # Start a local OpenAI-compatible server:
    llama-server -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Configure the model in Pi

    # Install Pi:
    npm install -g @mariozechner/pi-coding-agent

    # Add to ~/.pi/agent/models.json:
    {
      "providers": {
        "llama-cpp": {
          "baseUrl": "http://localhost:8080/v1",
          "api": "openai-completions",
          "apiKey": "none",
          "models": [
            { "id": "legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S" }
          ]
        }
      }
    }

Run Pi

    # Start Pi in your project directory:
    pi

- Hermes Agent
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with Hermes Agent:
Start the llama.cpp server
    # Install llama.cpp:
    brew install llama.cpp

    # Start a local OpenAI-compatible server:
    llama-server -hf legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Configure Hermes

    # Install Hermes:
    curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
    hermes setup

    # Point Hermes at the local server:
    hermes config set model.provider custom
    hermes config set model.base_url http://127.0.0.1:8080/v1
    hermes config set model.default legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Run Hermes
hermes
- Docker Model Runner
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with Docker Model Runner:
docker model run hf.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S
- Lemonade
How to use legraphista/Llama-3.2-3B-Instruct-IMat-GGUF with Lemonade:
Pull the model
    # Download Lemonade from https://lemonade-server.ai/
    lemonade pull legraphista/Llama-3.2-3B-Instruct-IMat-GGUF:Q4_K_S

Run and chat with the model
lemonade run user.Llama-3.2-3B-Instruct-IMat-GGUF-Q4_K_S
List all available models
lemonade list
| base_model: meta-llama/Llama-3.2-3B-Instruct | |
| extra_gated_button_content: Submit | |
| extra_gated_description: The information you provide will be collected, stored, processed | |
| and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/). | |
| extra_gated_prompt: "### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT\n\nLlama 3.2 Version\ | |
| \ Release Date: September 25, 2024\n\n\u201CAgreement\u201D means the terms and\ | |
| \ conditions for use, reproduction, distribution and modification of the Llama\ | |
| \ Materials set forth herein.\n\n\u201CDocumentation\u201D means the specifications,\ | |
| \ manuals and documentation accompanying Llama 3.2 distributed by Meta at https://llama.meta.com/doc/overview.\n\ | |
| \n\u201CLicensee\u201D or \u201Cyou\u201D means you, or your employer or any other\ | |
| \ person or entity (if you are entering into this Agreement on such person or entity\u2019\ | |
| s behalf), of the age required under applicable laws, rules or regulations to provide\ | |
| \ legal consent and that has legal authority to bind your employer or such other\ | |
| \ person or entity if you are entering in this Agreement on their behalf.\n\n\u201C\ | |
| Llama 3.2\u201D means the foundational large language models and software and algorithms,\ | |
| \ including machine-learning model code, trained model weights, inference-enabling\ | |
| \ code, training-enabling code, fine-tuning enabling code and other elements of\ | |
| \ the foregoing distributed by Meta at https://www.llama.com/llama-downloads.\n\ | |
| \n\u201CLlama Materials\u201D means, collectively, Meta\u2019s proprietary Llama\ | |
| \ 3.2 and Documentation (and any portion thereof) made available under this Agreement.\n\ | |
| \n\u201CMeta\u201D or \u201Cwe\u201D means Meta Platforms Ireland Limited (if you\ | |
| \ are located in or, if you are an entity, your principal place of business is\ | |
| \ in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside\ | |
| \ of the EEA or Switzerland). \n\nBy clicking \u201CI Accept\u201D below or by using\ | |
| \ or distributing any portion or element of the Llama Materials, you agree to be\ | |
| \ bound by this Agreement.\n\n1. License Rights and Redistribution.\na. Grant of\ | |
| \ Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free\ | |
| \ limited license under Meta\u2019s intellectual property or other rights owned\ | |
| \ by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create\ | |
| \ derivative works of, and make modifications to the Llama Materials. \nb. Redistribution\ | |
| \ and Use. \ni. If you distribute or make available the Llama Materials (or any\ | |
| \ derivative works thereof), or a product or service (including another AI model)\ | |
| \ that contains any of them, you shall (A) provide a copy of this Agreement with\ | |
| \ any such Llama Materials; and (B) prominently display \u201CBuilt with Llama\u201D\ | |
| \ on a related website, user interface, blogpost, about page, or product documentation.\ | |
| \ If you use the Llama Materials or any outputs or results of the Llama Materials\ | |
| \ to create, train, fine tune, or otherwise improve an AI model, which is distributed\ | |
| \ or made available, you shall also include \u201CLlama\u201D at the beginning of\ | |
| \ any such AI model name.\nii. If you receive Llama Materials, or any derivative\ | |
| \ works thereof, from a Licensee as part of an integrated end user product, then\ | |
| \ Section 2 of this Agreement will not apply to you. \niii. You must retain in all\ | |
| \ copies of the Llama Materials that you distribute the following attribution notice\ | |
| \ within a \u201CNotice\u201D text file distributed as a part of such copies: \u201C\ | |
| Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright \xA9 Meta\ | |
| \ Platforms, Inc. All Rights Reserved.\u201D\niv. Your use of the Llama Materials\ | |
| \ must comply with applicable laws and regulations (including trade compliance laws\ | |
| \ and regulations) and adhere to the Acceptable Use Policy for the Llama Materials\ | |
| \ (available at https://www.llama.com/llama3_2/use-policy), which is hereby incorporated\ | |
| \ by reference into this Agreement.\n \n2. Additional Commercial Terms. If, on\ | |
| \ the Llama 3.2 version release date, the monthly active users of the products or\ | |
| \ services made available by or for Licensee, or Licensee\u2019s affiliates, is\ | |
| \ greater than 700 million monthly active users in the preceding calendar month,\ | |
| \ you must request a license from Meta, which Meta may grant to you in its sole\ | |
| \ discretion, and you are not authorized to exercise any of the rights under this\ | |
| \ Agreement unless or until Meta otherwise expressly grants you such rights.\n3.\ | |
| \ Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS\ | |
| \ AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN \u201CAS IS\u201D BASIS,\ | |
| \ WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND,\ | |
| \ BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE,\ | |
| \ NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE\ | |
| \ SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING\ | |
| \ THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA\ | |
| \ MATERIALS AND ANY OUTPUT AND RESULTS.\n4. Limitation of Liability. IN NO EVENT\ | |
| \ WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER\ | |
| \ IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF\ | |
| \ THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL,\ | |
| \ INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE\ | |
| \ BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.\n5. Intellectual Property.\n\ | |
| a. No trademark licenses are granted under this Agreement, and in connection with\ | |
| \ the Llama Materials, neither Meta nor Licensee may use any name or mark owned\ | |
| \ by or associated with the other or any of its affiliates, except as required\ | |
| \ for reasonable and customary use in describing and redistributing the Llama Materials\ | |
| \ or as set forth in this Section 5(a). Meta hereby grants you a license to use\ | |
| \ \u201CLlama\u201D (the \u201CMark\u201D) solely as required to comply with the\ | |
| \ last sentence of Section 1.b.i. You will comply with Meta\u2019s brand guidelines\ | |
| \ (currently accessible at https://about.meta.com/brand/resources/meta/company-brand/).\ | |
| \ All goodwill arising out of your use of the Mark will inure to the benefit of\ | |
| \ Meta.\nb. Subject to Meta\u2019s ownership of Llama Materials and derivatives\ | |
| \ made by or for Meta, with respect to any derivative works and modifications of\ | |
| \ the Llama Materials that are made by you, as between you and Meta, you are and\ | |
| \ will be the owner of such derivative works and modifications.\nc. If you institute\ | |
| \ litigation or other proceedings against Meta or any entity (including a cross-claim\ | |
| \ or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs\ | |
| \ or results, or any portion of any of the foregoing, constitutes infringement of\ | |
| \ intellectual property or other rights owned or licensable by you, then any licenses\ | |
| \ granted to you under this Agreement shall terminate as of the date such litigation\ | |
| \ or claim is filed or instituted. You will indemnify and hold harmless Meta from\ | |
| \ and against any claim by any third party arising out of or related to your use\ | |
| \ or distribution of the Llama Materials.\n6. Term and Termination. The term of\ | |
| \ this Agreement will commence upon your acceptance of this Agreement or access\ | |
| \ to the Llama Materials and will continue in full force and effect until terminated\ | |
| \ in accordance with the terms and conditions herein. Meta may terminate this Agreement\ | |
| \ if you are in breach of any term or condition of this Agreement. Upon termination\ | |
| \ of this Agreement, you shall delete and cease use of the Llama Materials. Sections\ | |
| \ 3, 4 and 7 shall survive the termination of this Agreement. \n7. Governing Law\ | |
| \ and Jurisdiction. This Agreement will be governed and construed under the laws\ | |
| \ of the State of California without regard to choice of law principles, and the\ | |
| \ UN Convention on Contracts for the International Sale of Goods does not apply\ | |
| \ to this Agreement. The courts of California shall have exclusive jurisdiction\ | |
| \ of any dispute arising out of this Agreement. \n### Llama 3.2 Acceptable Use Policy\n\ | |
| Meta is committed to promoting safe and fair use of its tools and features, including\ | |
| \ Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy\ | |
| \ (\u201C**Policy**\u201D). The most recent copy of this policy can be found at\ | |
| \ [https://www.llama.com/llama3_2/use-policy](https://www.llama.com/llama3_2/use-policy).\n\ | |
| #### Prohibited Uses\nWe want everyone to use Llama 3.2 safely and responsibly.\ | |
| \ You agree you will not use, or allow others to use, Llama 3.2 to:\n1. Violate\ | |
| \ the law or others\u2019 rights, including to:\n 1. Engage in, promote, generate,\ | |
| \ contribute to, encourage, plan, incite, or further illegal or unlawful activity\ | |
| \ or content, such as:\n 1. Violence or terrorism\n 2. Exploitation\ | |
| \ or harm to children, including the solicitation, creation, acquisition, or dissemination\ | |
| \ of child exploitative content or failure to report Child Sexual Abuse Material\n\ | |
| \ 3. Human trafficking, exploitation, and sexual violence\n 4. The\ | |
| \ illegal distribution of information or materials to minors, including obscene\ | |
| \ materials, or failure to employ legally required age-gating in connection with\ | |
| \ such information or materials.\n 5. Sexual solicitation\n 6. Any\ | |
| \ other criminal activity\n 1. Engage in, promote, incite, or facilitate the\ | |
| \ harassment, abuse, threatening, or bullying of individuals or groups of individuals\n\ | |
| \ 2. Engage in, promote, incite, or facilitate discrimination or other unlawful\ | |
| \ or harmful conduct in the provision of employment, employment benefits, credit,\ | |
| \ housing, other economic benefits, or other essential goods and services\n 3.\ | |
| \ Engage in the unauthorized or unlicensed practice of any profession including,\ | |
| \ but not limited to, financial, legal, medical/health, or related professional\ | |
| \ practices\n 4. Collect, process, disclose, generate, or infer private or sensitive\ | |
| \ information about individuals, including information about individuals\u2019 identity,\ | |
| \ health, or demographic information, unless you have obtained the right to do so\ | |
| \ in accordance with applicable law\n 5. Engage in or facilitate any action or\ | |
| \ generate any content that infringes, misappropriates, or otherwise violates any\ | |
| \ third-party rights, including the outputs or results of any products or services\ | |
| \ using the Llama Materials\n 6. Create, generate, or facilitate the creation\ | |
| \ of malicious code, malware, computer viruses or do anything else that could disable,\ | |
| \ overburden, interfere with or impair the proper working, integrity, operation\ | |
| \ or appearance of a website or computer system\n 7. Engage in any action, or\ | |
| \ facilitate any action, to intentionally circumvent or remove usage restrictions\ | |
| \ or other safety measures, or to enable functionality disabled by Meta\_\n2. Engage\ | |
| \ in, promote, incite, facilitate, or assist in the planning or development of activities\ | |
| \ that present a risk of death or bodily harm to individuals, including use of Llama\ | |
| \ 3.2 related to the following:\n 8. Military, warfare, nuclear industries or\ | |
| \ applications, espionage, use for materials or activities that are subject to the\ | |
| \ International Traffic Arms Regulations (ITAR) maintained by the United States\ | |
| \ Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989\ | |
| \ or the Chemical Weapons Convention Implementation Act of 1997\n 9. Guns and\ | |
| \ illegal weapons (including weapon development)\n 10. Illegal drugs and regulated/controlled\ | |
| \ substances\n 11. Operation of critical infrastructure, transportation technologies,\ | |
| \ or heavy machinery\n 12. Self-harm or harm to others, including suicide, cutting,\ | |
| \ and eating disorders\n 13. Any content intended to incite or promote violence,\ | |
| \ abuse, or any infliction of bodily harm to an individual\n3. Intentionally deceive\ | |
| \ or mislead others, including use of Llama 3.2 related to the following:\n 14.\ | |
| \ Generating, promoting, or furthering fraud or the creation or promotion of disinformation\n\ | |
| \ 15. Generating, promoting, or furthering defamatory content, including the\ | |
| \ creation of defamatory statements, images, or other content\n 16. Generating,\ | |
| \ promoting, or further distributing spam\n 17. Impersonating another individual\ | |
| \ without consent, authorization, or legal right\n 18. Representing that the\ | |
| \ use of Llama 3.2 or outputs are human-generated\n 19. Generating or facilitating\ | |
| \ false online engagement, including fake reviews and other means of fake online\ | |
| \ engagement\_\n4. Fail to appropriately disclose to end users any known dangers\ | |
| \ of your AI system 5. Interact with third party tools, models, or software designed\ | |
| \ to generate unlawful content or engage in unlawful or harmful conduct and/or represent\ | |
| \ that the outputs of such tools, models, or software are associated with Meta or\ | |
| \ Llama 3.2\n\nWith respect to any multimodal models included in Llama 3.2, the\ | |
| \ rights granted under Section 1(a) of the Llama 3.2 Community License Agreement\ | |
| \ are not being granted to you if you are an individual domiciled in, or a company\ | |
| \ with a principal place of business in, the European Union. This restriction does\ | |
| \ not apply to end users of a product or service that incorporates any such multimodal\ | |
| \ models.\n\nPlease report any violation of this Policy, software \u201Cbug,\u201D\ | |
| \ or other problems that could lead to a violation of this Policy through one of\ | |
| \ the following means:\n\n* Reporting issues with the model: [https://github.com/meta-llama/llama-models/issues](https://l.workplace.com/l.php?u=https%3A%2F%2Fgithub.com%2Fmeta-llama%2Fllama-models%2Fissues&h=AT0qV8W9BFT6NwihiOHRuKYQM_UnkzN_NmHMy91OT55gkLpgi4kQupHUl0ssR4dQsIQ8n3tfd0vtkobvsEvt1l4Ic6GXI2EeuHV8N08OG2WnbAmm0FL4ObkazC6G_256vN0lN9DsykCvCqGZ)\n\ | |
| * Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](http://developers.facebook.com/llama_output_feedback)\n\ | |
| * Reporting bugs and security concerns: [facebook.com/whitehat/info](http://facebook.com/whitehat/info)\n\ | |
| * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama\ | |
| \ 3.2: LlamaUseReport@meta.com" | |
| inference: false | |
| language: | |
| - en | |
| - de | |
| - fr | |
| - it | |
| - pt | |
| - hi | |
| - es | |
| - th | |
| library_name: gguf | |
| license: llama3.2 | |
| pipeline_tag: text-generation | |
| quantized_by: legraphista | |
| tags: | |
| - meta | |
| - pytorch | |
| - llama | |
| - llama-3 | |
| - quantized | |
| - GGUF | |
| - quantization | |
| - imat | |
| - imatrix | |
| - static | |
| - 16bit | |
| - 8bit | |
| - 6bit | |
| - 5bit | |
| - 4bit | |
| - 3bit | |
| - 2bit | |
| - 1bit | |
# Llama-3.2-3B-Instruct-IMat-GGUF

_Llama.cpp imatrix quantization of meta-llama/Llama-3.2-3B-Instruct_

Original Model: [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)

Original dtype: `BF16` (`bfloat16`)

Quantized by: llama.cpp [b3825](https://github.com/ggerganov/llama.cpp/releases/tag/b3825)

IMatrix dataset: [here](https://gist.githubusercontent.com/bartowski1182/eb213dccb3571f863da82e99418f81e8/raw/b2869d80f5c16fd7082594248e80144677736635/calibration_datav3.txt)

- [Files](#files)
    - [IMatrix](#imatrix)
    - [Common Quants](#common-quants)
    - [All Quants](#all-quants)
- [Downloading using huggingface-cli](#downloading-using-huggingface-cli)
- [Inference](#inference)
    - [Simple chat template](#simple-chat-template)
    - [Chat template with system prompt](#chat-template-with-system-prompt)
    - [Llama.cpp](#llama-cpp)
- [FAQ](#faq)
    - [Why is the IMatrix not applied everywhere?](#why-is-the-imatrix-not-applied-everywhere)
    - [How do I merge a split GGUF?](#how-do-i-merge-a-split-gguf)

---

## Files

### IMatrix

Status: ✅ Available

Link: [here](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/imatrix.dat)

### Common Quants

| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| [Llama-3.2-3B-Instruct.Q8_0.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q8_0.gguf) | Q8_0 | 3.42GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.Q6_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q6_K.gguf) | Q6_K | 2.64GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.Q4_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q4_K.gguf) | Q4_K | 2.02GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q3_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q3_K.gguf) | Q3_K | 1.69GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q2_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q2_K.gguf) | Q2_K | 1.36GB | ✅ Available | 🟢 IMatrix | 📦 No |

### All Quants

| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| [Llama-3.2-3B-Instruct.BF16.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.BF16.gguf) | BF16 | 6.43GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.FP16.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.FP16.gguf) | F16 | 6.43GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.Q8_0.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q8_0.gguf) | Q8_0 | 3.42GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.Q6_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q6_K.gguf) | Q6_K | 2.64GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.Q5_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q5_K.gguf) | Q5_K | 2.32GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.Q5_K_S.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q5_K_S.gguf) | Q5_K_S | 2.27GB | ✅ Available | ⚪ Static | 📦 No |
| [Llama-3.2-3B-Instruct.Q4_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q4_K.gguf) | Q4_K | 2.02GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q4_K_S.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q4_K_S.gguf) | Q4_K_S | 1.93GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ4_NL.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ4_NL.gguf) | IQ4_NL | 1.92GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ4_XS.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ4_XS.gguf) | IQ4_XS | 1.83GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q3_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q3_K.gguf) | Q3_K | 1.69GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q3_K_L.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q3_K_L.gguf) | Q3_K_L | 1.82GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q3_K_S.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q3_K_S.gguf) | Q3_K_S | 1.54GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ3_M.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ3_M.gguf) | IQ3_M | 1.60GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ3_S.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ3_S.gguf) | IQ3_S | 1.54GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ3_XS.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ3_XS.gguf) | IQ3_XS | 1.48GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ3_XXS.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ3_XXS.gguf) | IQ3_XXS | 1.35GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q2_K.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q2_K.gguf) | Q2_K | 1.36GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.Q2_K_S.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.Q2_K_S.gguf) | Q2_K_S | 1.27GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ2_M.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ2_M.gguf) | IQ2_M | 1.23GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ2_S.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ2_S.gguf) | IQ2_S | 1.15GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ2_XS.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ2_XS.gguf) | IQ2_XS | 1.10GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ2_XXS.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ2_XXS.gguf) | IQ2_XXS | 1.02GB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ1_M.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ1_M.gguf) | IQ1_M | 924.19MB | ✅ Available | 🟢 IMatrix | 📦 No |
| [Llama-3.2-3B-Instruct.IQ1_S.gguf](https://huggingface.co/legraphista/Llama-3.2-3B-Instruct-IMat-GGUF/blob/main/Llama-3.2-3B-Instruct.IQ1_S.gguf) | IQ1_S | 868.16MB | ✅ Available | 🟢 IMatrix | 📦 No |

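
One way to read the Common Quants table is to pick the largest file that still fits a memory budget. The sketch below copies the file sizes from that table; the helper name `pick_quant` and the 1 GB default headroom for context and runtime overhead are assumptions made for illustration, not measured requirements.

```python
# File sizes in GB, copied from the Common Quants table.
COMMON_QUANTS_GB = {
    "Q8_0": 3.42,
    "Q6_K": 2.64,
    "Q4_K": 2.02,
    "Q3_K": 1.69,
    "Q2_K": 1.36,
}


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest common quant whose file fits in the budget, or None.

    headroom_gb is a hypothetical allowance for the KV cache and runtime
    overhead; tune it for your context size and backend.
    """
    usable = budget_gb - headroom_gb
    fitting = {name: size for name, size in COMMON_QUANTS_GB.items() if size <= usable}
    if not fitting:
        return None
    # Larger files keep more precision, so prefer the biggest one that fits.
    return max(fitting, key=fitting.get)
```

For example, `pick_quant(4.0)` leaves 3 GB for the file and selects `Q6_K`.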
## Downloading using huggingface-cli

If you do not have `huggingface-cli` installed:
```
pip install -U "huggingface_hub[cli]"
```
Download the specific file you want:
```
huggingface-cli download legraphista/Llama-3.2-3B-Instruct-IMat-GGUF --include "Llama-3.2-3B-Instruct.Q8_0.gguf" --local-dir ./
```
Large model files are split into multiple chunks. To download all the chunks of a split file to a local folder, run:
```
huggingface-cli download legraphista/Llama-3.2-3B-Instruct-IMat-GGUF --include "Llama-3.2-3B-Instruct.Q8_0/*" --local-dir ./
# see the FAQ for merging split GGUFs
```
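
The same single-file download can also be scripted from Python. This is a minimal sketch that assumes the `huggingface_hub` package (installed by the `pip` command above) and its `hf_hub_download` function; the helper names `quant_filename` and `download_quant` are illustrative and not part of this repository.

```python
REPO_ID = "legraphista/Llama-3.2-3B-Instruct-IMat-GGUF"


def quant_filename(label: str) -> str:
    """Build a file name from a label as it appears in the Filename column."""
    return f"Llama-3.2-3B-Instruct.{label}.gguf"


def download_quant(label: str, local_dir: str = "./") -> str:
    """Download one non-split quant file and return its local path."""
    # Imported lazily so the naming helper works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=quant_filename(label),
        local_dir=local_dir,
    )
```

Calling `download_quant("Q8_0")` fetches the same file as the `huggingface-cli` command above and returns the path it was saved to.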

---

## Inference

### Simple chat template
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024

<|eot_id|><|start_header_id|>user<|end_header_id|>

{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{assistant_response}<|eot_id|><|start_header_id|>user<|end_header_id|>

{next_user_prompt}<|eot_id|>
```

### Chat template with system prompt
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{assistant_response}<|eot_id|><|start_header_id|>user<|end_header_id|>

{next_user_prompt}<|eot_id|>
```
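
Both templates can be produced by one small function. The sketch below is an illustration rather than the tokenizer's own template: `build_prompt` is a hypothetical helper, it assumes a blank line after every header (as in Meta's published Llama 3 prompt format), and the optional trailing assistant header is an addition that asks the model to write the next reply.

```python
HEADER = "<|start_header_id|>{role}<|end_header_id|>\n\n"
EOT = "<|eot_id|>"
# Fixed preamble shown at the top of both templates.
DATE_PREAMBLE = "Cutting Knowledge Date: December 2023\nToday Date: 26 Jul 2024\n\n"


def build_prompt(turns, system_prompt="", add_generation_prompt=True):
    """Render (role, text) turns into a single Llama 3.2 prompt string."""
    parts = ["<|begin_of_text|>"]
    # The system block is always present; system_prompt may be empty.
    parts.append(HEADER.format(role="system") + DATE_PREAMBLE + system_prompt + EOT)
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(HEADER.format(role=role) + text + EOT)
    if add_generation_prompt:
        # Assumption: open an assistant header so generation starts a reply.
        parts.append(HEADER.format(role="assistant"))
    return "".join(parts)
```

With `add_generation_prompt=False` the output ends at the last `<|eot_id|>`, as the templates above do; the resulting string is what the `-p` argument of llama.cpp expects.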

### Llama.cpp
```
llama.cpp/llama-cli -m Llama-3.2-3B-Instruct.Q8_0.gguf --color -i -p "prompt here (according to the chat template)"
```

---

## FAQ

### Why is the IMatrix not applied everywhere?

According to [this investigation](https://www.reddit.com/r/LocalLLaMA/comments/1993iro/ggufs_quants_can_punch_above_their_weights_now/), only the lower-bit quantizations appear to benefit from the imatrix input (based on HellaSwag results).

### How do I merge a split GGUF?

1. Make sure you have `gguf-split` available (recent llama.cpp releases name the binary `llama-gguf-split`)
    - To get hold of `gguf-split`, navigate to https://github.com/ggerganov/llama.cpp/releases
    - Download the appropriate zip for your system from the latest release
    - Unzip the archive and you should be able to find `gguf-split`
2. Locate your GGUF chunks folder (ex: `Llama-3.2-3B-Instruct.Q8_0`)
3. Run `gguf-split --merge Llama-3.2-3B-Instruct.Q8_0/Llama-3.2-3B-Instruct.Q8_0-00001-of-XXXXX.gguf Llama-3.2-3B-Instruct.Q8_0.gguf`
    - Make sure to point `gguf-split` to the first chunk of the split.
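
The merge steps above can be scripted. This is a minimal sketch, not an official tool: `merge_command` is a hypothetical helper that finds the first chunk in a chunks folder and builds the `gguf-split --merge` command from step 3. The default binary name is an assumption; pass whatever name or path your llama.cpp release uses.

```python
from pathlib import Path
import subprocess


def merge_command(chunks_dir, binary="gguf-split"):
    """Build the argument list that merges the chunks in chunks_dir."""
    chunks_dir = Path(chunks_dir)
    # The merge must be pointed at the first chunk (…-00001-of-XXXXX.gguf).
    first = sorted(chunks_dir.glob("*-00001-of-*.gguf"))
    if len(first) != 1:
        raise FileNotFoundError(
            f"expected exactly one first chunk in {chunks_dir}, found {len(first)}"
        )
    # Write the merged file next to the folder, e.g. Llama-3.2-3B-Instruct.Q8_0.gguf
    output = chunks_dir.parent / (chunks_dir.name + ".gguf")
    return [binary, "--merge", str(first[0]), str(output)]


def merge(chunks_dir, binary="gguf-split"):
    """Run the merge and raise if the tool exits with an error."""
    subprocess.run(merge_command(chunks_dir, binary), check=True)
```

Calling `merge("Llama-3.2-3B-Instruct.Q8_0")` then runs the same command as step 3 without having to type the chunk name by hand.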

---

Got a suggestion? Ping me [@legraphista](https://x.com/legraphista)!