cicdatopea committed on Aug 28, 2025

Commit 982160b (verified) · 1 Parent(s): df7e7e8

Upload folder using huggingface_hub

Files changed (7)

.gitattributes +7 -0
DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00001-of-00005.gguf +3 -0
DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00002-of-00005.gguf +3 -0
DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00003-of-00005.gguf +3 -0
DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00004-of-00005.gguf +3 -0
DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00005-of-00005.gguf +3 -0
README.md +133 -0

.gitattributes CHANGED

@@ -33,3 +33,10 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00001-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00002-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00003-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00004-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00005-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3-235B-A22B-128x10B-Q2_K_S-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3-235B-A22B-128x10B-Q2_K_S-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text

DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00001-of-00005.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:65358b14d49ba297edcb7422389f9c580722d6a885d72d0e408fe3bc139b4a3e
+size 49636719904

DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00002-of-00005.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:659ce1e2d50d9af4bd67417a76121bdc8c52d4fcb503a76125d94b37a756974d
+size 49905690912

DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00003-of-00005.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fdf98c59acefdf78aa1cbbe0ceb3fdbe2cb27a720c89aba5097bc20f1ce11d65
+size 49905690912

DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00004-of-00005.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:155d9fb13632ec5717ed592aa4d796ebddf52bb4f9c6059cf42e42b1e99a2de1
+size 49905690912

DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00005-of-00005.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:49296350852180eaf72fd8e6faecf2534785b1f885da199643bff4059434dde1
+size 25587383552
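
The five pointer files above record each shard's SHA-256 digest and byte size, which is enough to check a local download. What follows is a minimal sketch, assuming the shards have already been fetched to disk. `parse_lfs_pointer` and `verify_against_pointer` are hypothetical helper names used for illustration, not part of git-lfs or huggingface_hub.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into (sha256_hex, size_in_bytes)."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # tolerate diff-style "+" prefixes
        if line:
            key, _, value = line.partition(" ")
            fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def verify_against_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and digest."""
    digest, size = parse_lfs_pointer(pointer_text)
    if os.path.getsize(path) != size:
        return False  # cheap check first; skips hashing on a size mismatch
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so a ~50 GB shard never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == digest


# Sizes copied from the five pointers in this commit.
SHARD_SIZES = [49636719904, 49905690912, 49905690912, 49905690912, 25587383552]
TOTAL_BYTES = sum(SHARD_SIZES)
```

Summing the recorded sizes gives 224,941,176,192 bytes, so a full download needs roughly 225 GB of free disk space.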

README.md ADDED

	@@ -0,0 +1,133 @@

+---
+license: mit
+base_model:
+- deepseek-ai/DeepSeek-R1-0528
+---
+## Model Details
+This model is a mixed-precision GGUF (q2_k_s) quantization of [deepseek-ai/DeepSeek-R1-0528](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528), generated with the [intel/auto-round](https://github.com/intel/auto-round) algorithm.
+See the `Generate the model` section for details.
+**In theory, the [v1 model](https://huggingface.co/Intel/DeepSeek-R1-0528-q2ks-mixed-AutoRound-inc-v1) should perform better than this one, but we don’t yet have accuracy metrics to confirm it.**
+## How To Use
+~~~bash
+./llama-cli --model ./DeepSeek-R1-0528-hf-256x20B-Q2_K_S-00001-of-00005.gguf --prompt "<｜User｜>Create a Flappy Bird game in Python.<｜Assistant｜>"
+~~~
+### llama.cpp Inference
+~~~python
+"""
+Create a Flappy Bird game in Python.<think>
+We are going to create a simple Flappy Bird game using Pygame.
+ The game will have:
+   - A bird (represented by a circle or an image) that can flap (jump) when the space bar is pressed.
+   - Pipes (obstacles) that move from the right to the left of the screen.
+   - The bird must navigate through the gaps between the pipes without colliding.
+   - The game ends if the bird collides with a pipe or the ground, or if it goes off the top of the screen.
+   - The score increases each time the bird passes a pair of pipes.
+ Steps:
+ 1. Initialize Pygame and set up the display.
+ 2. Define the game constants (screen size, colors, gravity, flap strength, pipe speed, etc.).
+ 3. Create the bird: position, velocity, and a method for jumping.
+ 4. Create the pipes: manage their positions and movement. We'll have a list of pipes.
+ 5. Implement collision detection between the bird and the pipes, and between the bird and the ground/top.
+ 6. Implement the main game loop: handle events, update positions, check collisions, draw everything.
+ We'll use a simple representation for the bird (a circle) and the pipes (rectangles).
+ Let's code accordingly.
+</think>
+# Flappy Bird Game in Python using Pygame
+Here's a complete implementation of the Flappy Bird game using Pygame:
+```python
+import pygame
+import random
+import sys
+# Initialize pygame
+pygame.init()
+# Game constants
+WIDTH, HEIGHT = 400, 600
+FPS = 60
+GRAVITY = 0.5
+FLAP_STRENGTH = -10
+PIPE_SPEED = 5
+PIPE_GAP = 200
+PIPE_SPAWN_FREQUENCY = 1500  # milliseconds
+GROUND_HEIGHT = 100
+# Colors
+SKY_BLUE = (113, 197, 207)
+GREEN = (111, 196, 70)
+DARK_GREEN = (76, 145, 65)
+BIRD_YELLOW = (255, 221, 45)
+BIRD_ORANGE = (255, 150, 0)
+WHITE = (255, 255, 255)
+BLACK = (0, 0, 0)
+"""
+# (remaining output omitted: too long to copy)
+~~~
+## Generate the model
+Requires auto-round>0.5.1.
+700 GB of RAM is required; 30 GB of VRAM is optional.
+~~~python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from auto_round import AutoRound
+model_name = "/models/DeepSeek-R1-0528-hf"  # must be a BF16 model
+model = AutoModelForCausalLM.from_pretrained(model_name,
+                                             device_map="cpu", torch_dtype="auto")
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+# Per-layer overrides: 8 bits for lm_head, 4 bits for every Linear layer that is
+# not a routed expert (shared experts included). Layers not listed here are left
+# to the default of the target format.
+layer_config = {}
+for n, m in model.named_modules():
+    if n == "lm_head":
+        layer_config[n] = {"bits": 8}
+    elif isinstance(m, torch.nn.Linear) and ("expert" not in n or "shared_experts" in n):
+        layer_config[n] = {"bits": 4}
+autoround = AutoRound(model, tokenizer, iters=0, layer_config=layer_config, batch_size=8, nsamples=512)
+autoround.quantize_and_save("/models/DeepSeek-R1-0528-q2ks", format="gguf:q2_k_s")
+~~~
+## Ethical Considerations and Limitations
+The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
+Therefore, before deploying any applications of the model, developers should perform safety testing.
+## Caveats and Recommendations
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+Here is a useful link to learn more about Intel's AI software:
+- [Intel Neural Compressor](https://github.com/intel/neural-compressor)
+## Disclaimer
+The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
+## Cite
+@article{cheng2023optimize,
+  title={Optimize weight rounding via signed gradient descent for the quantization of llms},
+  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
+  journal={arXiv preprint arXiv:2309.05516},
+  year={2023}
+}
+[arxiv](https://arxiv.org/abs/2309.05516) [github](https://github.com/intel/auto-round)
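
The layer-selection rule in the README's `Generate the model` script can be checked without loading the 700 GB model. The sketch below mirrors that rule as a plain function over module names. It assumes the names follow the usual Hugging Face DeepSeek-V3 layout; the example names are illustrative and were not read from the real checkpoint. `bits_override` is a hypothetical helper, not part of auto-round.

```python
def bits_override(name, is_linear):
    """Mirror the README rule: return the per-layer bit override, or None
    when the layer is not added to layer_config."""
    if name == "lm_head":
        return 8
    if is_linear and ("expert" not in name or "shared_experts" in name):
        return 4
    return None


# Illustrative module names (assumed, not taken from the actual model).
examples = [
    ("lm_head", True),
    ("model.layers.0.self_attn.o_proj", True),
    ("model.layers.3.mlp.shared_experts.up_proj", True),
    ("model.layers.3.mlp.experts.17.up_proj", True),
    ("model.layers.3.input_layernorm", False),
]
for name, is_linear in examples:
    print(name, "->", bits_override(name, is_linear))
```

Under these assumptions only the routed experts and non-Linear modules return `None`. Attention projections and shared experts are pinned to 4 bits and `lm_head` to 8, which is presumably what "mixed" refers to in the model name.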