Mesh LLM

DeepSeek-R1-Distill-Qwen-14B-Q4_K_M

Distributed GGUF inference package for Mesh LLM

GGUF layer package for running DeepSeek-R1-Distill-Qwen-14B-Q4_K_M across a local Mesh LLM cluster.

This package is derived from unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF. It repackages the original GGUF file as per-layer artifacts so that the layers can be distributed across machines for inference.

Highlights

  • Run locally: private inference on your hardware.
  • Pool multiple machines: split layers across peers.
  • OpenAI-compatible: serve /v1/chat/completions locally.
  • Package variant: Q4_K_M layer package.

Model Overview

| Property | Value |
| --- | --- |
| Source model | unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF |
| Model id | unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF:Q4_K_M |
| Family | DeepSeek |
| Parameter scale | 14B |
| Quantization | Q4_K_M |
| Layer count | 48 |
| Activation width | 5120 |
| Package size | 8.6 GB |
| Source file | DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf |
| Package repo | meshllm/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M-layers |

Recommended Use

  • Local and private inference with Mesh LLM.
  • Multi-machine serving when the full GGUF is too large for one host.
  • OpenAI-compatible chat/completions workflows through Mesh LLM's local API.

For upstream architecture details, chat template guidance, sampling recommendations, license terms, and benchmark notes, see the source model card: unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF.

Quickstart

# Run this on each machine that should contribute memory/compute.
mesh-llm serve --model "meshllm/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M-layers" --split
# Check the mesh and discover the OpenAI-compatible model name.
curl -s http://localhost:3131/api/status
curl -s http://localhost:3131/v1/models
# Send an OpenAI-compatible chat request, using the model id reported by /v1/models.
curl -s http://localhost:3131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF:Q4_K_M",
    "messages": [{"role": "user", "content": "Write a tiny hello-world function in Rust."}],
    "max_tokens": 128
  }'
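
The responses from the two OpenAI-compatible endpoints above are JSON, so a couple of small helpers can make them easier to read in a terminal. The following is a minimal sketch, not part of Mesh LLM: the function names `model_ids` and `reply_text` and the `MESH_URL` variable are hypothetical, it assumes the responses follow the usual OpenAI shapes (`data[].id` and `choices[0].message.content`), and it assumes `python3` is on the PATH for JSON parsing.

```shell
#!/usr/bin/env bash
# Sketch only: assumes standard OpenAI response shapes and python3 on PATH.
set -euo pipefail

# Base URL of the local API; the port comes from the Quickstart above.
MESH_URL="${MESH_URL:-http://localhost:3131}"

# Print every model id from a /v1/models response read on stdin.
model_ids() {
  python3 -c 'import json, sys
for entry in json.load(sys.stdin).get("data", []):
    print(entry["id"])'
}

# Print the assistant text from a /v1/chat/completions response read on stdin.
reply_text() {
  python3 -c 'import json, sys
print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
}

# Usage against a running mesh (left commented so the helpers can be sourced safely):
# curl -s "$MESH_URL/v1/models" | model_ids
```

With a mesh running, piping the chat request from the Quickstart into `reply_text` should print only the generated message instead of the full JSON envelope.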

Package Variant

| Property | Value |
| --- | --- |
| Format | layer-package |
| Canonical source ref | unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF@main/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf |
| Source revision | main |
| Source SHA-256 | 67a7933cf2ad596a393c8e13b30bc4da2d50b283e250b78554aed18817eca31c |
| Skippy ABI | 0.1.25 |
| Package manifest SHA-256 | ed42abf28ffba246e1395db3ea7680508ba5bafe0d3458c743474043f52fcaf7 |
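
The SHA-256 values listed above can be used to check a downloaded file before serving it. The following is a minimal sketch of one way to do that by hand: `verify_sha256` is a hypothetical helper name, it assumes GNU coreutils `sha256sum` is available (on macOS, `shasum -a 256` is the usual equivalent), and it assumes the file has already been downloaded to the current directory. Mesh LLM may well perform its own verification; this only illustrates the manual check.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU coreutils sha256sum and a locally downloaded file.
set -euo pipefail

# Return 0 when the file's SHA-256 matches the expected hex digest, non-zero otherwise.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  if [ "$actual" = "$expected" ]; then
    echo "OK  $file"
  else
    echo "MISMATCH  $file (got $actual)" >&2
    return 1
  fi
}

# Example, using the package manifest digest listed above:
# verify_sha256 model-package.json ed42abf28ffba246e1395db3ea7680508ba5bafe0d3458c743474043f52fcaf7
```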

What Is Included

| Artifact | Path | Contents | SHA-256 |
| --- | --- | --- | --- |
| Manifest | model-package.json | Package schema, source identity, checksums | ed42abf28ffba246e1395db3ea7680508ba5bafe0d3458c743474043f52fcaf7 |
| Metadata | shared/metadata.gguf | 0 tensors, 5.7 MB | a3d0588bcd39abad3a371f7a0be026cd0d3c85204a0d2d924887a85a3c724104 |
| Embeddings | shared/embeddings.gguf | 1 tensor, 423.3 MB | 75439acb3fb299b37dc2cef8f579ceacc24d2b503b6f6cd2ea77ec829181392e |
| Output head | shared/output.gguf | 2 tensors, 614.8 MB | aafaa3e7bb15175b51254de0a38bc2554f78a85100260d70f26678b5be2126c3 |
| Transformer layers | layers/layer-*.gguf | 48 layer artifacts, 576 tensors, 7.6 GB | see model-package.json |
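
The sizes above can be cross-checked with a little arithmetic, which also gives a rough sense of how much memory each layer contributes. The following is a back-of-envelope sketch under stated assumptions: it treats the listed sizes as decimal units (1 GB = 1000 MB), assumes all layers are about the same size, and counts weights only. The `layers_that_fit` helper is hypothetical and does not describe how Mesh LLM actually assigns layers to peers; it ignores the KV cache, the shared artifacts, and runtime overhead.

```shell
#!/usr/bin/env bash
# Back-of-envelope arithmetic from the table above; decimal units are an assumption.
set -euo pipefail

LAYERS=48            # layer artifacts
LAYER_TENSORS=576    # tensors across all layer artifacts
LAYERS_MB=7600       # 7.6 GB of transformer layers

# Metadata + embeddings + output head, in MB.
SHARED_MB="$(awk 'BEGIN { printf "%.1f", 5.7 + 423.3 + 614.8 }')"

tensors_per_layer=$(( LAYER_TENSORS / LAYERS ))
mb_per_layer="$(awk -v total="$LAYERS_MB" -v n="$LAYERS" 'BEGIN { printf "%.0f", total / n }')"
total_gb="$(awk -v s="$SHARED_MB" -v l="$LAYERS_MB" 'BEGIN { printf "%.1f", (s + l) / 1000 }')"

# Hypothetical helper: how many average-sized layers fit in a weights-only budget (MB).
layers_that_fit() {
  awk -v b="$1" -v total="$LAYERS_MB" -v n="$LAYERS" 'BEGIN { print int(b * n / total) }'
}

echo "tensors per layer: $tensors_per_layer"
echo "average layer size: ${mb_per_layer} MB"
echo "shared artifacts: ${SHARED_MB} MB"
echo "package total: ${total_gb} GB"
```

Under these assumptions the script reports 12 tensors per layer, about 158 MB per layer, and a total of 8.6 GB, which agrees with the package size listed in the Model Overview. As an illustration, `layers_that_fit 4000` yields 25, meaning a peer with roughly 4 GB free for weights could hold about half of the layers.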

Validation

Generated by the Mesh LLM HF Jobs splitter from mesh-llm ref main. Each artifact is checksummed as it is written, uploaded to this repository, and removed from the job workspace before the next artifact is produced.

skippy-model-package write-package "/source/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf" --out-dir "/tmp/meshllm-layer-job-meshllm_DeepSeek-R1-Distill-Qwen-14B-Q4_K_M-layers-193/package"
