Mesh LLM

diffusiongemma-26B-A4B-it-Q4_K_M

Distributed GGUF inference package for Mesh LLM

Website GitHub Discord

GGUF layer package for running diffusiongemma-26B-A4B-it-Q4_K_M across a local Mesh LLM cluster.

This package is derived from unsloth/diffusiongemma-26B-A4B-it-GGUF and keeps the original GGUF distribution split into per-layer artifacts for distributed inference.

Highlights

| Run locally | Pool multiple machines | OpenAI-compatible | Package variant |
| --- | --- | --- | --- |
| Private inference on your hardware | Split layers across peers | Serve /v1/chat/completions locally | Q4_K_M layer package |

Model Overview

| Property | Value |
| --- | --- |
| Source model | unsloth/diffusiongemma-26B-A4B-it-GGUF |
| Model id | unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M |
| Family | Gemma |
| Parameter scale | 26B-A4B |
| Quantization | Q4_K_M |
| Layer count | 30 |
| Activation width | 2816 |
| Package size | 16.4 GB |
| Source file | diffusiongemma-26B-A4B-it-Q4_K_M.gguf |
| Package repo | meshllm/diffusiongemma-26B-A4B-it-Q4_K_M-layers |

Recommended Use

  • Local and private inference with Mesh LLM.
  • Multi-machine serving when the full GGUF is too large for one host.
  • OpenAI-compatible chat/completions workflows through Mesh LLM's local API.

For upstream architecture details, chat template guidance, sampling recommendations, license terms, and benchmark notes, see the source model card: unsloth/diffusiongemma-26B-A4B-it-GGUF.

Quickstart

# Run this on each machine that should contribute memory/compute.
mesh-llm serve --model "meshllm/diffusiongemma-26B-A4B-it-Q4_K_M-layers" --split

# Check the mesh and discover the OpenAI-compatible model name.
curl -s http://localhost:3131/api/status
curl -s http://localhost:3131/v1/models

# Send an OpenAI-compatible chat request.
curl -s http://localhost:3131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M",
    "messages": [{"role": "user", "content": "Write a tiny hello-world function in Rust."}],
    "max_tokens": 128
  }'
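
The same requests can be issued from Python. The following is a minimal sketch using only the standard library, not an official Mesh LLM client. The helper names (`build_chat_request`, `model_ids`, `fetch_json`) are hypothetical, and `model_ids` assumes that /v1/models returns the usual OpenAI list shape (`{"data": [{"id": ...}]}`), which this card does not spell out.

```python
import json
import urllib.request

# Host and port are taken from the curl examples above.
BASE_URL = "http://localhost:3131"


def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def model_ids(models_response: dict) -> list[str]:
    """Extract model names, assuming the OpenAI-style list response shape."""
    return [entry["id"] for entry in models_response.get("data", [])]


def fetch_json(request_or_url, timeout: float = 60.0) -> dict:
    """Send a request (or GET a URL) and decode the JSON response body."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as response:
        return json.load(response)
```

With a mesh running, `model_ids(fetch_json(BASE_URL + "/v1/models"))` would list the served model names, and `fetch_json(build_chat_request("unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M", "Write a tiny hello-world function in Rust."))` would return the completion. If the response follows the OpenAI schema, the reply text sits under `choices[0]["message"]["content"]`.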

Package Variant

| Property | Value |
| --- | --- |
| Format | layer-package |
| Canonical source ref | unsloth/diffusiongemma-26B-A4B-it-GGUF@main/diffusiongemma-26B-A4B-it-Q4_K_M.gguf |
| Source revision | main |
| Source SHA-256 | d2ca2c032ebfb23cf2d1794a3465e615c7545634d46b3c30652a26d8b07c4ad3 |
| Skippy ABI | 0.1.25 |
| Package manifest SHA-256 | 39f27bda195fc2278ec4e3f558576c3902def47e500d63734d3b530417b55cc2 |

What Is Included

| Artifact | Path | Contents | SHA-256 |
| --- | --- | --- | --- |
| Manifest | model-package.json | Package schema, source identity, checksums | 39f27bda195fc2278ec4e3f558576c3902def47e500d63734d3b530417b55cc2 |
| Metadata | shared/metadata.gguf | 5 tensors, 25.3 MB | 458d44ddd2f111fd4dd7fd6b7d6aa4f8831c999f715a60b1fe4a0194e80ad256 |
| Embeddings | shared/embeddings.gguf | 6 tensors, 602.8 MB | a543ceb86296e24c7dab5b03791f7dd6546ddf969bf922bb8c9728dd1ab0aba1 |
| Output head | shared/output.gguf | 6 tensors, 25.4 MB | f50a9e5f42769bf62644683d34b8386b84f8f63f73aac1bb2cdfb6fbf0859d3f |
| Transformer layers | layers/layer-*.gguf | 30 layer artifacts, 835 tensors, 15.8 GB | see model-package.json |
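
To get a feel for how the 30 layer artifacts might spread across machines, the sketch below divides them into contiguous, near-equal ranges and estimates the layer memory per peer. This is purely illustrative: it is not Mesh LLM's placement logic, which this card does not describe. The function names are hypothetical, and the estimate assumes every layer is the same size, which the table does not guarantee.

```python
# Figures from the table above.
LAYER_COUNT = 30
LAYERS_SIZE_GB = 15.8


def partition_layers(layer_count: int, peer_count: int) -> list[range]:
    """Split layer indices into contiguous ranges whose sizes differ by at most one."""
    if peer_count < 1 or peer_count > layer_count:
        raise ValueError("peer_count must be between 1 and layer_count")
    base, extra = divmod(layer_count, peer_count)
    ranges = []
    start = 0
    for peer in range(peer_count):
        # The first `extra` peers take one additional layer.
        size = base + (1 if peer < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def estimated_gb(layers: range) -> float:
    """Rough layer memory for one peer, ASSUMING uniform layer size."""
    return len(layers) * LAYERS_SIZE_GB / LAYER_COUNT
```

Under these assumptions, three peers would each hold 10 layers, or roughly 5.3 GB of layer weights, before counting the shared metadata, embeddings, and output head artifacts.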

Validation

This package was generated by the Mesh LLM HF Jobs splitter from mesh-llm ref main. Each artifact is checksummed as it is written, uploaded to this repository, and removed from the job workspace before the next artifact is produced.

skippy-model-package write-package "/source/diffusiongemma-26B-A4B-it-Q4_K_M.gguf" --out-dir "/tmp/meshllm-layer-job-meshllm_diffusiongemma-26B-A4B-it-Q4_K_M-layers-199/package"
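
A downloaded copy can be checked against the SHA-256 values listed under What Is Included. The following is a minimal sketch that covers only the four artifacts whose checksums appear on this card; the per-layer checksums live in model-package.json, whose schema is not documented here, so they are left out. The function names and the `package_dir` layout (mirroring the repository paths) are assumptions.

```python
import hashlib
from pathlib import Path

# Checksums copied from the "What Is Included" table.
EXPECTED_SHA256 = {
    "model-package.json": "39f27bda195fc2278ec4e3f558576c3902def47e500d63734d3b530417b55cc2",
    "shared/metadata.gguf": "458d44ddd2f111fd4dd7fd6b7d6aa4f8831c999f715a60b1fe4a0194e80ad256",
    "shared/embeddings.gguf": "a543ceb86296e24c7dab5b03791f7dd6546ddf969bf922bb8c9728dd1ab0aba1",
    "shared/output.gguf": "f50a9e5f42769bf62644683d34b8386b84f8f63f73aac1bb2cdfb6fbf0859d3f",
}


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large GGUF artifacts never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_package(package_dir) -> dict[str, bool]:
    """Map each known artifact path to True if it exists and its hash matches."""
    root = Path(package_dir)
    return {
        relative: (root / relative).is_file()
        and sha256_of_file(root / relative) == expected
        for relative, expected in EXPECTED_SHA256.items()
    }
```

Any `False` entry in the result of `verify_package("path/to/package")` points to a missing or altered artifact.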
