Mesh LLM

Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL

Distributed GGUF inference package for Mesh LLM

GGUF layer package for running Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL across a local Mesh LLM cluster.

This package is derived from unsloth/Mistral-Large-3-675B-Instruct-2512-GGUF. It splits the original GGUF distribution into per-layer artifacts for distributed inference.

Highlights

  • Run locally: Private inference on your hardware.
  • Pool multiple machines: Split layers across peers.
  • OpenAI-compatible: Serve /v1/chat/completions locally.
  • Package variant: UD-Q4_K_XL layer package.

Model Overview

Property          Value
Source model      unsloth/Mistral-Large-3-675B-Instruct-2512-GGUF
Model id          unsloth/Mistral-Large-3-675B-Instruct-2512-GGUF:UD-Q4_K_XL
Family            Mistral
Parameter scale   675B
Quantization      UD-Q4_K_XL
Layer count       61
Activation width  7168
Package size      361.7 GB
Source file       UD-Q4_K_XL/Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL-00001-of-00008.gguf
Package repo      meshllm/Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL-layers

Recommended Use

  • Local and private inference with Mesh LLM.
  • Multi-machine serving when the full GGUF is too large for one host.
  • OpenAI-compatible chat/completions workflows through Mesh LLM's local API.

For upstream architecture details, chat template guidance, sampling recommendations, license terms, and benchmark notes, see the source model card: unsloth/Mistral-Large-3-675B-Instruct-2512-GGUF.

Quickstart

# Run this on each machine that should contribute memory/compute.
mesh-llm serve --model "meshllm/Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL-layers" --split
# Check the mesh and discover the OpenAI-compatible model name.
curl -s http://localhost:3131/api/status
curl -s http://localhost:3131/v1/models
# Send an OpenAI-compatible chat request.
curl -s http://localhost:3131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "unsloth/Mistral-Large-3-675B-Instruct-2512-GGUF:UD-Q4_K_XL",
    "messages": [{"role": "user", "content": "Write a tiny hello-world function in Rust."}],
    "max_tokens": 128
  }'
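
The model name used in the chat request has to match an id reported by /v1/models. As a minimal sketch, assuming the endpoint returns the usual OpenAI-style list shape (an object with a "data" array whose entries carry an "id" field) and that jq is installed, the ids can be pulled out like this. The helper name list_model_ids is hypothetical and not part of Mesh LLM.

```shell
# list_model_ids: read an OpenAI-style model list on stdin and
# print one model id per line. Assumes the {"data":[{"id":...}]} shape.
list_model_ids() {
  jq -r '.data[].id'
}

# Usage against a running node (same port as the commands above):
#   curl -s http://localhost:3131/v1/models | list_model_ids
```

If the response shape differs on your node, inspect the raw JSON from /v1/models first and adjust the jq filter accordingly.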

Package Variant

Property                  Value
Format                    layer-package
Canonical source ref      unsloth/Mistral-Large-3-675B-Instruct-2512-GGUF@f1c24a5a3bda6e04ee5439aa8293613fb3179b92/UD-Q4_K_XL/Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL-00001-of-00008.gguf
Source revision           f1c24a5a3bda6e04ee5439aa8293613fb3179b92
Source SHA-256            e1bc3401ed9afdeca80301e7cac31951bfae99b70b1e49773af11951ebaca63c
Skippy ABI                0.1.22
Package manifest SHA-256  ba8f08a4e91aeb9873b051fc03c9bd73186634614955c68a19c7bf615cc7397b
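
The manifest digest above can be checked against a downloaded copy of model-package.json. The following is a minimal sketch, assuming GNU coreutils sha256sum is available (on macOS, shasum -a 256 produces the same digest format). The helper verify_sha256 is hypothetical and not a Mesh LLM command.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Prints "OK FILE" and returns 0 when the digest matches,
# returns 1 on mismatch and 2 when the file is missing.
verify_sha256() {
  local file="$1" expected="$2" actual
  if [ ! -f "$file" ]; then
    echo "missing file: $file" >&2
    return 2
  fi
  # sha256sum prints "<digest>  <file>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK $file"
  else
    echo "MISMATCH $file: expected $expected, got $actual" >&2
    return 1
  fi
}

# Usage, run from the directory holding the downloaded manifest:
#   verify_sha256 model-package.json \
#     ba8f08a4e91aeb9873b051fc03c9bd73186634614955c68a19c7bf615cc7397b
```

The same helper works for any artifact whose SHA-256 is published in this card or in the manifest.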

What Is Included

Artifact            Path                    Contents                                    SHA-256
Manifest            model-package.json      Package schema, source identity, checksums  ba8f08a4e91aeb9873b051fc03c9bd73186634614955c68a19c7bf615cc7397b
Metadata            shared/metadata.gguf    0 tensors, 8.0 MB                           e4529ab9f36b29b49052790d557c0f2457e1b9585296178ac38ad3829552efd2
Embeddings          shared/embeddings.gguf  1 tensor, 512.0 MB                          71351a7ece52ca01dd0443ac3e9572f72cc092668d3564f629ba1e4b855f7058
Output head         shared/output.gguf      2 tensors, 743.0 MB                         10c5f843054d9885f082ba1b2d44b389274c3bd39ffd20aa141e9740a0e7ddb1
Transformer layers  layers/layer-*.gguf     61 layer artifacts, 1025 tensors, 360.5 GB  see model-package.json
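
For rough capacity planning, the figures above give an average layer artifact size of 360.5 GB / 61, or about 5.91 GB. The sketch below is back-of-envelope arithmetic only: it assumes layers are roughly equal in size, ignores KV cache, the shared artifacts, and runtime overhead, and says nothing about how Mesh LLM actually assigns layers to peers. The helper names and the 128 GB budget are hypothetical.

```shell
# avg_layer_gb TOTAL_GB LAYERS
# Average size of one layer artifact, in GB, to two decimals.
avg_layer_gb() {
  awk -v total="$1" -v n="$2" 'BEGIN { printf "%.2f\n", total / n }'
}

# max_layers_for_budget TOTAL_GB LAYERS BUDGET_GB
# Whole number of average-sized layers that fit in BUDGET_GB.
max_layers_for_budget() {
  awk -v total="$1" -v n="$2" -v budget="$3" \
    'BEGIN { printf "%d\n", int(budget / (total / n)) }'
}

avg_layer_gb 360.5 61               # prints 5.91
max_layers_for_budget 360.5 61 128  # prints 21
```

Under these assumptions a host with 128 GB available for weights could hold at most 21 of the 61 layers, so treat the result as an upper bound and leave headroom.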

Validation

This package was generated by the Mesh LLM HF Jobs splitter from mesh-llm ref main. Each artifact is checksummed as it is written, uploaded to this repository, and removed from the job workspace before the next artifact is produced.

skippy-model-package write-package "/source/UD-Q4_K_XL/Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL-00001-of-00008.gguf" --out-dir "/tmp/meshllm-layer-job-meshllm_Mistral-Large-3-675B-Instruct-2512-UD-Q4_K_XL-layers-198/package"
