--- license: other tags: - gguf - glm - pruned - quantized base_model: zai-org/GLM-5 --- # GLM-5 Pruned Q4_K_M GGUF Pruned and quantized version of [GLM-5](https://huggingface.co/zai-org/GLM-5) in GGUF format. ## Model Details - **Base Model**: GLM-5 by Zhipu AI / Z.AI - **Quantization**: Q4_K_M (4-bit, medium quality) - **Pruning**: Pruned variant for reduced size - **Format**: GGUF (compatible with llama.cpp, ollama, etc.) - **File Size**: ~218 GB ## Usage With llama.cpp: ```bash llama-server --model GLM-5-pruned-Q4_K_M.gguf --n-gpu-layers 999 --ctx-size 8192 ``` ## Notes This is a community upload of a pruned + quantized GLM-5 model. Requires significant RAM/VRAM due to the large MoE architecture.