Commit 1659817 (verified) by philipjohnbasile · 1 Parent(s): 40bfce6

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -96,6 +96,9 @@ One model, fully local, **verify-everything** — every hat above, on a MacBook.
  python dist/install_glm_dsa_patch.py # patch mlx_lm (venv AND LM Studio's bundled engine)
  GLM_STREAM_EVAL=0 python -m mlx_lm.server --model models/GLM-5.2-q3a4-v4 \
  --adapter-path heal/adapters-v4 # serve (OpenAI-compatible); v2 + heal/adapters also ship
+ # query it — `enable_thinking` toggles the reasoning trace (GLM-specific; off = faster, on = harder problems):
+ curl -s localhost:8080/v1/chat/completions -H 'Content-Type: application/json' \
+ -d '{"messages":[{"role":"user","content":"Write a typed debounce in TypeScript."}],"chat_template_kwargs":{"enable_thinking":true}}'
  # drive the 47-tool agent on your repo:
  python scripts/57_tool_agent.py --repo /path/to/your/repo --apply --task "..." --test "cargo test"
  # speed: try --dsa-block-size 32/64/128 (free, pick fastest). External draft is Metal-unstable here; MTP self-spec is the real path.
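
For readers who would rather script the query than shell out to curl, here is a minimal Python sketch of the same request. It assumes the server started above is listening on localhost:8080 (as in the curl line) and that it answers in the standard OpenAI chat-completions shape. The helper names `build_chat_payload` and `chat` are hypothetical and not part of this repo or of mlx_lm.

```python
import json
import urllib.request

# Assumption: the server is reachable at the same address the curl example uses.
DEFAULT_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_payload(prompt: str, enable_thinking: bool = True) -> dict:
    """Build the same JSON body that the curl example sends (hypothetical helper)."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        # GLM-specific switch taken from the README line: on for harder problems, off for speed.
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }


def chat(prompt: str, enable_thinking: bool = True,
         url: str = DEFAULT_URL, timeout: float = 600.0) -> str:
    """POST the payload and return the assistant's text (hypothetical helper)."""
    body = json.dumps(build_chat_payload(prompt, enable_thinking)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    # Assumes an OpenAI-style response: choices[0].message.content holds the answer.
    return reply["choices"][0]["message"]["content"]
```

With the server running, `chat("Write a typed debounce in TypeScript.", enable_thinking=False)` would send the faster, no-trace variant of the request shown in the diff. Only the standard library is used, so nothing extra needs installing into the venv.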