Text Generation
Transformers
Safetensors
English
Chinese
glm_moe_dsa
macaron
personal-agent
tool-use
mixture-of-lora
generative-ui
a2ui
glm
conversational
Eval Results
Instructions to use mindlab-research/Macaron-V1-Preview-749B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use mindlab-research/Macaron-V1-Preview-749B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="mindlab-research/Macaron-V1-Preview-749B")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("mindlab-research/Macaron-V1-Preview-749B")
    model = AutoModelForCausalLM.from_pretrained("mindlab-research/Macaron-V1-Preview-749B")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use mindlab-research/Macaron-V1-Preview-749B with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "mindlab-research/Macaron-V1-Preview-749B"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "mindlab-research/Macaron-V1-Preview-749B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/mindlab-research/Macaron-V1-Preview-749B
- SGLang
How to use mindlab-research/Macaron-V1-Preview-749B with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "mindlab-research/Macaron-V1-Preview-749B" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "mindlab-research/Macaron-V1-Preview-749B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "mindlab-research/Macaron-V1-Preview-749B" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "mindlab-research/Macaron-V1-Preview-749B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use mindlab-research/Macaron-V1-Preview-749B with Docker Model Runner:
    docker model run hf.co/mindlab-research/Macaron-V1-Preview-749B
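The vLLM and SGLang snippets above both expose an OpenAI-compatible `/v1/chat/completions` endpoint, so the curl calls can be mirrored from Python with only the standard library. The following is a minimal sketch, not an official client: the helper names `build_chat_request` and `ask` are hypothetical, and the default base URL assumes the vLLM port (8000) shown above; pass `http://localhost:30000` for the SGLang server.

```python
import json
import urllib.request

MODEL_ID = "mindlab-research/Macaron-V1-Preview-749B"


def build_chat_request(prompt: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the same POST request that the curl examples send (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, base_url: str = "http://localhost:8000", timeout: float = 60.0) -> str:
    """Send the request to a running server and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url), timeout=timeout) as resp:
        body = json.load(resp)
    # OpenAI-compatible chat responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server already running, `ask("What is the capital of France?")` sends the same question as the curl examples. Splitting request construction from the network call keeps the payload easy to inspect without a live server.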
Update README.md
Parsing the text into a better one.
README.md
CHANGED
@@ -137,20 +137,20 @@ The headline benchmark suite focuses on personal-agent behavior, daily-life task
 
 | Category | Benchmark | Macaron V1 Preview | GLM 5.1 | GPT 5.4 | Claude Opus 4.6 | Gemini 3.1 Pro | Qwen 3.6 Plus | Minimax 2.7 |
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| Personal Agent Benchmark | Macaron Livingbench | 75.2 | 63.2 | 66.5 | 68.9 | 57.6 | 59.0 | 58.2 |
-| | VitaBench | 59.6 | 56.8 | 48.7 | 53.0 | 55.2 | 47.5 | 52.2 |
-| | VitaBench (Delivery) | 67.0 | 64.2 | 50.0 | 65.0 | 63.0 | 58.0 | 63.0 |
-| | VitaBench (In-Store) | 75.0 | 70.0 | 55.0 | 66.0 | 68.0 | 58.0 | 58.0 |
-| | VitaBench (OTA) | 51.0 | 54.0 | 41.0 | 45.0 | 48.0 | 36.0 | 51.0 |
-| | VitaBench (Cross-shop) | 45.3 | 39.0 | -- | 36.0 | 42.0 | 38.0 | 37.0 |
-| | A2UI-Bench | 75.6 | 61.7 | 74.1 | 67.6 | 71.0 | 69.8 | 54.4 |
-| | A2UI L1 | 89.5 | 72.2 | 82.3 | 81.5 | 85.1 | 84.1 | 75.1 |
-| | A2UI L2 | 67.2 | 54.7 | 71.8 | 59.4 | 64.1 | 59.9 | 46.3 |
-| | A2UI L3 | 65.7 | 54.5 | 65.4 | 57.5 | 59.2 | 60.7 | 34.8 |
-| | PinchBench | 92.5 | 76.6 | 88.4 | 88.9 | 82.9 | 85.9 | 84.5 |
-| General Agent Benchmark | Tau3 Bench | 67.6 | 70.6 | 72.9 | 72.4 | 67.1 | 70.7 | 67.6 |
-| | SWE-bench Verified | 78.1 | 76.4 | 78.2 | 78.2 | 78.8 | 73.4 | 73.8 |
-| | Terminal-Bench 2.0 | 67.4 | 63.5 | 75.1 | 65.4 | 68.5 | 61.6 | 57.0 |
+| Personal Agent Benchmark | Macaron Livingbench | **75.2** | 63.2 | 66.5 | 68.9 | 57.6 | 59.0 | 58.2 |
+| | VitaBench | **59.6** | 56.8 | 48.7 | 53.0 | 55.2 | 47.5 | 52.2 |
+| | VitaBench (Delivery) | **67.0** | 64.2 | 50.0 | 65.0 | 63.0 | 58.0 | 63.0 |
+| | VitaBench (In-Store) | **75.0** | 70.0 | 55.0 | 66.0 | 68.0 | 58.0 | 58.0 |
+| | VitaBench (OTA) | 51.0 | **54.0** | 41.0 | 45.0 | 48.0 | 36.0 | 51.0 |
+| | VitaBench (Cross-shop) | **45.3** | 39.0 | -- | 36.0 | 42.0 | 38.0 | 37.0 |
+| | A2UI-Bench | **75.6** | 61.7 | 74.1 | 67.6 | 71.0 | 69.8 | 54.4 |
+| | A2UI L1 | **89.5** | 72.2 | 82.3 | 81.5 | 85.1 | 84.1 | 75.1 |
+| | A2UI L2 | 67.2 | 54.7 | **71.8** | 59.4 | 64.1 | 59.9 | 46.3 |
+| | A2UI L3 | **65.7** | 54.5 | 65.4 | 57.5 | 59.2 | 60.7 | 34.8 |
+| | PinchBench | **92.5** | 76.6 | 88.4 | 88.9 | 82.9 | 85.9 | 84.5 |
+| General Agent Benchmark | Tau3 Bench | 67.6 | 70.6 | **72.9** | 72.4 | 67.1 | 70.7 | 67.6 |
+| | SWE-bench Verified | 78.1 | 76.4 | 78.2 | 78.2 | **78.8** | 73.4 | 73.8 |
+| | Terminal-Bench 2.0 | 67.4 | 63.5 | **75.1** | 65.4 | 68.5 | 61.6 | 57.0 |
 
 Higher is better for all scores shown in the charts and table.
 
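The change in this diff wraps the highest score of each benchmark row in `**`. A minimal sketch of how that edit could be automated is below; it is an illustration, not the script the authors used. The function name `bold_row_max` and its `first_score_col` parameter are hypothetical, and the sketch assumes the first two columns are labels and that non-numeric cells such as `--` should be left untouched.

```python
def bold_row_max(row: str, first_score_col: int = 2) -> str:
    """Return a markdown table row with its highest numeric score wrapped in ** **."""
    # Drop the outer pipes, then split into trimmed cells.
    cells = [c.strip() for c in row.strip().strip("|").split("|")]

    # Collect the numeric cells; labels, separators ("---") and placeholders ("--") are skipped.
    scores = {}
    for i, cell in enumerate(cells):
        if i < first_score_col:
            continue
        try:
            scores[i] = float(cell.replace("*", ""))  # tolerate already-bolded cells
        except ValueError:
            continue
    if not scores:
        return row  # header or separator row: nothing to bold

    best = max(scores.values())
    out = []
    for i, cell in enumerate(cells):
        if i in scores:
            text = cell.replace("*", "")
            cell = f"**{text}**" if scores[i] == best else text
        out.append(cell)

    # Rebuild the row; empty cells render as a single space between pipes.
    return "|" + "|".join(f" {c} " if c else " " for c in out) + "|"


row = "| | VitaBench (OTA) | 51.0 | 54.0 | 41.0 | 45.0 | 48.0 | 36.0 | 51.0 |"
print(bold_row_max(row))
# | | VitaBench (OTA) | 51.0 | **54.0** | 41.0 | 45.0 | 48.0 | 36.0 | 51.0 |
```

If two models tie for the top score, this sketch bolds both cells. No row in the table above has a tie at the maximum, so the choice does not affect the result here.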