Text Generation
Transformers
Safetensors
mixtral
Mixture of Experts
frankenmoe
Merge
mergekit
lazymergekit
M4-ai/TinyMistral-248M-v2-cleaner
Locutusque/TinyMistral-248M-Instruct
jtatman/tinymistral-v2-pycoder-instuct-248m
Locutusque/TinyMistral-248M-v2-Instruct
Eval Results (legacy)
text-generation-inference
Instructions to use gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help")
model = AutoModelForCausalLM.from_pretrained("gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```bash
docker model run hf.co/gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help
```
- SGLang
How to use gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help with Docker Model Runner:
```bash
docker model run hf.co/gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help
```
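The curl calls above can also be made from Python. The following is a minimal sketch using only the standard library. It assumes a vLLM or SGLang server is already running locally and that the response follows the usual OpenAI-style completions shape (`choices[0].text`). The helper names `build_completion_request` and `complete` are hypothetical and not part of either server's tooling.

```python
import json
import urllib.request

MODEL_ID = "gate369/TinyMistral-248Mx4-MOE-not-tuned-pls-help"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request mirroring the curl example (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text.

    Assumes an OpenAI-style response body with a `choices` list.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]

# Example (requires a running server):
#   print(complete("Once upon a time,"))                                  # vLLM, port 8000
#   print(complete("Once upon a time,", base_url="http://localhost:30000"))  # SGLang
```

Keeping request construction separate from the network call makes it possible to inspect the payload without a server running.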
```yaml
base_model: Locutusque/TinyMistral-248M-v2-Instruct
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: M4-ai/TinyMistral-248M-v2-cleaner
    positive_prompts:
      - "versatile"
      - "helpful"
      - "factual"
      - "integrated"
      - "adaptive"
      - "comprehensive"
      - "balanced"
    negative_prompts:
      - "specialized"
      - "narrow"
      - "focused"
      - "limited"
      - "specific"
  - source_model: Locutusque/TinyMistral-248M-Instruct
    positive_prompts:
      - "creative"
      - "chat"
      - "discuss"
      - "culture"
      - "world"
      - "expressive"
      - "detailed"
      - "imaginative"
      - "engaging"
    negative_prompts:
      - "sorry"
      - "cannot"
      - "factual"
      - "concise"
      - "straightforward"
      - "objective"
      - "dry"
  - source_model: jtatman/tinymistral-v2-pycoder-instuct-248m
    positive_prompts:
      - "analytical"
      - "accurate"
      - "logical"
      - "knowledgeable"
      - "precise"
      - "calculate"
      - "compute"
      - "solve"
      - "work"
      - "python"
      - "javascript"
      - "programming"
      - "algorithm"
      - "tell me"
      - "assistant"
    negative_prompts:
      - "creative"
      - "abstract"
      - "imaginative"
      - "artistic"
      - "emotional"
      - "mistake"
      - "inaccurate"
  - source_model: Locutusque/TinyMistral-248M-v2-Instruct
    positive_prompts:
      - "instructive"
      - "clear"
      - "directive"
      - "helpful"
      - "informative"
    negative_prompts:
      - "exploratory"
      - "open-ended"
      - "narrative"
      - "speculative"
      - "artistic"
```
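Since the router is not tuned, the prompt lists in the config above are the only signal separating the four experts. As I understand mergekit's `hidden` gate mode, it initializes the gates from hidden-state representations of those prompts. The sketch below shows one possible way to check how the lists relate to each other. The `EXPERTS` dictionary is a hand-copied mirror of the config, and `shared_positive_prompts` and `contrasting_prompts` are hypothetical helpers, not part of mergekit.

```python
# Prompt lists copied by hand from the config above, keyed by source_model.
EXPERTS = {
    "M4-ai/TinyMistral-248M-v2-cleaner": {
        "positive": ["versatile", "helpful", "factual", "integrated",
                     "adaptive", "comprehensive", "balanced"],
        "negative": ["specialized", "narrow", "focused", "limited", "specific"],
    },
    "Locutusque/TinyMistral-248M-Instruct": {
        "positive": ["creative", "chat", "discuss", "culture", "world",
                     "expressive", "detailed", "imaginative", "engaging"],
        "negative": ["sorry", "cannot", "factual", "concise",
                     "straightforward", "objective", "dry"],
    },
    "jtatman/tinymistral-v2-pycoder-instuct-248m": {
        "positive": ["analytical", "accurate", "logical", "knowledgeable",
                     "precise", "calculate", "compute", "solve", "work",
                     "python", "javascript", "programming", "algorithm",
                     "tell me", "assistant"],
        "negative": ["creative", "abstract", "imaginative", "artistic",
                     "emotional", "mistake", "inaccurate"],
    },
    "Locutusque/TinyMistral-248M-v2-Instruct": {
        "positive": ["instructive", "clear", "directive", "helpful", "informative"],
        "negative": ["exploratory", "open-ended", "narrative",
                     "speculative", "artistic"],
    },
}


def shared_positive_prompts(experts):
    """Map each prompt that is positive for more than one expert to those experts."""
    owners = {}
    for name, prompts in experts.items():
        for prompt in prompts["positive"]:
            owners.setdefault(prompt, []).append(name)
    return {p: names for p, names in owners.items() if len(names) > 1}


def contrasting_prompts(experts):
    """List (prompt, positive_expert, negative_expert) where two experts disagree."""
    triples = []
    for pos_name, pos in experts.items():
        for neg_name, neg in experts.items():
            if pos_name == neg_name:
                continue
            for prompt in pos["positive"]:
                if prompt in neg["negative"]:
                    triples.append((prompt, pos_name, neg_name))
    return triples


for prompt, names in shared_positive_prompts(EXPERTS).items():
    print(f"shared positive: {prompt!r} -> {names}")
for prompt, pos_name, neg_name in contrasting_prompts(EXPERTS):
    print(f"{prompt!r}: positive for {pos_name}, negative for {neg_name}")
```

A prompt that is positive for one expert and negative for another is intended to push routing apart. A prompt that is positive for two experts gives the router nothing to distinguish them by.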