ZeroXClem-Qwen2.5-7B-HomerFuse-NerdExp
🚀 Overview
ZeroXClem-Qwen2.5-7B-HomerFuse-NerdExp is an experimental merge built on HomerSlerp6-7B that fuses several Qwen2.5-7B-based models to blend reasoning, creativity, and conversational depth. It is designed for adaptability, broad knowledge, and engaging responses across a wide variety of use cases.
🛠 Merge Details
- Merge Method: model_stock
- Base Model: allknowingroger/HomerSlerp6-7B
- Data Type: bfloat16
- Tokenizer Source: allknowingroger/HomerSlerp6-7B
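For readers unfamiliar with the model_stock merge method, here is a simplified sketch of the interpolation it is generally described as using: the fine-tuned weights are averaged, then pulled back toward the base model by a ratio derived from the angle between the fine-tuned deltas. The function name, the flat-list weight representation, and the use of a mean pairwise cosine are assumptions made for illustration; mergekit works on real tensors and may differ in detail.

```python
import math


def model_stock_merge(base, finetuned):
    """Toy Model Stock merge for one flattened weight vector.

    base: list of floats (base model weights).
    finetuned: list of weight vectors, one per fine-tuned model (at least 2).
    Returns (merged_weights, t), where t is the interpolation ratio.
    """
    n = len(finetuned)
    if n < 2:
        raise ValueError("model_stock needs at least two fine-tuned models")

    # Task vectors: how far each fine-tuned model moved from the base.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Mean pairwise cosine similarity between task vectors (assumed estimator).
    cosines = []
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(deltas[i], deltas[j]))
            norm_i = math.sqrt(sum(a * a for a in deltas[i]))
            norm_j = math.sqrt(sum(b * b for b in deltas[j]))
            denom = norm_i * norm_j
            cosines.append(dot / denom if denom else 0.0)
    cos_theta = sum(cosines) / len(cosines)

    # Interpolation ratio: t = N*cos / (1 + (N-1)*cos).
    ratio_denom = 1 + (n - 1) * cos_theta
    if ratio_denom == 0:
        raise ValueError("degenerate angle: interpolation ratio is undefined")
    t = n * cos_theta / ratio_denom

    # Average the fine-tuned weights, then interpolate toward the base.
    average = [sum(column) / n for column in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(average, base)]
    return merged, t


merged, t = model_stock_merge([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(t, merged)
```

When the fine-tuned models moved in unrelated directions (cosine near 0), the ratio approaches 0 and the result stays close to the base model; when they agree (cosine near 1), the result approaches their plain average.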
🔗 Merged Models
This fusion includes carefully selected models to enhance general intelligence, technical depth, and roleplay capabilities:
| Model Name | Description |
|---|---|
| jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0 | A knowledge-rich, uncensored model with deep expertise in multiple domains. |
| bunnycore/Blabbertron-1.0 | A model optimized for free-flowing and expressive conversation. |
| bunnycore/Qwen2.5-7B-Fuse-Exp | Experimental fusion of Qwen2.5-based models for nuanced understanding. |
| Xiaojian9992024/Qwen2.5-Dyanka-7B-Preview | Enhanced context comprehension and complex reasoning capabilities. |
⚙ Configuration
name: ZeroXClem-Qwen2.5-7B-HomerFuse-NerdExp
base_model: allknowingroger/HomerSlerp6-7B
dtype: bfloat16
merge_method: model_stock
models:
- model: jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
- model: bunnycore/Blabbertron-1.0
- model: bunnycore/Qwen2.5-7B-Fuse-Exp
- model: Xiaojian9992024/Qwen2.5-Dyanka-7B-Preview
tokenizer_source: allknowingroger/HomerSlerp6-7B
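If you want to reproduce the merge, the configuration above can be passed to mergekit's `mergekit-yaml` command. The sketch below only builds the command line; the file name `config.yaml`, the output directory, and the helper name `build_merge_command` are assumptions for illustration, and running the merge will download every source model.

```python
import shlex


def build_merge_command(config_path, output_dir, use_cuda=False):
    """Return the mergekit-yaml argument list for a saved config file."""
    command = ["mergekit-yaml", config_path, output_dir]
    if use_cuda:
        # --cuda asks mergekit to do the tensor math on the GPU.
        command.append("--cuda")
    return command


command = build_merge_command("config.yaml", "./Qwen2.5-7B-HomerFuse-NerdExp")
print(shlex.join(command))

# To actually run the merge (requires mergekit to be installed):
# import subprocess
# subprocess.run(command, check=True)
```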
🧠 Why This Model?
✅ Balanced Fusion – A well-calibrated mix of reasoning, factual accuracy, and expressive depth.
✅ Uncensored Knowledge – Suitable for academic, technical, and exploratory conversations.
✅ Enhanced Context Retention – Ideal for long-form discussions and in-depth analysis.
✅ Diverse Applications – Can handle creative writing, roleplay, and problem-solving tasks.
🛠 How to Use
🔥 Ollama (Quick Inference)
You can run the model using Ollama for direct testing:
ollama run hf.co/ZeroXClem/Qwen2.5-7B-HomerFuse-NerdExp-Q4_K_M-GGUF
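Once the model is available in Ollama, you can also query it from Python through Ollama's local HTTP API. This is a minimal sketch that assumes the Ollama server is running on its default port (11434) and that the model has already been pulled with the command above; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed: Ollama's default local chat endpoint.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/ZeroXClem/Qwen2.5-7B-HomerFuse-NerdExp-Q4_K_M-GGUF"


def build_chat_payload(prompt, model=MODEL):
    """Build a non-streaming chat request body for a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def chat(prompt, url=OLLAMA_CHAT_URL):
    """Send the prompt to a running Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]


# Example (needs a running Ollama server):
# print(chat("Who are you?"))
```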
🤗 Hugging Face Transformers (Python)
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_name = "ZeroXClem/Qwen2.5-7B-HomerFuse-NerdExp"

# Load tokenizer & model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Initialize text generation pipeline.
# The model is already loaded in bfloat16 and placed on devices above,
# so the dtype and device_map arguments are not repeated here.
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

# Example prompt
prompt = "Describe the significance of AI ethics in modern technology."

# Generate output
outputs = text_generator(
    prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

# The pipeline returns the prompt followed by the generated continuation
print(outputs[0]["generated_text"])
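The example above sends a raw prompt. For chat-style use, you may prefer to format the conversation with the tokenizer's chat template before generating. The sketch below assumes the tokenizer ships a chat template (as Qwen2.5-based tokenizers typically do); `build_messages` and `generate_reply` are hypothetical helper names, and calling `generate_reply` downloads the full model.

```python
MODEL_NAME = "ZeroXClem/Qwen2.5-7B-HomerFuse-NerdExp"


def build_messages(user_prompt, system_prompt=None):
    """Return a chat message list in the role/content format."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def generate_reply(messages, max_new_tokens=200):
    """Apply the chat template, generate, and return only the new text."""
    # Imported here so build_messages works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so only the reply is decoded.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


messages = build_messages("Who are you?")
# print(generate_reply(messages))
```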
🏆 Performance & Benchmarks
This model is intended to perform well across a variety of domains, including reasoning, mathematics, and conversation. It has not been benchmarked yet; evaluation results will be added once testing is complete.
🔥 Usage Recommendations
For best performance, ensure that you:
- Use the correct tokenizer: allknowingroger/HomerSlerp6-7B
- Fine-tune prompts for logical reasoning with a step-by-step approach.
- Utilize the model in an interactive setting for long-form discussions.
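One way to combine the last two recommendations is to keep a running message history that starts with a step-by-step system prompt. This is only an illustrative sketch: the `Conversation` class and the wording of the system prompt are assumptions, not part of the model or any library.

```python
# Assumed wording; adjust to your task.
STEP_BY_STEP_PROMPT = (
    "Work through the problem step by step, "
    "then give the final answer on its own line."
)


class Conversation:
    """Keeps the full chat history so each turn sees earlier context."""

    def __init__(self, system_prompt=STEP_BY_STEP_PROMPT):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text):
        """Record a user turn and return the history to send to the model."""
        self.messages.append({"role": "user", "content": text})
        return self.messages

    def add_assistant(self, text):
        """Record the model's reply so the next turn can build on it."""
        self.messages.append({"role": "assistant", "content": text})


conversation = Conversation()
history = conversation.add_user("If a train travels 120 km in 1.5 hours, what is its average speed?")
# Pass `history` to your chat pipeline, then store the reply:
# conversation.add_assistant(reply_text)
print(len(history))
```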
🎯 Future Plans
- 🚀 Further optimization for multi-turn dialogues and zero-shot reasoning.
- 🧠 Improving knowledge distillation for factual consistency.
- 🎭 Enhancing character roleplay depth with better expressiveness.
📢 Feedback & Contributions
This is an open project, and your feedback is invaluable!
💬 Leave a review or open a discussion on Hugging Face.
❤️ Acknowledgments
A huge thanks to ALL the contributors & model creators and Hugging Face's mergekit community for pushing the boundaries of AI model merging!