Instructions to use osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit with Adapters:

from adapters import AutoAdapterModel

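# Replace "undefined" with the base checkpoint the adapter was trained on
# (the model tree on this page lists meta-llama/Llama-3.2-3B-Instruct as the base model).
model = AutoAdapterModel.from_pretrained("undefined")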
model.load_adapter("osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit", set_active=True)

MLX

How to use osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

Pi

How to use osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit"
        }
      ]
    }
  }
}
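
If `~/.pi/agent/models.json` already contains other providers, pasting the snippet over it would drop them. The following is a minimal sketch of merging the `mlx-lm` entry into an existing file instead. The helper names `add_mlx_provider` and `update_models_file` are hypothetical and not part of Pi; the keys and values are copied from the configuration above.

```python
import json
from pathlib import Path

MODEL_ID = "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit"


def add_mlx_provider(config: dict, model_id: str = MODEL_ID,
                     base_url: str = "http://localhost:8080/v1") -> dict:
    """Return a copy of config with the mlx-lm provider added or replaced."""
    merged = dict(config)
    providers = dict(merged.get("providers", {}))
    # Same fields as the JSON shown above; other providers are left untouched.
    providers["mlx-lm"] = {
        "baseUrl": base_url,
        "api": "openai-completions",
        "apiKey": "none",
        "models": [{"id": model_id}],
    }
    merged["providers"] = providers
    return merged


def update_models_file(path: Path) -> None:
    """Read the JSON file if it exists, merge the provider, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_mlx_provider(existing), indent=2) + "\n")
```

Calling `update_models_file(Path.home() / ".pi" / "agent" / "models.json")` would apply the change to the path named above.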

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit

Run Hermes

hermes

MLX LM

How to use osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit"
# Calling the OpenAI-compatible server with curl
# (mlx_lm.server listens on port 8080 unless --port is given)
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
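
The same request can be sent from Python with only the standard library. This is a sketch, assuming the server listens on port 8080 (the port used in the Pi and Hermes configurations above) and returns the usual OpenAI chat-completions response shape. The helper names `build_chat_request`, `extract_reply`, and `chat` are illustrative and not part of mlx-lm.

```python
import json
import urllib.request

MODEL_ID = "osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit"


def build_chat_request(user_message: str,
                       base_url: str = "http://localhost:8080/v1",
                       model: str = MODEL_ID) -> urllib.request.Request:
    """Build a POST request equivalent to the curl command above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style chat completion."""
    return response_body["choices"][0]["message"]["content"]


def chat(user_message: str, **kwargs) -> str:
    """Send one user message to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(user_message, **kwargs)) as resp:
        return extract_reply(json.load(resp))
```

With the server running, `print(chat("Hello"))` sends the request. Pass `base_url=...` if the server was started on a different host or port.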

Nidum-Llama-3.2-3B-Uncensored-MLX-8bit

Welcome to Nidum!

At Nidum, our mission is to bring cutting-edge AI capabilities to everyone with unrestricted access to innovation. With Nidum-Llama-3.2-3B-Uncensored-MLX-8bit, you get an optimized, efficient, and versatile AI model for diverse applications.

Discover Nidum's Open-Source Projects on GitHub: https://github.com/NidumAI-Inc

Key Features

Efficient and Compact: Quantized to 8 bits in MLX format, which reduces memory demands compared with 16-bit weights.
Wide Applicability: Suitable for technical problem-solving, educational content, and conversational tasks.
Advanced Context Awareness: Handles long-context conversations with exceptional coherence.
Streamlined Integration: Optimized for use with the mlx-lm library for effortless development.
Unrestricted Responses: Offers uncensored answers across all supported domains.

How to Use

To use Nidum-Llama-3.2-3B-Uncensored-MLX-8bit, install the mlx-lm library and follow these steps:

Installation

pip install mlx-lm

Usage

from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)

# Print the response
print(response)
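
The usage example above handles a single prompt. For a multi-turn conversation, one approach is to keep the message history and re-apply the chat template on every turn. The sketch below assumes this pattern. `build_chat_prompt` and `chat_turn` are hypothetical helpers, and the generation function is passed in as `generate_fn` so that `mlx_lm.generate` can be supplied by the caller.

```python
def build_chat_prompt(tokenizer, history):
    """Format the full message history into a single prompt string."""
    if getattr(tokenizer, "chat_template", None) is not None:
        return tokenizer.apply_chat_template(
            history, tokenize=False, add_generation_prompt=True
        )
    # Fallback for tokenizers without a chat template (plain role-tagged text).
    lines = [f"{m['role']}: {m['content']}" for m in history]
    return "\n".join(lines) + "\nassistant:"


def chat_turn(model, tokenizer, history, user_message, generate_fn, max_tokens=256):
    """Append the user message, generate a reply, and record it in history."""
    history.append({"role": "user", "content": user_message})
    prompt = build_chat_prompt(tokenizer, history)
    reply = generate_fn(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
    history.append({"role": "assistant", "content": reply})
    return reply
```

With the model loaded as above, a session might look like `history = []` followed by `chat_turn(model, tokenizer, history, "hello", generate)`, repeated for each new user message.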

About the Model

The nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit model, converted using mlx-lm version 0.19.2, brings:

Memory Efficiency: Tailored for systems with limited hardware.
Performance Optimization: Matches the capabilities of the original model while delivering faster inference.
Plug-and-Play: Easily integrates with the mlx-lm library for deployment ease.
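
As a rough illustration of the memory saving, the weight footprint can be estimated from the parameter count and the bits stored per weight. The figures below are back-of-the-envelope assumptions, not measurements: a rounded 3 billion parameters, and group-wise quantization with a group size of 64 where each group stores a 16-bit scale and a 16-bit bias (32 extra bits per group).

```python
def weight_memory_gb(num_params, bits_per_weight, group_size=None,
                     overhead_bits_per_group=32):
    """Estimate weight storage in GB (1 GB = 1e9 bytes)."""
    effective_bits = bits_per_weight
    if group_size:
        # Per-group scale and bias, amortized over the weights in the group.
        effective_bits += overhead_bits_per_group / group_size
    return num_params * effective_bits / 8 / 1e9


NUM_PARAMS = 3e9  # assumption: rounded from the "3B" in the model name

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
q8_gb = weight_memory_gb(NUM_PARAMS, 8, group_size=64)

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"8-bit weights:  ~{q8_gb:.1f} GB")
```

Under these assumptions the 8-bit weights need roughly half the memory of 16-bit weights. Activations and the KV cache add to this at inference time.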

Use Cases

Problem Solving in Tech and Science
Educational and Research Assistance
Creative Writing and Brainstorming
Extended Dialogues
Uninhibited Knowledge Exploration

Datasets and Fine-Tuning

Derived from Nidum-Llama-3.2-3B-Uncensored, the MLX-8bit version inherits:

Uncensored Fine-Tuning: Delivers detailed and open-ended responses.
RAG-Based Optimization: Enhances retrieval-augmented generation for data-driven tasks.
Math Reasoning Support: Precise mathematical computations and explanations.
Long-Context Training: Ensures relevance and coherence in extended conversations.

Quantized Model Download

The MLX-8bit format balances memory use against output quality by storing the weights in 8 bits.

Benchmark

Benchmark	Metric	LLaMA 3B	Nidum 3B	Observation
GPQA	Exact Match (Flexible)	0.3	0.5	Nidum 3B achieves notable improvement in generative tasks.
GPQA	Accuracy	0.4	0.5	Demonstrates strong performance, especially in zero-shot tasks.
HellaSwag	Accuracy	0.3	0.4	Excels in common-sense reasoning tasks.
HellaSwag	Normalized Accuracy	0.3	0.4	Strong contextual understanding in sentence completion tasks.
HellaSwag	Normalized Accuracy (Stderr)	0.15275	0.1633	Standard error of the normalized accuracy; slightly larger for Nidum 3B.
HellaSwag	Accuracy (Stderr)	0.15275	0.1633	Standard error of the accuracy; slightly larger for Nidum 3B.
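
The standard errors in the table help gauge how much weight the differences can bear. The sketch below assumes each accuracy is the mean of per-example 0/1 scores and that the standard error is the sample standard deviation divided by the square root of the sample count. The sample count is not stated in this card; under these assumptions, 10 examples would reproduce the reported values.

```python
import math


def binary_stderr(num_correct: int, n: int) -> float:
    """Standard error of the mean for n scores that are each 0 or 1."""
    p = num_correct / n
    # Sample variance of 0/1 data (n - 1 in the denominator).
    sample_var = n * p * (1 - p) / (n - 1)
    return math.sqrt(sample_var / n)


# Assumption: 10 evaluation examples (not stated in the card).
print(round(binary_stderr(3, 10), 5))  # accuracy 0.3
print(round(binary_stderr(4, 10), 4))  # accuracy 0.4
```

Standard errors of about 0.15 to 0.16 are larger than the 0.1 gap between the two HellaSwag accuracies, so the comparison is best read as indicative.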

Insights

High Performance, Low Resource: The MLX-8bit format is ideal for environments with limited memory and processing power.
Seamless Integration: Designed for smooth integration into lightweight systems and workflows.

Contributing

Join us in enhancing the MLX-8bit model's capabilities. Contact us for collaboration opportunities.

Contact

For questions, support, or feedback, email info@nidum.ai.

Experience the Future

Harness the power of Nidum-Llama-3.2-3B-Uncensored-MLX-8bit for a perfect blend of performance and efficiency.

Safetensors

Model size: 3B params
Tensor type: F16

MLX

Hardware compatibility

Quantized

Model tree for osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit

Base model: meta-llama/Llama-3.2-3B-Instruct
Adapter: osmapi/Nidum-Llama-3.2-3B-Uncensored
Adapter (5): this model

Collection including osmapi/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit

Nidum Uncensored MLX (collection, 2 items, updated Dec 5, 2024)