tags:
  - gguf
  - llama.cpp
  - unsloth
  - mind_call
  - function_call
datasets:
  - frshafi/mind_call
language:
  - en
base_model:
  - meta-llama/Llama-3.2-3B-Instruct
metrics:
  - accuracy: 88%

health_function_call_llama3.2_3b_gguf: GGUF

A fine-tuned Llama 3.2 3B GGUF model designed for structured function calling on healthcare edge devices. It is trained to convert natural-language health queries into JSON-based function calls.

Base Model: Llama 3.2 3B Instruct (meta-llama/Llama-3.2-3B-Instruct)

Fine Tuning: Parameter-efficient fine-tuning (PEFT). With adapters on all linear layers (Q, K, V, O, gate, up, down), the model learned the query-to-function mapping while keeping the adapter footprint to a tiny 10.5 MB.

Quantization: Exported to GGUF (Q4_K_M) format.

Dataset: The model is trained on the MindCall Dataset, a curated synthetic collection of 5,000+ high-fidelity health interaction pairs.

🚀 Key Features

  • Converts user queries → structured API calls
  • Lightweight GGUF format (runs locally via llama.cpp)
  • Optimized for deterministic outputs (low temperature)
  • Supports reasoning via <think> tags

📦 Model Files

  • Llama-3.2-3B-Instruct.Q4_K_M.gguf

⚡ Quick Start (Python)

Install dependencies

pip install llama-cpp-python huggingface_hub

Load the model

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ramgovindv/health_function_call_llama3.2_3b_gguf",
    filename="Llama-3.2-3B-Instruct.Q4_K_M.gguf",
)

Inference

query = "I am feeling dizzy for 2 days"

prompt = f"""
You are an API generator.
Return JSON in this format:
{{
"name": "function_name",
"parameters": {{
    "key": "value"
}}
}}
User query:
{query}
JSON:
"""

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    temperature=0.1
)
output = response["choices"][0]["message"]["content"]
print(output)

Output

<think>
User has dizziness → likely need blood pressure check
</think>

<function>
{
  "name": "get_blood_pressure_data",
  "parameters": {
    "num_days": 2
  }
}
</function>

<think> → the model's reasoning
<function> → the actual function call, as JSON
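
Because the response wraps the JSON in tags, the caller has to extract it before acting on it. The following is a minimal sketch of one way to do that, assuming the output always follows the <think> / <function> layout shown above. The names parse_model_output, dispatch, and HANDLERS, and the handler body, are hypothetical and used only for illustration; they are not part of the model, the dataset, or llama-cpp-python.

```python
import json
import re

# Sample text in the same shape as the model output shown above.
sample_output = """<think>
User has dizziness -> likely need blood pressure check
</think>

<function>
{
  "name": "get_blood_pressure_data",
  "parameters": {
    "num_days": 2
  }
}
</function>"""

THINK_RE = re.compile(r"<think>\s*(.*?)\s*</think>", re.DOTALL)
FUNCTION_RE = re.compile(r"<function>\s*(\{.*?\})\s*</function>", re.DOTALL)


def parse_model_output(text):
    """Return (reasoning, call). call is None if no valid JSON call is found."""
    think = THINK_RE.search(text)
    reasoning = think.group(1) if think else None

    match = FUNCTION_RE.search(text)
    if match is None:
        return reasoning, None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return reasoning, None
    if not isinstance(call, dict) or "name" not in call:
        return reasoning, None
    call.setdefault("parameters", {})
    return reasoning, call


# Hypothetical handler: a real application would query its own data source.
def get_blood_pressure_data(num_days):
    return f"fetching blood pressure readings for the last {num_days} day(s)"


# Allow-list of functions the application is willing to execute.
HANDLERS = {"get_blood_pressure_data": get_blood_pressure_data}


def dispatch(call):
    """Run the handler named in the call, rejecting unknown function names."""
    handler = HANDLERS.get(call["name"])
    if handler is None:
        raise ValueError(f"unknown function: {call['name']}")
    return handler(**call["parameters"])


reasoning, call = parse_model_output(sample_output)
print(reasoning)
print(call)
print(dispatch(call))
```

Routing calls through an explicit allow-list means a malformed or unexpected function name from the model fails loudly instead of being executed.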