Instructions for using Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with libraries and local apps.

Libraries

How to use Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with llama-cpp-python:

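# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (cached after the first run) and load it
llm = Llama.from_pretrained(
	repo_id="Skshackster/gemma3-270m-mental-health-fine-tuned-gguf",
	filename="gemma-3-finetune.BF16.gguf",
)

# Run one chat turn; the result is an OpenAI-style completion dict
response = llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

# Print only the assistant's reply text
print(response["choices"][0]["message"]["content"])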

Local Apps

llama.cpp

How to use Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16
# Run inference directly in the terminal:
llama-cli -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16
# Run inference directly in the terminal:
llama-cli -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16
# Run inference directly in the terminal:
./llama-cli -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16

Use Docker

docker model run hf.co/Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16

vLLM

How to use Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Skshackster/gemma3-270m-mental-health-fine-tuned-gguf"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Skshackster/gemma3-270m-mental-health-fine-tuned-gguf",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
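
The same request can also be built from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the default base URL assumes the vLLM server started above is listening on `localhost:8000`. Other OpenAI-compatible servers, such as `llama-server`, should accept the same payload at their own address.

```python
import json
import urllib.request

MODEL_ID = "Skshackster/gemma3-270m-mental-health-fine-tuned-gguf"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (without sending) a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Building the request needs no running server; sending it does.
request = build_chat_request("What is the capital of France?")
print(request.full_url)
```

Once the server is running, `send_chat_request(request)` performs the actual call and returns the reply as a string.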

Ollama

How to use Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with Ollama:

```
ollama run hf.co/Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16
```

Unsloth Studio

How to use Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Skshackster/gemma3-270m-mental-health-fine-tuned-gguf to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Skshackster/gemma3-270m-mental-health-fine-tuned-gguf to start chatting

Using Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Skshackster/gemma3-270m-mental-health-fine-tuned-gguf to start chatting

Docker Model Runner

How to use Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with Docker Model Runner:

```
docker model run hf.co/Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16
```

Lemonade

How to use Skshackster/gemma3-270m-mental-health-fine-tuned-gguf with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Skshackster/gemma3-270m-mental-health-fine-tuned-gguf:BF16

Run and chat with the model

lemonade run user.gemma3-270m-mental-health-fine-tuned-gguf-BF16

List all available models

lemonade list

Gemma-3 270M Mental Health Fine-tuned Model

Model Description

This model is a fine-tuned version of Google's Gemma-3 270M, trained for mental health conversational support using Cognitive Behavioral Therapy (CBT) principles. It was trained on more than 5M tokens of high-quality mental health conversational data so that it can give empathetic, supportive, and therapeutically informed responses.

Developed by: Saurav Kumar Srivastava

Model Details

Base Model: google/gemma-3-270m
Model Size: 270M parameters
Training Data: 5M+ tokens of CBT-based therapeutic conversations
Training Method: LoRA fine-tuning using Unsloth
Quantization: BF16 GGUF format available
License: MIT

Training Configuration

The model was fine-tuned using the following specifications:

LoRA Rank (r): 8
LoRA Alpha: 8
Target Modules: All attention and MLP modules
Batch Size: 2 (per device) with 4 gradient accumulation steps (effective batch size of 8 per device)
Learning Rate: 2e-4
Training Steps: 30 (optimized for efficiency)
Optimizer: AdamW 8-bit
Framework: Unsloth + TRL SFTTrainer
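
As a quick sanity check, the quantities implied by these settings can be worked out directly. This is only an arithmetic sketch: it assumes a single training device and the standard (non-rank-stabilized) LoRA scaling of alpha / r, neither of which is stated in the model card.

```python
# Values copied from the training configuration listed above.
lora_rank = 8
lora_alpha = 8
per_device_batch_size = 2
gradient_accumulation_steps = 4
training_steps = 30

# Standard LoRA scales the low-rank update by alpha / r (assumption: no rank-stabilized scaling).
lora_scaling = lora_alpha / lora_rank

# Sequences contributing to each optimizer step on one device.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Total sequences processed over the run (assumption: a single device).
sequences_seen = effective_batch_size * training_steps

print(f"LoRA scaling factor: {lora_scaling}")          # 1.0
print(f"Effective batch size: {effective_batch_size}")  # 8
print(f"Sequences seen in {training_steps} steps: {sequences_seen}")  # 240
```

With alpha equal to r, the adapter update is applied at unit scale, and each optimizer step averages gradients over 8 sequences.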

Intended Use

Primary Use Cases

Mental Health Support: Providing empathetic conversations and CBT-based guidance
Therapeutic Assistance: Supporting individuals with anxiety, depression, and stress management
Educational Tool: Teaching CBT techniques and mental health awareness
Research: Studying conversational AI in mental health applications

Limitations

Not a Replacement for Professional Help: This model should not replace licensed mental health professionals
Crisis Situations: Not suitable for handling severe mental health crises or suicidal ideation
General Limitations: As with all language models, may occasionally generate inappropriate or inaccurate responses

Usage

Basic Inference

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Skshackster/gemma3-270m-mental-health-fine-tuned-gguf"

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prepare conversation
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "I've been feeling really anxious lately about work."}]
}]

# Render the chat template as text; tokenize=False returns a string instead of token IDs
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# The rendered template already contains the BOS token, so do not add special tokens again
inputs = tokenizer([text], return_tensors="pt", add_special_tokens=False)

# Generate response
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        temperature=1.0,
        top_p=0.95,
        top_k=64,
        do_sample=True
    )

# Decode only the newly generated tokens, not the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(response)

Recommended Inference Settings

Temperature: 1.0
Top-p: 0.95
Top-k: 64
Max New Tokens: 64-256 (depending on desired response length)
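
To make the effect of these settings concrete, here is a simplified, standard-library sketch of how temperature, top-k, and top-p narrow the set of candidate tokens before one is sampled. The function `filter_probs` and the toy logits are hypothetical illustrations; this is not the exact implementation used by Transformers or llama.cpp.

```python
import math


def filter_probs(logits, temperature=1.0, top_k=64, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep the k most likely tokens and renormalize over them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Toy vocabulary of four tokens, using the recommended settings.
candidates = filter_probs([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=64, top_p=0.95)
print(candidates)
```

Lowering `top_p` or `top_k` shrinks the candidate set and makes responses more predictable; raising `temperature` flattens the distribution and makes them more varied.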

Training Data

The model was trained on a carefully curated dataset of mental health conversations incorporating:

CBT-based therapeutic dialogues
Empathetic response patterns
Crisis de-escalation techniques
Mindfulness and coping strategies
Educational mental health content

Data Volume: 5M+ tokens of high-quality conversational data

Evaluation and Performance

The model demonstrates strong performance in:

Empathetic response generation
CBT technique application
Maintaining therapeutic conversation flow
Appropriate boundary setting
Educational content delivery

Ethical Considerations

Safety Measures

Trained to redirect users to professional help when appropriate
Designed to avoid giving specific medical advice
Incorporates safety guidelines for mental health conversations
Includes appropriate disclaimers about professional treatment

Bias and Fairness

Efforts made to ensure inclusive and culturally sensitive responses
Regular evaluation for potential biases in mental health recommendations
Continuous monitoring for harmful or inappropriate outputs

Technical Specifications

Architecture: Gemma-3 (Transformer-based)
Context Length: 4000 tokens
Precision: BF16
Hardware Requirements: Compatible with consumer GPUs (4GB+ VRAM recommended)
Inference Speed: Optimized for real-time conversation
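
For context on the hardware figure, the memory taken by the weights alone can be estimated from the parameter count and precision. This is a back-of-the-envelope sketch that uses the nominal 270M figure and ignores the KV cache, activations, and runtime overhead.

```python
# Nominal figures from the model card; the exact parameter count may differ slightly.
num_parameters = 270_000_000  # "270M parameters"
bytes_per_parameter = 2       # BF16 stores each value in 16 bits

weight_bytes = num_parameters * bytes_per_parameter
weight_megabytes = weight_bytes / 1_000_000

print(f"Approximate weight size: {weight_megabytes:.0f} MB")  # 540 MB
```

The weights therefore occupy roughly half a gigabyte, which leaves headroom within the recommended 4GB of VRAM for the context cache and framework overhead.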

Files and Formats

Standard Model: PyTorch format compatible with Transformers library
GGUF Format: Available for llama.cpp and Ollama integration
Quantization: BF16 precision maintained for quality

Citation

If you use this model in your research or applications, please cite:

@misc{srivastava2025gemma3mentalhealth,
  title={Gemma-3 270M Mental Health Fine-tuned Model},
  author={Saurav Kumar Srivastava},
  year={2025},
  howpublished={\url{https://huggingface.co/Skshackster/gemma3-270m-mental-health-fine-tuned-gguf}},
}

Contact and Support

Developer: Saurav Kumar Srivastava

For questions, issues, or collaboration inquiries, please open an issue in the model repository.

Acknowledgments

Google for the Gemma-3 base model
Unsloth for the efficient fine-tuning framework
Mental Health Community for supporting ethical AI development in therapeutic applications

Disclaimer

This model is designed for educational and supportive purposes only. It should not be used as a substitute for professional mental health treatment. If you are experiencing a mental health crisis, please contact a licensed mental health professional or emergency services immediately.
