---
base_model: unsloth/Qwen3-8B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- trl
license: apache-2.0
language:
- es
datasets:
- somosnlp-hackathon-2025/gastronomia-hispana-dpo
---

# Qwen3-8B Gastronomía Hispana DPO LoRA

**A specialized culinary assistant for Hispanic gastronomy, fine-tuned with Direct Preference Optimization (DPO)**

## Model Description

This LoRA adapter adapts Qwen3-8B into a culinary assistant specialized in Hispanic and Latino cuisine. The model has been fine-tuned using **Direct Preference Optimization (DPO)** to provide high-quality, culturally authentic responses about cooking techniques, ingredients, and traditional recipes from Spanish-speaking countries.

### Key Features

- 🥘 **Specialized Knowledge**: Expert-level understanding of Hispanic/Latino culinary traditions
- 🔧 **DPO Training**: Enhanced response quality through preference optimization
- 🌍 **Cultural Authenticity**: Respects traditional cooking methods and regional variations
- 📚 **Comprehensive Coverage**: Ingredients, techniques, recipes, and cultural context
- 🇪🇸 **Spanish Language**: Native Spanish culinary terminology and explanations

## Base Model

- **Architecture**: Qwen3-8B (loaded from the `unsloth/Qwen3-8B-unsloth-bnb-4bit` checkpoint)
- **Quantization**: 4-bit (BNB)
- **Chat Template**: ChatML format
- **Context Length**: 2,500 tokens (the `max_seq_length` used for fine-tuning)
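
For readers unfamiliar with ChatML, the sketch below renders a conversation in the generic ChatML layout. The helper `to_chatml` is hypothetical and only illustrates the format; in practice the tokenizer's own chat template (see Usage) is authoritative and may differ in details such as system prompts.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the generic ChatML layout."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"role": "user", "content": "¿Qué es el sofrito?"}]
prompt = to_chatml(messages)
print(prompt)
```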

## Training Details

### DPO Configuration
- **Method**: Direct Preference Optimization
- **Beta**: 0.1 (KL regularization parameter)
- **Epochs**: 3
- **Learning Rate**: 5e-6
- **Scheduler**: Cosine with 3% warmup

### LoRA Configuration
- **Rank (r)**: 64
- **Alpha**: 64
- **Dropout**: 0.0
- **Target Modules**: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- **RSLoRA**: Enabled for rank stabilization
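
The effect of RSLoRA on the adapter can be seen in the scaling factor applied to the low-rank update. The sketch below compares the standard LoRA scaling (`alpha / r`) with the rank-stabilized one (`alpha / sqrt(r)`) for the values listed above; `lora_scaling` is a hypothetical helper, not a library function.

```python
import math


def lora_scaling(alpha, r, use_rslora=False):
    """Multiplier applied to the low-rank update B @ A."""
    return alpha / math.sqrt(r) if use_rslora else alpha / r


standard = lora_scaling(alpha=64, r=64)
stabilized = lora_scaling(alpha=64, r=64, use_rslora=True)
print(standard, stabilized)  # 1.0 8.0
```

With rank 64 and alpha 64, rank stabilization therefore applies an 8x larger multiplier to the adapter update than standard LoRA scaling would.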

### Training Infrastructure
- **Batch Size**: 32 (4 per device × 8 gradient accumulation steps)
- **Optimizer**: AdamW 8-bit
- **Weight Decay**: 0.01
- **Max Gradient Norm**: 1.0
- **Training Time**: ~5.4 hours
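
As a rough check on these settings, the sketch below derives the effective batch size, the approximate number of optimizer steps, and a cosine schedule with linear warmup. It assumes a single GPU, a train split of 95% of the 7,092 pairs rounded down, and ceiling rounding for steps and warmup; the actual trainer may round differently, so treat the step counts as estimates.

```python
import math

PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 8
EPOCHS = 3
PEAK_LR = 5e-6
WARMUP_RATIO = 0.03
TRAIN_PAIRS = int(7092 * 0.95)  # assumption: 95% split, rounded down

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS  # single GPU assumed
steps_per_epoch = math.ceil(TRAIN_PAIRS / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions the run comes to about 633 optimizer steps, of which the first 19 are warmup.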

### Dataset
- **Source**: `somosnlp-hackathon-2025/gastronomia-hispana-dpo`
- **Size**: 7,092 preference pairs
- **Split**: 95% train, 5% evaluation
- **Format**: DPO preference pairs (chosen vs rejected responses)
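
The sketch below illustrates what a preference pair looks like and how a 95/5 split could be produced. The field names `prompt`, `chosen`, and `rejected` follow the common DPO convention and are an assumption here; the dataset's actual column names, the example text, and the shuffling seed are all hypothetical.

```python
import random


def split_pairs(pairs, eval_fraction=0.05, seed=42):
    """Shuffle preference pairs and hold out a fraction for evaluation."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_eval = round(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


# Hypothetical record: one prompt with a preferred and a dispreferred answer.
example_pair = {
    "prompt": "¿Cómo preparo un sofrito básico?",
    "chosen": "Respuesta detallada y fiel a la tradición...",
    "rejected": "Respuesta vaga o incorrecta...",
}

# Stand-in for the 7,092 real pairs, just to show the split sizes.
pairs = [dict(example_pair, id=i) for i in range(7092)]
train_pairs, eval_pairs = split_pairs(pairs)
print(len(train_pairs), len(eval_pairs))
```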

## Usage

### Loading the Model

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    max_seq_length=2500,
    dtype=None,
    load_in_4bit=True,
)

# Setup chat template
tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",
    mapping={"role": "role", "content": "content", "user": "user", "assistant": "assistant"},
    map_eos_token=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)
```

### Inference Example

```python
from transformers import TextStreamer

# Prepare conversation
messages = [
    {"role": "user", "content": "¿Cómo preparo el encebollado ecuatoriano tradicional?"}
]

# Tokenize input
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate response
text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    input_ids=inputs,
    streamer=text_streamer,
    max_new_tokens=512,
    use_cache=True,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
```

### Alternative Loading (HF Transformers)

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model named in the adapter config, then attaches the LoRA weights
model = AutoPeftModelForCausalLM.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA"
)
```

## Example Interactions

### Ingredient Expertise
**User**: "¿Qué tipo de harina se utiliza para hacer los crackers de los canapés con crema de salmón?"

**Assistant**: "Para preparar los crackers de los canapés con crema de salmón, la receta especifica el uso de **harina de fuerza**. Esta harina, también conocida como harina fuerte o harina de panadería, es ideal para este tipo de preparaciones debido a su alto contenido de gluten..."

### Technique Guidance
**User**: "¿Cuáles son los errores más comunes al preparar macarrones con repollo?"

**Assistant**: "Al preparar **macarrones con repollo**, un plato de origen eslavo... los errores más comunes incluyen: 1. **Cocer en exceso los macarrones**... 2. **Cortar el repollo demasiado grueso**..."

### Cultural Context
**User**: "¿De dónde viene el Cheesecake japonés y qué lo hace único?"

**Assistant**: "El **Cheesecake japonés**, también conocido como **tarta de queso que tiembla**... tiene su origen en la fusión de la repostería occidental con la precisión y delicadeza de la cocina japonesa..."

## Model Capabilities

### Specialized Domains
- **Traditional Recipes**: Step-by-step authentic preparation methods
- **Ingredient Knowledge**: Properties, uses, and cultural significance
- **Cooking Techniques**: International methods with cultural context
- **Substitutions**: Appropriate alternatives respecting authenticity
- **Cultural Context**: Historical and regional cooking traditions

### Response Quality
- **Detailed Explanations**: Comprehensive, technically accurate guidance
- **Cultural Sensitivity**: Respects traditional methods and origins
- **Practical Tips**: Real-world cooking advice and troubleshooting
- **Educational Value**: Teaches both technique and cultural background

## Performance Metrics

- **Training Loss**: Converged effectively over 3 epochs
- **Memory Usage**: ~10.3GB peak GPU memory during training
- **Inference Speed**: Roughly 2x faster with Unsloth optimizations than without them (Unsloth's reported figure)
- **Model Size**: ~168M trainable parameters (2.4% of base model)

## Limitations

- **Language**: Primarily optimized for Spanish culinary content
- **Domain**: Specialized for cooking/gastronomy (may not perform well on other topics)
- **Context**: Fine-tuned with a 2,500-token maximum sequence length; longer conversations fall outside what the adapter was trained on
- **Base Model**: Inherits any limitations from Qwen3-8B

## Technical Requirements

- **GPU Memory**: Minimum 8GB for inference, 12GB+ recommended for fine-tuning
- **CUDA**: Compatible with CUDA 12.4+
- **Libraries**: Unsloth, Transformers 4.52+, PEFT, TRL
- **Python**: 3.8+

## Training Environment

- **Hardware**: NVIDIA L40S (44GB VRAM)
- **Framework**: Unsloth 2025.5.10
- **Precision**: BF16 training, 4-bit quantization
- **Optimization**: Gradient checkpointing, 8-bit AdamW

## Ethical Considerations

- **Cultural Respect**: Trained to honor traditional cooking methods and cultural origins
- **Accuracy**: Provides technically sound culinary advice
- **Safety**: Includes appropriate food safety considerations
- **Authenticity**: Prioritizes traditional techniques over convenience modifications

## Citation

```bibtex
@misc{gastronomia-hispana-dpo-2025,
  title={Qwen3-8B Gastronomía Hispana DPO LoRA},
  author={SomosNLP Hackathon 2025 Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA}
}
```

## Acknowledgments

- **SomosNLP**: For organizing the hackathon and providing the platform
- **Unsloth**: For efficient training optimizations
- **Dataset Contributors**: For creating the Hispanic gastronomy preference dataset
- **Base Model**: The Qwen team (Alibaba Cloud) for the Qwen3-8B foundation model

## License

This adapter is released under the Apache 2.0 license, the same license as the base Qwen3-8B model. Refer to the base model's licensing terms before commercial use.

---

**Note**: This model is designed for educational and culinary assistance purposes. Always follow proper food safety guidelines when cooking.

# Uploaded model

- **Developed by:** somosnlp-hackathon-2025
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Qwen3-8B-unsloth-bnb-4bit

This Qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)