---
base_model: unsloth/Qwen3-8B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- trl
license: apache-2.0
language:
- es
datasets:
- somosnlp-hackathon-2025/gastronomia-hispana-dpo
---

# Qwen3-8B Gastronomía Hispana DPO LoRA

**A specialized culinary assistant for Hispanic gastronomy, fine-tuned with Direct Preference Optimization (DPO)**

## Model Description

This LoRA adapter turns Qwen3-8B into a culinary assistant specialized in Hispanic and Latino cuisine. The model has been fine-tuned using **Direct Preference Optimization (DPO)** to provide high-quality, culturally authentic responses about cooking techniques, ingredients, and traditional recipes from Spanish-speaking countries.

### Key Features

- 🥘 **Specialized Knowledge**: Expert-level understanding of Hispanic/Latino culinary traditions
- 🔧 **DPO Training**: Enhanced response quality through preference optimization
- 🌍 **Cultural Authenticity**: Respects traditional cooking methods and regional variations
- 📚 **Comprehensive Coverage**: Ingredients, techniques, recipes, and cultural context
- 🇪🇸 **Spanish Language**: Native Spanish culinary terminology and explanations

## Base Model

- **Architecture**: Qwen3-8B-unsloth-bnb-4bit
- **Quantization**: 4-bit (BNB)
- **Chat Template**: ChatML format
- **Context Length**: 2,500 tokens (`max_seq_length`)

## Training Details

### DPO Configuration

- **Method**: Direct Preference Optimization
- **Beta**: 0.1 (KL regularization parameter)
- **Epochs**: 3
- **Learning Rate**: 5e-6
- **Scheduler**: Cosine with 3% warmup

### LoRA Configuration

- **Rank (r)**: 64
- **Alpha**: 64
- **Dropout**: 0.0
- **Target Modules**: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- **RSLoRA**: Enabled for rank stabilization

### Training Infrastructure

- **Batch Size**: 32 effective (4 per device × 8 gradient accumulation steps)
- **Optimizer**: AdamW 8-bit
- **Weight Decay**: 0.01
- **Max Gradient Norm**: 1.0
- **Training Time**: ~5.4 hours

### Dataset

- **Source**: `somosnlp-hackathon-2025/gastronomia-hispana-dpo`
- **Size**: 7,092 preference pairs
- **Split**: 95% train, 5% evaluation
- **Format**: DPO preference pairs (chosen vs rejected responses)

## Usage

### Loading the Model

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    max_seq_length=2500,
    dtype=None,
    load_in_4bit=True,
)

# Setup chat template
tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",
    mapping={"role": "role", "content": "content", "user": "user", "assistant": "assistant"},
    map_eos_token=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)
```

### Inference Example

```python
from transformers import TextStreamer

# Prepare conversation
messages = [
    {"role": "user", "content": "¿Cómo preparo el encebollado ecuatoriano tradicional?"}
]

# Tokenize input
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate response, streaming tokens to stdout as they are produced
text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    input_ids=inputs,
    streamer=text_streamer,
    max_new_tokens=512,
    use_cache=True,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
```

### Alternative Loading (HF Transformers)

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# AutoPeftModelForCausalLM loads the base model named in the adapter config
# and attaches the LoRA weights on top of it
model = AutoPeftModelForCausalLM.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA"
)
```

## Example Interactions

### Ingredient Expertise

**User**: "¿Qué tipo de harina se utiliza para hacer los crackers de los canapés con crema de salmón?"

**Assistant**: "Para preparar los crackers de los canapés con crema de salmón, la receta especifica el uso de **harina de fuerza**. Esta harina, también conocida como harina fuerte o harina de panadería, es ideal para este tipo de preparaciones debido a su alto contenido de gluten..."

### Technique Guidance

**User**: "¿Cuáles son los errores más comunes al preparar macarrones con repollo?"

**Assistant**: "Al preparar **macarrones con repollo**, un plato de origen eslavo... los errores más comunes incluyen: 1. **Cocer en exceso los macarrones**... 2. **Cortar el repollo demasiado grueso**..."

### Cultural Context

**User**: "¿De dónde viene el Cheesecake japonés y qué lo hace único?"

**Assistant**: "El **Cheesecake japonés**, también conocido como **tarta de queso que tiembla**... tiene su origen en la fusión de la repostería occidental con la precisión y delicadeza de la cocina japonesa..."

## Model Capabilities

### Specialized Domains

- **Traditional Recipes**: Step-by-step authentic preparation methods
- **Ingredient Knowledge**: Properties, uses, and cultural significance
- **Cooking Techniques**: International methods with cultural context
- **Substitutions**: Appropriate alternatives respecting authenticity
- **Cultural Context**: Historical and regional cooking traditions

### Response Quality

- **Detailed Explanations**: Comprehensive, technically accurate guidance
- **Cultural Sensitivity**: Respects traditional methods and origins
- **Practical Tips**: Real-world cooking advice and troubleshooting
- **Educational Value**: Teaches both technique and cultural background

## Performance Metrics

- **Training Loss**: Converged effectively over 3 epochs
- **Memory Usage**: ~10.3GB peak GPU memory during training
- **Inference Speed**: 2x faster with Unsloth optimizations
- **Model Size**: ~168M trainable parameters (2.4% of base model)

## Limitations

- **Language**: Primarily optimized for Spanish culinary content
- **Domain**: Specialized for cooking/gastronomy (may not perform well on other topics)
- **Context**: Limited to 2,500 tokens per conversation
- **Base Model**: Inherits any limitations from Qwen3-8B

## Technical Requirements

- **GPU Memory**: Minimum 8GB for inference, 12GB+ recommended for fine-tuning
- **CUDA**: Compatible with CUDA 12.4+
- **Libraries**: Unsloth, Transformers 4.52+, PEFT, TRL
- **Python**: 3.8+

## Training Environment

- **Hardware**: NVIDIA L40S (44GB VRAM)
- **Framework**: Unsloth 2025.5.10
- **Precision**: BF16 training, 4-bit quantization
- **Optimization**: Gradient checkpointing, 8-bit AdamW

## Ethical Considerations

- **Cultural Respect**: Trained to honor traditional cooking methods and cultural origins
- **Accuracy**: Provides technically sound culinary advice
- **Safety**: Includes appropriate food safety considerations
- **Authenticity**: Prioritizes traditional techniques over convenience modifications

## Citation

```bibtex
@misc{gastronomia-hispana-dpo-2025,
  title={Qwen3-8B Gastronomía Hispana DPO LoRA},
  author={SomosNLP Hackathon 2025 Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA}
}
```

## Acknowledgments

- **SomosNLP**: For organizing the hackathon and providing the platform
- **Unsloth**: For efficient training optimizations
- **Dataset Contributors**: For creating the Hispanic gastronomy preference dataset
- **Base Model**: The Qwen team (Alibaba Cloud) for the foundation model

## License

This adapter is released under the same license as the base Qwen3-8B model (Apache 2.0). Please refer to the original model's licensing terms for commercial use.

---

**Note**: This model is designed for educational and culinary assistance purposes. Always follow proper food safety guidelines when cooking.
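
The DPO Configuration in this card lists a beta of 0.1 as the KL regularization parameter. To illustrate where beta enters the objective, here is a minimal sketch of the standard sigmoid DPO loss for a single preference pair, in plain Python. The function name `dpo_loss` and the log-probability values are hypothetical, chosen only for illustration; this is not the training code used for this adapter, which relied on TRL.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # Implicit rewards: how much more likely each response is under the
    # policy than under the frozen reference model, scaled by beta
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written in a numerically stable softplus form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical log-probabilities: the policy has moved toward the chosen
# response and away from the rejected one, relative to the reference
loss = dpo_loss(
    policy_chosen_logp=-10.0, policy_rejected_logp=-20.0,
    ref_chosen_logp=-12.0, ref_rejected_logp=-15.0,
    beta=0.1,
)
print(f"DPO loss for this pair: {loss:.4f}")
```

A smaller beta lets the policy drift further from the reference model before the margin saturates; a larger beta keeps it closer. When the policy and the reference agree exactly, the margin is zero and the loss equals log 2.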
# Uploaded model

- **Developed by:** somosnlp-hackathon-2025
- **License:** apache-2.0
- **Fine-tuned from model:** unsloth/Qwen3-8B-unsloth-bnb-4bit

This Qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
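
As a back-of-the-envelope check on the Training Details listed in this card, the sketch below derives the effective batch size, an approximate number of optimizer steps, the warmup length, and the LoRA scaling factor from the published hyperparameters. The single-GPU setting, the rounding conventions, and the helper names `schedule_estimate` and `lora_scaling` are assumptions for illustration; the step count reported by the actual trainer may differ slightly.

```python
import math

# Values copied from the Training Details section of this card
DATASET_PAIRS = 7092
TRAIN_FRACTION = 0.95
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 8
EPOCHS = 3
WARMUP_RATIO = 0.03
LORA_RANK = 64
LORA_ALPHA = 64


def schedule_estimate(num_devices=1):
    """Estimate the training schedule; num_devices=1 is an assumption."""
    effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_devices
    # Rounding the train split down is an assumed convention
    train_pairs = int(DATASET_PAIRS * TRAIN_FRACTION)
    steps_per_epoch = math.ceil(train_pairs / effective_batch)
    total_steps = steps_per_epoch * EPOCHS
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    return {
        "effective_batch": effective_batch,
        "train_pairs": train_pairs,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


def lora_scaling(rank, alpha, use_rslora):
    """Scale applied to the LoRA update: alpha/sqrt(r) with rsLoRA, else alpha/r."""
    return alpha / math.sqrt(rank) if use_rslora else alpha / rank


print(schedule_estimate())
print("rsLoRA scaling:", lora_scaling(LORA_RANK, LORA_ALPHA, use_rslora=True))
print("plain LoRA scaling:", lora_scaling(LORA_RANK, LORA_ALPHA, use_rslora=False))
```

With rank and alpha both set to 64, enabling RSLoRA changes the update scale from alpha/r to alpha/sqrt(r), which is why the option matters at higher ranks.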