---
base_model: Qwen/Qwen3.5-2B-Base
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
datasets:
- MariChatmen/MariChatmen-Project-Data
tags:
- base_model:adapter:Qwen/Qwen3.5-2B-Base
- lora
- peft
- sft
- transformers
- trl
- andaluh
- marichatmen
---

# MariChatmen 2B Experimental

MariChatmen 2B Experimental is a LoRA/PEFT adapter for `Qwen/Qwen3.5-2B-Base`. It was trained locally on 2026-05-13 as a recovery run after the original 2B experiment failed its behavioural gate and no usable 2B artifact was available.

This is an experimental checkpoint. The current demo should prefer the 4B adapter (`MariChatmen/MariChatmen-4B-Experimental`) when hardware allows it.

## Intended Use

The adapter is intended for Spanish/Andaluh chat experiments around the fictional MariChatmen assistant persona. It is not a general production assistant and should not be used for high-stakes decisions.

## Loading

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-2B-Base"
adapter_id = "MariChatmen/MariChatmen-2B-Experimental"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
```

## Training Data

The local recovery mix contained 22,858 SFT training rows and 1,134 validation rows. It combined:

- a broad local Andaluh SFT mix derived from Spanish SFT data transformed with `andalugeeks/andaluh-py`;
- oversampled MariChatmen project-authored repair anchors covering identity, style, safety, and instruction-following regressions.

The mixed training dataset is not uploaded with this model. The broad SFT portion includes downloaded rows transformed with `andaluh-py`, so it should not be republished as MariChatmen proprietary/project data. Uploadable project data is tracked separately in `MariChatmen/MariChatmen-Project-Data`.
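
The oversampling of repair anchors described in this section can be illustrated with a minimal sketch. This is not the project's actual data pipeline, which is not published with this card: the `build_mix` helper, the `source` and `text` row fields, and the repeat factor of 3 are all hypothetical and chosen only to show the idea of repeating a small curated set inside a larger broad mix.

```python
import random


def build_mix(broad_rows, anchor_rows, anchor_repeats, seed=0):
    """Concatenate broad rows with anchor rows repeated `anchor_repeats` times, then shuffle.

    Hypothetical helper: the real mixing logic and repeat factor are not documented here.
    """
    mixed = list(broad_rows) + list(anchor_rows) * anchor_repeats
    # A seeded shuffle keeps the mix reproducible between runs.
    random.Random(seed).shuffle(mixed)
    return mixed


# Toy stand-ins for the broad Andaluh SFT rows and the repair anchors.
broad = [{"source": "broad", "text": f"row {i}"} for i in range(10)]
anchors = [{"source": "anchor", "text": f"anchor {i}"} for i in range(2)]

mix = build_mix(broad, anchors, anchor_repeats=3)
anchor_share = sum(r["source"] == "anchor" for r in mix) / len(mix)
print(len(mix), round(anchor_share, 3))
```

Repeating the anchors raises their share of the training rows without altering the broad rows, which is one common way to make a small set of corrective examples count during SFT.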
## Credits and Copyright

- Base model: `Qwen/Qwen3.5-2B-Base`.
- Fine-tuning framework: Hugging Face Transformers, TRL, PEFT, and PyTorch.
- Transliteration / Andaluh transformation tooling: `andalugeeks/andaluh-py`.
- Broad Spanish SFT sources recorded in the local row metadata include `VillanovaAI/villanova-sft-2603` and upstream sources such as `CohereLabs/aya_collection`; original dataset licenses and attribution requirements remain with those sources.
- MariChatmen repair anchors are project-authored/curated material for this project and are documented in the project data repository.

## Training Procedure

- Stage: supervised fine-tuning.
- Base model: `Qwen/Qwen3.5-2B-Base`.
- Tokenizer source: recovered MariChatmen 4B checkpoint tokenizer.
- Sequence length: 384.
- Prompt token cap: 256.
- Max steps: 600.
- LoRA rank: 16.
- LoRA alpha: 32.
- LoRA dropout: 0.05.
- Learning rate: 5e-5.
- Gradient accumulation: 16.
- Embeddings resized and trained to match the MariChatmen tokenizer.
- Hardware: local NVIDIA RTX 5060 Laptop GPU, 8 GB VRAM.

## Evaluation Snapshot

The selected checkpoint is step 600, which was also the best checkpoint by validation loss.

- Final validation loss: 2.2429933547973633.
- Final validation mean token accuracy: 0.5876548955053303.
- Training runtime: approximately 7,633 seconds.
- Generation probes showed usable instruction following and safety refusals, with remaining roughness on some style and technical prompts.

## Limitations

The model is a LoRA adapter, not a merged full model. Quality is expected to be below the recovered 4B adapter, and the Andaluh style can be uneven. Outputs may contain linguistic artifacts from automatic transformation and should be reviewed before publication.

## Framework Versions

- PEFT 0.19.1
- TRL 1.3.0
- Transformers 5.8.0.dev0
- PyTorch 2.11.0+cu130
- Datasets 4.8.5
- Tokenizers 0.22.2
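
Because the adapter was trained against development and CUDA-specific builds, it can help to compare a local environment with the versions listed above before loading it. The snippet below is a small sketch using only the standard library; the `CARD_VERSIONS` mapping and the `compare_versions` helper are illustrative names, and the keys assume the usual PyPI distribution names (for example `torch` for PyTorch). A mismatch does not necessarily mean the adapter will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the Framework Versions list; keys are assumed PyPI names.
CARD_VERSIONS = {
    "peft": "0.19.1",
    "trl": "1.3.0",
    "transformers": "5.8.0.dev0",
    "torch": "2.11.0+cu130",
    "datasets": "4.8.5",
    "tokenizers": "0.22.2",
}


def compare_versions(expected):
    """Return {name: (card_version, installed_version_or_None, exact_match)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, same) in compare_versions(CARD_VERSIONS).items():
    status = "match" if same else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```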