---
base_model: Qwen/Qwen3.5-2B-Base
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
datasets:
- MariChatmen/MariChatmen-Project-Data
tags:
- base_model:adapter:Qwen/Qwen3.5-2B-Base
- lora
- peft
- sft
- transformers
- trl
- andaluh
- marichatmen
---

# MariChatmen 2B Experimental

MariChatmen 2B Experimental is a LoRA/PEFT adapter for
`Qwen/Qwen3.5-2B-Base`. It was trained locally on 2026-05-13 as a recovery
run after the original 2B experiment failed its behavioural gate and no
usable 2B artifact was available.

This is an experimental checkpoint. The current demo should prefer the 4B
adapter (`MariChatmen/MariChatmen-4B-Experimental`) when hardware allows it.

## Intended Use

The adapter is intended for Spanish/Andaluh chat experiments around the
fictional MariChatmen assistant persona. It is not a general production
assistant and should not be used for high-stakes decisions.

## Loading

```python
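from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-2B-Base"
adapter_id = "MariChatmen/MariChatmen-2B-Experimental"

# Load the tokenizer from the adapter repository, not from the base model.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# The embeddings were resized during training, so match the base model's
# vocabulary size to the tokenizer before attaching the adapter.
if base.get_input_embeddings().weight.shape[0] != len(tokenizer):
    base.resize_token_embeddings(len(tokenizer))

model = PeftModel.from_pretrained(base, adapter_id)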
```

## Training Data

The local recovery mix contained 22,858 SFT training rows and 1,134 validation
rows. It combined:

- a broad local Andaluh SFT mix derived from Spanish SFT data transformed with
  `andalugeeks/andaluh-py`;
- oversampled MariChatmen project-authored repair anchors covering identity,
  style, safety, and instruction-following regressions.

The mixed training dataset is not uploaded with this model. The broad SFT
portion includes downloaded rows transformed with `andaluh-py`, so it should
not be republished as MariChatmen project data. Uploadable project data is
tracked separately in `MariChatmen/MariChatmen-Project-Data`.
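
For a quick sense of the split, the row counts reported above can be turned into a validation share. This is only arithmetic on the published numbers; the variable names are illustrative and do not come from the training scripts.

```python
# Row counts as reported in this section.
train_rows = 22_858
validation_rows = 1_134

total_rows = train_rows + validation_rows
validation_share = validation_rows / total_rows

print(f"total rows: {total_rows:,}")
print(f"validation share: {validation_share:.2%}")
```

The validation set works out to a little under 5% of the combined rows.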

## Credits and Copyright

- Base model: `Qwen/Qwen3.5-2B-Base`.
- Fine-tuning framework: Hugging Face Transformers, TRL, PEFT, and PyTorch.
- Transliteration / Andaluh transformation tooling: `andalugeeks/andaluh-py`.
- Broad Spanish SFT sources recorded in the local row metadata include
  `VillanovaAI/villanova-sft-2603` and upstream sources such as
  `CohereLabs/aya_collection`; original dataset licenses and attribution
  requirements remain with those sources.
- MariChatmen repair anchors were authored and curated by the project and are
  documented in the project data repository.

## Training Procedure

- Stage: supervised fine-tuning.
- Base model: `Qwen/Qwen3.5-2B-Base`.
- Tokenizer source: recovered MariChatmen 4B checkpoint tokenizer.
- Sequence length: 384.
- Prompt token cap: 256.
- Max steps: 600.
- LoRA rank: 16.
- LoRA alpha: 32.
- LoRA dropout: 0.05.
- Learning rate: 5e-5.
- Gradient accumulation: 16.
- Embeddings resized and trained to match the MariChatmen tokenizer.
- Hardware: local NVIDIA RTX 5060 Laptop GPU, 8 GB VRAM.
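
The hyperparameters above imply a rough training budget, sketched below. The per-device batch size is not stated in this card, so the value of 1 is an assumption, chosen only as a plausible guess for an 8 GB GPU. The scaling factor assumes standard LoRA scaling (`alpha / rank`) rather than a rank-stabilized variant.

```python
# Values reported in the list above.
max_steps = 600
gradient_accumulation = 16
sequence_length = 384
lora_rank = 16
lora_alpha = 32
train_rows = 22_858  # from the Training Data section

# ASSUMPTION: the per-device batch size is not documented; 1 is a guess.
per_device_batch_size = 1

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# Sequences consumed over the whole run, and the share of the training set.
sequences_seen = max_steps * gradient_accumulation * per_device_batch_size
approx_epochs = sequences_seen / train_rows

# Upper bound on tokens seen, since rows shorter than 384 tokens use fewer.
max_tokens_seen = sequences_seen * sequence_length

print(f"LoRA scaling: {lora_scaling}")
print(f"sequences seen: {sequences_seen:,} (~{approx_epochs:.2f} epochs)")
print(f"token upper bound: {max_tokens_seen:,}")
```

Under that batch-size assumption the run covers less than half an epoch of the training mix. A larger per-device batch size would scale the sequence and token figures proportionally.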

## Evaluation Snapshot

The selected checkpoint is step 600, which was also the best checkpoint by
validation loss.

- Final validation loss: 2.2429933547973633.
- Final validation mean token accuracy: 0.5876548955053303.
- Training runtime: approximately 7,633 seconds (about 2 hours 7 minutes).
- Generation probes showed usable instruction following and safety refusals,
  with remaining roughness on some style and technical prompts.
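
The reported numbers can be restated in more familiar units, as in the sketch below. It assumes the validation loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not confirmed in this card.

```python
import math

# Values reported in this section and in Training Procedure.
validation_loss = 2.2429933547973633
training_runtime_s = 7_633
max_steps = 600

# ASSUMPTION: the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(validation_loss)

# Average wall-clock time per optimizer step.
seconds_per_step = training_runtime_s / max_steps

# Break the runtime into hours, minutes, and seconds.
hours, remainder = divmod(training_runtime_s, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"validation perplexity: {perplexity:.2f}")
print(f"seconds per step: {seconds_per_step:.2f}")
print(f"runtime: {hours}h {minutes}m {seconds}s")
```

Under that assumption the validation perplexity is roughly 9.4, and each optimizer step took a little under 13 seconds on average.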

## Limitations

The model is a LoRA adapter, not a merged full model. Quality is expected to be
below that of the recovered 4B adapter, and the Andaluh style can be uneven.
Outputs may contain linguistic artifacts from automatic transformation and
should be reviewed before publication.
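
If a standalone model is needed, the adapter can in principle be merged into the base weights with PEFT's `merge_and_unload()`. The following is a minimal sketch, not a tested recipe for this checkpoint: the `merge_adapter` helper and the output directory are hypothetical, and the embedding resize assumes the adapter expects embeddings sized to its own tokenizer, as described in Training Procedure.

```python
def merge_adapter(base_id: str, adapter_id: str, output_dir: str) -> None:
    """Merge the LoRA weights into the base model and save a standalone copy."""
    # Imported lazily so this helper can be defined without loading the
    # heavy dependencies up front.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(adapter_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)

    # ASSUMPTION: the adapter expects embeddings sized to its tokenizer.
    if base.get_input_embeddings().weight.shape[0] != len(tokenizer):
        base.resize_token_embeddings(len(tokenizer))

    model = PeftModel.from_pretrained(base, adapter_id)

    # Fold the low-rank updates into the base weights and drop the wrappers.
    merged = model.merge_and_unload()

    merged.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
```

A call might look like `merge_adapter("Qwen/Qwen3.5-2B-Base", "MariChatmen/MariChatmen-2B-Experimental", "./marichatmen-2b-merged")`, where the output path is only an example. Merging does not change model quality; it only removes the need to load PEFT at inference time.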

## Framework Versions

- PEFT 0.19.1
- TRL 1.3.0
- Transformers 5.8.0.dev0
- PyTorch 2.11.0+cu130
- Datasets 4.8.5
- Tokenizers 0.22.2