---
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
tags:
- lora
- mlx
- fine-tuned
- multichain
- web3
- cross-chain
- defi
- wrapped-events
- purple-squirrel
- adapter
- deepseek-r1
- deepseek
- reasoning
- 8b
- apple-silicon
- local-inference
- blockchain
library_name: mlx
pipeline_tag: text-generation
license: mit
language:
- en
datasets:
- purplesquirrelnetworks/multichain-day-training
---

# Purple Squirrel R1 — Multichain LoRA Adapters

LoRA adapter weights for [Purple Squirrel R1 Multichain](https://huggingface.co/purplesquirrelnetworks/purple-squirrel-r1-multichain), fine-tuned on 58 conference sessions from [Wrapped Events](https://wrapped.events) covering cross-chain protocols, DeFi infrastructure, and Web3 technology.

Use these adapters to apply the multichain fine-tuning to the base model yourself, or continue training with your own data.

## Adapter Details

| Property | Value |
|----------|-------|
| **Base Model** | [DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) (4-bit) |
| **Method** | LoRA (Low-Rank Adaptation) |
| **Rank** | 8 |
| **Scale** | 20.0 |
| **Dropout** | 0.0 |
| **LoRA Layers** | 4 |
| **Trainable Params** | 2.621M / 8,030M (0.033%) |
| **Framework** | MLX-LM 0.29.1 |
| **Adapter Size** | ~10 MB |
| **Hardware** | Apple M-series (16GB RAM) |
| **Peak Memory** | 6.184 GB |
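
The trainable-parameter figure can be reproduced from the layer shapes. The sketch below is a back-of-the-envelope check, not the training code. It assumes the public Llama-8B dimensions (hidden size 4096, key/value width 1024, MLP width 14336) and assumes that all seven linear projections in each adapted block receive a LoRA pair. Under those assumptions the count matches the table. The size estimate further assumes 4 bytes per stored parameter.

```python
# Llama-8B dimensions; WHICH modules receive LoRA is an assumption
# (all seven linear projections in each adapted transformer block).
HIDDEN = 4096      # model width
KV_DIM = 1024      # 8 key/value heads x 128 dims (grouped-query attention)
FFN = 14336        # MLP intermediate size
RANK = 8           # "Rank" in the table above
NUM_LAYERS = 4     # "LoRA Layers" in the table above

# (input_dim, output_dim) of each adapted linear layer
projections = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}


def lora_params(d_in, d_out, rank):
    """A LoRA pair adds one (d_in x rank) and one (rank x d_out) matrix."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(i, o, RANK) for i, o in projections.values())
total = per_layer * NUM_LAYERS
fraction_pct = 100 * total / 8_030_000_000
size_mb = total * 4 / 1e6  # assumes 4 bytes per parameter

print(f"{total:,} trainable parameters")      # 2,621,440
print(f"{fraction_pct:.3f}% of 8,030M")       # 0.033%
print(f"{size_mb:.1f} MB at 4 bytes/param")   # 10.5 MB
```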

## Training Configuration

```yaml
framework: mlx-lm 0.29.1
method: LoRA
lora_layers: 4
lora_rank: 8
learning_rate: 1e-5
batch_size: 1
iterations: 200
max_seq_length: 1024
grad_checkpoint: true
save_every: 100
seed: 42
```
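
As a rough sense of scale, the configuration above bounds how much data the run saw. This is a minimal sketch that assumes one batch per iteration with no gradient accumulation, and that `max_seq_length` caps every sequence. The token count is therefore an upper bound, not a measurement.

```python
# Values copied from the training configuration above
batch_size = 1
iterations = 200
max_seq_length = 1024
save_every = 100

# One batch per iteration (assumption: no gradient accumulation)
sequences_seen = batch_size * iterations

# Upper bound only: shorter sequences contribute fewer tokens
max_tokens_seen = sequences_seen * max_seq_length

# Iterations at which a checkpoint is written
checkpoint_iters = list(range(save_every, iterations + 1, save_every))

print(sequences_seen)     # 200
print(max_tokens_seen)    # 204800
print(checkpoint_iters)   # [100, 200]
```

The checkpoint iterations agree with the two files listed under `checkpoints/`.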

## Training Curve

| Iteration | Train Loss | Val Loss | Val Loss Change vs. Iteration 0 |
|-----------|-----------|----------|-------------|
| 0 | — | 3.799 | baseline |
| 50 | 3.202 | 3.241 | -14.7% |
| 100 | 3.056 | 3.126 | -17.7% |
| 150 | 3.140 | 3.098 | -18.5% |
| 200 | 3.083 | 3.091 | **-18.6%** |
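
The percentages in the last column are the relative change in validation loss against the iteration-0 value. The sketch below recomputes them from the table. The perplexity line assumes the reported loss is mean per-token cross-entropy in nats, which the card does not state explicitly.

```python
import math

baseline = 3.799  # validation loss at iteration 0
val_loss = {50: 3.241, 100: 3.126, 150: 3.098, 200: 3.091}


def relative_change_pct(loss, reference):
    """Signed percentage change of `loss` relative to `reference`."""
    return 100 * (loss - reference) / reference


changes = {it: round(relative_change_pct(v, baseline), 1) for it, v in val_loss.items()}
print(changes)  # {50: -14.7, 100: -17.7, 150: -18.5, 200: -18.6}

# Perplexity, assuming the loss is mean per-token cross-entropy in nats
perplexity = {it: math.exp(v) for it, v in {0: baseline, **val_loss}.items()}
print(f"{perplexity[0]:.1f} -> {perplexity[200]:.1f}")
```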

## Files

```
├── adapters.safetensors          # Final adapter weights (iteration 200)
├── adapter_config.json           # Training config & hyperparameters
└── checkpoints/
    ├── 0000100_adapters.safetensors  # Checkpoint at iteration 100
    └── 0000200_adapters.safetensors  # Checkpoint at iteration 200
```
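
If you script against the checkpoint files, the iteration number can be read from the zero-padded filename prefix. The helper names below are hypothetical and the pattern is inferred only from the two filenames shown above.

```python
import re

# Pattern inferred from the listing above: <zero-padded iteration>_adapters.safetensors
CHECKPOINT_RE = re.compile(r"^(\d+)_adapters\.safetensors$")


def checkpoint_iteration(filename):
    """Return the iteration encoded in a checkpoint filename, or None."""
    match = CHECKPOINT_RE.match(filename)
    return int(match.group(1)) if match else None


def latest_checkpoint(filenames):
    """Return the checkpoint filename with the highest iteration, or None."""
    numbered = [(checkpoint_iteration(name), name) for name in filenames]
    numbered = [(it, name) for it, name in numbered if it is not None]
    return max(numbered)[1] if numbered else None


files = [
    "adapters.safetensors",
    "0000100_adapters.safetensors",
    "0000200_adapters.safetensors",
]
print(latest_checkpoint(files))  # 0000200_adapters.safetensors
```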

## Usage with MLX

```python
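from huggingface_hub import snapshot_download
from mlx_lm import load, generate

# adapter_path is read as a local directory (assumption for mlx-lm 0.29.1),
# so download the adapter repo from the Hub first
adapter_dir = snapshot_download("purplesquirrelnetworks/purple-squirrel-r1-multichain-lora")

# Load the 4-bit base model and apply the LoRA adapters
model, tokenizer = load(
    "mlx-community/DeepSeek-R1-Distill-Llama-8B-4bit",
    adapter_path=adapter_dir,
)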

messages = [
    {"role": "system", "content": "You are a multichain ecosystem expert."},
    {"role": "user", "content": "How does Wormhole enable cross-chain messaging?"}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```
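
DeepSeek-R1 distilled models usually write their reasoning inside `<think>` tags before the final answer. The helper below is a hypothetical post-processing sketch, not part of MLX-LM. It splits a completion into reasoning and answer, and tolerates a missing opening tag because the chat template may already have opened it in the prompt.

```python
def split_reasoning(text):
    """Split an R1-style completion into (reasoning, answer).

    If no closing </think> tag is present, the whole text is returned as the
    answer and the reasoning is empty.
    """
    close_tag = "</think>"
    if close_tag not in text:
        return "", text.strip()
    reasoning, answer = text.split(close_tag, 1)
    # The opening tag is optional: some templates emit it as part of the prompt
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()


sample = "<think>\nCompare the bridge designs.\n</think>\n\nHere is the summary."
reasoning, answer = split_reasoning(sample)
print(answer)  # Here is the summary.
```

A completion cut off by `max_tokens` before the closing tag comes back entirely as the answer, so a higher token limit may be needed for long reasoning traces.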

## Continue Fine-Tuning

```bash
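# Download adapters locally with the `hf` CLI; --resume-adapter-file expects a local file
hf download purplesquirrelnetworks/purple-squirrel-r1-multichain-lora --local-dir ./purple-squirrel-r1-multichain-lora
# --train enables training; --num-layers 4 matches this adapter's LoRA layer count
mlx_lm.lora \
  --model mlx-community/DeepSeek-R1-Distill-Llama-8B-4bit \
  --train \
  --num-layers 4 \
  --resume-adapter-file ./purple-squirrel-r1-multichain-lora/adapters.safetensors \
  --data /path/to/your/data \
  --iters 100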
```
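
The `--data` directory is expected to contain `train.jsonl` and `valid.jsonl`. The sketch below writes both files in the chat format that MLX-LM accepts, with one `{"messages": [...]}` object per line. The function name and the example conversation are placeholders, so substitute your own data.

```python
import json
import tempfile
from pathlib import Path


def write_chat_dataset(data_dir, train_examples, valid_examples):
    """Write train.jsonl and valid.jsonl with one chat example per line."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    splits = {"train.jsonl": train_examples, "valid.jsonl": valid_examples}
    for filename, examples in splits.items():
        with open(data_dir / filename, "w", encoding="utf-8") as handle:
            for messages in examples:
                handle.write(json.dumps({"messages": messages}, ensure_ascii=False) + "\n")
    return data_dir


# Placeholder conversation; replace with your own examples
example = [
    {"role": "user", "content": "<your question>"},
    {"role": "assistant", "content": "<your reference answer>"},
]

out_dir = write_chat_dataset(tempfile.mkdtemp(), [example, example], [example])
print(sorted(p.name for p in out_dir.iterdir()))  # ['train.jsonl', 'valid.jsonl']
```

Pass the resulting directory as `--data`.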

## Domain Knowledge

Protocols covered: Wormhole, LayerZero, ZetaChain, Compose Network, Aptos, Monad, NEAR, Polygon, Stacks, Aurora, Pyth, 1inch, Beefy, Relay, Pipe Network, DoubleZero, BitcoinOS.

Topics: cross-chain messaging, L1/L2 ecosystems, DeFi infrastructure, onchain AI agents, RWA tokenization, account abstraction, sustainable yield.

## Related Resources

| Resource | Link |
|----------|------|
| Full Fused Model | [purple-squirrel-r1-multichain](https://huggingface.co/purplesquirrelnetworks/purple-squirrel-r1-multichain) |
| Training Data | [multichain-day-training](https://huggingface.co/datasets/purplesquirrelnetworks/multichain-day-training) |
| Base Model (R1) | [purple-squirrel-r1](https://huggingface.co/purplesquirrelnetworks/purple-squirrel-r1) |
| GGUF Version | [purple-squirrel-r1-gguf](https://huggingface.co/purplesquirrelnetworks/purple-squirrel-r1-gguf) |
| AIDP Neural Cloud Paper | [aidp-neural-cloud-paper](https://huggingface.co/purplesquirrelnetworks/aidp-neural-cloud-paper) |
| Full Collection | [Purple Squirrel AI](https://huggingface.co/collections/purplesquirrelnetworks/purple-squirrel-ai-models-papers-and-data-699b4a18abe59a025baf2149) |

## Citation

```bibtex
@misc{purplesquirrel-r1-multichain-lora-2025,
  title={Purple Squirrel R1 Multichain LoRA Adapters},
  author={Karsten, Matthew},
  year={2025},
  publisher={Purple Squirrel Media},
  howpublished={\url{https://huggingface.co/purplesquirrelnetworks/purple-squirrel-r1-multichain-lora}},
  note={MLX LoRA adapters for DeepSeek-R1-Distill-Llama-8B, fine-tuned on Wrapped Events multichain conference data}
}
```

## License

MIT

## Contact

- **Organization:** [Purple Squirrel Media](https://purplesquirrelmedia.io)
- **Maintainer:** Matthew Karsten
- **Email:** matthew@purplesquirrelmedia.io