---
base_model: mlx-community/Qwen2.5-Coder-7B-Instruct-4bit
language:
- en
library_name: mlx
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- code
- mlx
- qwen
- qwen-coder
- qwen2.5-coder
- lora
- magento
- magento2
- hyva
- php
---

# qwen2.5-coder-7b-magento

A **LoRA fine-tune of `Qwen2.5-Coder-7B-Instruct` (4-bit, MLX)** specialized for writing
**idiomatic Magento 2 + Hyvä** code. The adapter is fused into the base, so this is a
self-contained model.

> **Current version: v5.1** (on `main`). Previous releases stay pinned as git tags —
> load an older version with `revision="v5"` / `revision="v4"` / `revision="v2"`.
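
Pinning a revision from Python can be sketched as follows. This is a minimal sketch, assuming `huggingface_hub` is installed and that the repo id matches the one used in the commands further down this card; `check_revision` and `download_revision` are hypothetical helper names, not part of any library.

```python
REPO_ID = "navindbhudiya/qwen2.5-coder-7b-magento-v2"  # repo id taken from this card's commands
PINNED_REVISIONS = ("main", "v5", "v4", "v2")  # revisions named in the note above


def check_revision(revision: str) -> str:
    """Return the revision unchanged, or raise if this card does not mention it."""
    if revision not in PINNED_REVISIONS:
        raise ValueError(
            f"unknown revision {revision!r}; expected one of {PINNED_REVISIONS}"
        )
    return revision


def download_revision(revision: str = "main") -> str:
    """Download one pinned revision and return the local snapshot directory."""
    # Imported lazily so check_revision() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, revision=check_revision(revision))
```

The directory returned by `download_revision("v5")` can then be passed to `mlx_lm.load(...)` or to `--model` in place of the repo id.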

## What's new in v5.1

A targeted data patch on top of v5, written against observed failure outputs:

- **CLI commands now register correctly** — the `CommandListInterface` `di.xml` wiring
  was drilled with dedicated examples and a wrong→right contrast pair.
- **`setup_version` no longer appears** in generated `module.xml` (the attribute is
  unnecessary once a module uses declarative schema).
- **Fewer plugin result-type mistakes** — more after-plugin examples where the `$result`
  type is *not* the obvious payload object (e.g. `Quote::addProduct` returns the quote
  *Item*), so the "mirror the return type" rule generalizes.
- **More total-collector coverage** with rotated task phrasings, plus contrast pairs for
  the exact hallucinations seen in v5 output (Magento-1-style `<sales><quote>` XML
  nesting; inventing frontend asset files for a totals task).

**Eval ladder** (six multi-file scaffolding tasks, scored deterministically by file-set
completeness, `php -l`, XML validation, and signature checks — fused weights, greedy
decoding): **v4 0/6 tasks (48/76 criteria) → v5 2/6 (69/74) → v5.1 3/6 (71/74)**.

> *Correction:* an earlier revision of this card reported v5 at "2/6, 64/74". Those
> numbers were accidentally measured against the **base model** (an adapter-serving
> pitfall in `mlx_lm.server`). The fused v5 actually scores **69/74**; every figure in
> the eval ladder above is measured on fused weights.
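
The deterministic scoring behind the eval ladder can be illustrated with a small sketch. This is not the harness used for the numbers above: `TaskScore`, `score_task`, and the rule that a task counts as solved only when every criterion passes are assumptions made for illustration. The XML check here only tests well-formedness, and the `php -l` and signature checks are left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class TaskScore:
    passed: int
    total: int

    @property
    def complete(self) -> bool:
        # Assumption: a task is solved only when every criterion passes.
        return self.passed == self.total


def xml_is_well_formed(text: str) -> bool:
    """Return True when the text parses as XML (well-formedness only)."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def score_task(generated: dict[str, str], required_paths: list[str]) -> TaskScore:
    """Score one task; `generated` maps a relative path to the file contents."""
    checks: list[bool] = []
    for path in required_paths:
        present = path in generated
        checks.append(present)  # criterion: the file exists in the output
        if path.endswith(".xml"):
            # criterion: the XML file exists and parses
            checks.append(present and xml_is_well_formed(generated[path]))
    return TaskScore(passed=sum(checks), total=len(checks))
```

Summing `passed` and `total` over all tasks gives a criteria count of the same shape as "71/74", while counting the tasks whose `complete` is true gives the "3/6" style figure.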

## What was new in v5

v5 retrained on an expanded dataset focused on **complete multi-file generation** — the
main weakness of earlier versions, which tended to emit a single PHP class and drop its
companion wiring XML:

- **File sets, not files:** plugins are trained *together with* their `di.xml`, observers
  with `events.xml`, themes as the registration + `theme.xml` + `composer.json` trio,
  CLI commands with their `CommandListInterface` registration, cron jobs with `crontab.xml`.
- **Contrast pairs for recurring API mistakes:** the after-plugin `$result` parameter is the
  intercepted method's *return* type; product `status`/`visibility` are EAV attributes
  (`addAttributeToFilter`), never raw columns on `catalog_product_entity`; a quote total
  collector must mutate the `Total` object; config XML is always wrapped in `<config>` with
  the correct `xsi:noNamespaceSchemaLocation`.
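
The "file sets, not files" rule lends itself to a quick check on generated output. The sketch below encodes the pairings listed above as a lookup table; the `COMPANIONS` keys and the `missing_companions` helper are hypothetical names chosen for illustration, and files are matched by base name only.

```python
from pathlib import PurePosixPath

# Companion files per artifact type, as described in the list above.
COMPANIONS: dict[str, tuple[str, ...]] = {
    "plugin": ("di.xml",),
    "observer": ("events.xml",),
    "cli_command": ("di.xml",),  # CommandListInterface registration
    "cron_job": ("crontab.xml",),
    "theme": ("registration.php", "theme.xml", "composer.json"),
}


def missing_companions(kind: str, paths: list[str]) -> list[str]:
    """Return the companion files for `kind` that are absent from `paths`."""
    names = {PurePosixPath(p).name for p in paths}
    return [name for name in COMPANIONS[kind] if name not in names]
```

For example, `missing_companions("plugin", ["Plugin/PricePlugin.php"])` reports `di.xml` as missing, which is exactly the single-class failure mode earlier versions showed.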

**Developed by:** [Navin D. Bhudiya](https://in.linkedin.com/in/navindbhudiya) — AI Engineer (11+ years in e-commerce at scale).

> ⚠️ **Apple Silicon only.** This is an **MLX** model and runs via [`mlx-lm`](https://github.com/ml-explore/mlx-lm)
> on Apple Silicon Macs. It is **not** loadable with `transformers`/vLLM/llama.cpp without conversion.

## What it's for

A locally-running, open-source code assistant that writes **house-style, modern Magento 2**.
A LoRA teaches **style / conventions**, not facts — so this model is good at *how* idiomatic
Magento code is structured, and is best paired with **RAG for factual grounding** (see Limitations).

Consistently produces:
- PHP 8 constructor property promotion + `readonly`, `declare(strict_types=1)`, typed signatures
- Plugins (not preferences) for behavior tweaks; observers + `events.xml`; correct `di.xml` wiring
- HTTP-verb action interfaces (`HttpGetActionInterface`/`HttpPostActionInterface`), result objects
- Service contracts (Api interfaces, repositories, `SearchCriteria`) over direct models
- Declarative schema (`db_schema.xml`) + data/schema patches over `InstallSchema`
- `$escaper` per context in `.phtml`; ViewModels over fat blocks
- Hyvä = Alpine.js + Tailwind (no jQuery/Knockout)

…and avoids the base model's anti-patterns (e.g. `setup_version`, `ObjectManager::getInstance()`,
Magento-1 `app/code/local`, and answering Magento questions in React/Vue/Java).

## ⚠️ Required inference settings

Always run this model with **sampling**. Greedy decoding (`temperature 0`) can fall into
repetition loops on long or complex generations.

- `temperature ≈ 0.3`, `top_p ≈ 0.95`
- `repetition_penalty ≈ 1.1`
- stops on `<|im_end|>` (baked into `config.json` / `generation_config.json` — `eos_token_id` includes `151645`)

### Run with mlx-lm

```bash
pip install mlx-lm

mlx_lm.generate --model navindbhudiya/qwen2.5-coder-7b-magento-v2 \
  --prompt "Create an observer for catalog_product_save_after." \
  --temp 0.3 --top-p 0.95 --max-tokens 400
```
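
The same settings can be applied from Python. This is a minimal sketch, assuming a recent `mlx-lm` release that provides `make_sampler` and `make_logits_processors` in `mlx_lm.sample_utils`; `build_messages` and `generate_code` are hypothetical helper names.

```python
MODEL_ID = "navindbhudiya/qwen2.5-coder-7b-magento-v2"
SAMPLING = {"temp": 0.3, "top_p": 0.95}  # values recommended above
REPETITION_PENALTY = 1.1


def build_messages(prompt: str) -> list[dict[str, str]]:
    """Wrap a plain prompt as a single-turn chat conversation."""
    return [{"role": "user", "content": prompt}]


def generate_code(prompt: str, max_tokens: int = 400) -> str:
    """Generate a completion with sampling and a repetition penalty."""
    # Imported lazily: mlx-lm only installs and runs on Apple Silicon.
    from mlx_lm import generate, load
    from mlx_lm.sample_utils import make_logits_processors, make_sampler

    model, tokenizer = load(MODEL_ID)
    chat_prompt = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True
    )
    return generate(
        model,
        tokenizer,
        prompt=chat_prompt,
        max_tokens=max_tokens,
        sampler=make_sampler(**SAMPLING),
        logits_processors=make_logits_processors(
            repetition_penalty=REPETITION_PENALTY
        ),
    )
```

Calling `generate_code("Create an observer for catalog_product_save_after.")` mirrors the CLI invocation above and adds the repetition penalty.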

### Serve an OpenAI-compatible API

```bash
mlx_lm.server --model navindbhudiya/qwen2.5-coder-7b-magento-v2 --port 8080 \
  --host 127.0.0.1 --temp 0.3 --top-p 0.95
```
```bash
curl -s http://127.0.0.1:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{
  "model":"navindbhudiya/qwen2.5-coder-7b-magento-v2",
  "messages":[{"role":"user","content":"Build a Hyva accordion component with Alpine.js and Tailwind."}],
  "max_tokens":512, "temperature":0.3, "top_p":0.95, "repetition_penalty":1.1
}'
```
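
A standard-library Python client for the same endpoint might look like the sketch below. It assumes the server started above is listening on `127.0.0.1:8080` and returns the usual OpenAI-style `choices[0].message.content` shape; `build_payload` and `chat` are hypothetical helper names.

```python
import json
import urllib.request

MODEL_ID = "navindbhudiya/qwen2.5-coder-7b-magento-v2"


def build_payload(prompt: str, max_tokens: int = 512) -> dict:
    """Build a chat-completions request body with the recommended settings."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.3,
        "top_p": 0.95,
        "repetition_penalty": 1.1,
    }


def chat(prompt: str, base_url: str = "http://127.0.0.1:8080") -> str:
    """POST the prompt to the local server and return the assistant text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        body = json.load(response)
    # Assumption: OpenAI-compatible response layout.
    return body["choices"][0]["message"]["content"]
```

Keeping the payload in one function makes it harder to drop the sampling parameters by accident when the client is reused elsewhere.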

## Training (v5.1)

- **Method:** QLoRA via `mlx-lm` — **8 layers, rank 8, dropout 0.1**, optimizer **AdamW
  (`weight_decay 0.01`)**, **cosine-decay LR** (peak `1e-4`, 50-step warmup → `1e-6`),
  `batch-size 1`, `max-seq-length 2048`, **720 iters**. The validation curve is noisy on
  the small valid set, so the shipped checkpoint was selected by the **downstream task
  eval** (the six-task scorecard above), not val loss alone.
- **Data:** **642** instruction→answer chat pairs (**578 train / 64 valid**) teaching **generic, idiomatic
  Magento 2 conventions** across backend, frontend/Hyvä, database, performance, error-handling, and tooling —
  including complete multi-file scaffolding exemplars and wrong→rule→correct contrast pairs (see above).
  Every example is length-validated with the real tokenizer + chat template (no silent truncation).
- **Regularization:** fewer layers + dropout + weight decay + cosine decay keep the training loss from
  collapsing and hold validation flat — favoring generalization over memorization.
- **Hardware:** trained on a **24 GB fanless Apple Silicon MacBook Air**.
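
To make the learning-rate schedule concrete, the sketch below computes the rate at a given step. It assumes a linear warmup over the first 50 steps followed by cosine decay across the remaining 670 steps; the exact warmup shape used by `mlx-lm` may differ, and `learning_rate` is a hypothetical helper, not a library function.

```python
import math


def learning_rate(
    step: int,
    peak: float = 1e-4,
    floor: float = 1e-6,
    warmup: int = 50,
    total: int = 720,
) -> float:
    """Linear warmup to `peak`, then cosine decay down to `floor`."""
    if step < warmup:
        # Assumption: warmup rises linearly and reaches the peak at the last warmup step.
        return peak * (step + 1) / warmup
    progress = min((step - warmup) / max(1, total - warmup), 1.0)
    return floor + 0.5 * (peak - floor) * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate reaches `1e-4` by step 50, passes the midpoint of the decay at step 385, and ends at `1e-6` on step 720.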

## Limitations

- **Facts vs style:** as a style-focused LoRA, it can confabulate *long-tail facts*. Observed cases
  where the raw model is unreliable (and **RAG fixes them**): the fix for the CLI **"Area code is not set"**
  error (it may invent `AreaRegistryInterface::setCurrentArea`; the real API is `App\State::setAreaCode`),
  **CSRF** handling (it may invent interfaces instead of using `App\CsrfAwareActionInterface`), choosing
  **`InputException` vs `LocalizedException`** for invalid input, and occasionally re-introducing an
  **N+1** query by calling `getById()` inside a loop over rows it already loaded. **Pair with RAG** for
  ground-truth API details; trust the model for structure/idioms.
- **Verify generated total collectors:** v5.1 reliably emits the canonical
  `collect(Quote, ShippingAssignmentInterface, Total)` signature and `$total` mutation,
  but may still omit the `fetch(Quote, Total)` method that displays the line in the
  totals block — add it by hand if missing.
- **Greedy decoding loops:** see Required inference settings — always use sampling + a repetition penalty.
- **Apple/MLX only.**
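
Pairing the model with RAG can be as simple as prepending retrieved facts to the conversation. The sketch below uses a naive keyword lookup as a stand-in for a real retriever; the `FACTS` table holds only the two API facts named above, and `retrieve` and `grounded_messages` are hypothetical helper names.

```python
# Tiny stand-in for a document store: keyword -> verified fact.
FACTS: dict[str, str] = {
    "area code": (
        r"In CLI code, set the area with Magento\Framework\App\State::setAreaCode."
    ),
    "csrf": (
        r"Custom CSRF validation on a controller is implemented via "
        r"Magento\Framework\App\CsrfAwareActionInterface."
    ),
}


def retrieve(question: str) -> list[str]:
    """Return every fact whose keyword occurs in the question."""
    lowered = question.lower()
    return [fact for keyword, fact in FACTS.items() if keyword in lowered]


def grounded_messages(question: str) -> list[dict[str, str]]:
    """Build chat messages, adding a system turn when facts were found."""
    messages: list[dict[str, str]] = []
    facts = retrieve(question)
    if facts:
        bullet_list = "\n".join(f"- {fact}" for fact in facts)
        messages.append(
            {
                "role": "system",
                "content": "Use these verified Magento facts:\n" + bullet_list,
            }
        )
    messages.append({"role": "user", "content": question})
    return messages
```

The resulting list can be sent as the `messages` field of a chat-completions request, so the model supplies structure and idioms while the retrieved text supplies the API names.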

## Versions

| revision | notes |
|---|---|
| `main` / `v5.1` | 642-pair dataset, targeted patch (CLI registration, setup_version, $result types, collectors) — eval 3/6 tasks, 71/74 criteria |
| `v5` | 618-pair dataset, multi-file scaffolding + contrast pairs — eval 2/6 tasks, 69/74 criteria |
| `v4` | 478-pair dataset, anti-overfit recipe (8 layers, 500 iters) — eval 0/6 tasks, 48/76 criteria |
| `v2` | 412-pair dataset, first public release |

## Privacy / data provenance

Trained **only on anonymized, generic Magento patterns** authored from general knowledge. **No client/
company/project names, no proprietary business logic, no secrets/keys/credentials, no PII** are present in
the training data (validated with automated leak + denylist scans before training).

## Author

**[Navin D. Bhudiya](https://in.linkedin.com/in/navindbhudiya)** — AI Engineer with **11+ years shipping
e-commerce at scale**. Builds production **RAG, LLM agents, and intelligent search on AWS** (Claude API,
Amazon Bedrock, LangGraph, vector DBs). **AWS + Anthropic certified.**

Designed, trained, evaluated, and released this Magento 2 + Hyvä code model (dataset curation → QLoRA
fine-tuning on Apple Silicon → checkpoint selection → evaluation → RAG grounding).
Connect: [linkedin.com/in/navindbhudiya](https://in.linkedin.com/in/navindbhudiya)

## License & attribution

Finetuned from **Qwen2.5-Coder-7B-Instruct** (Apache-2.0); base MLX conversion:
`mlx-community/Qwen2.5-Coder-7B-Instruct-4bit`. This derivative is released under **Apache-2.0**.