---
language:
- en
license: apache-2.0
base_model: mistralai/Mistral-7B-v0.3
library_name: peft
pipeline_tag: text-generation
tags:
- text-to-sql
- sql
- code-generation
- mistral
- qlora
- lora
- peft
- nl2sql
datasets:
- spider
- b-mc2/sql-create-context
---

# SQLForge-Mistral-7B-QLoRA

**SQLForge** is a QLoRA adapter for [`mistralai/Mistral-7B-v0.3`](https://huggingface.co/mistralai/Mistral-7B-v0.3)
that forges natural-language questions into executable SQL. Trained on a mixture of
**Spider** and **b-mc2/sql-create-context** (~90k examples), it brings exact-match
accuracy on an internal text-to-SQL test split from **9.2% → 87.0%** over the base model,
while keeping the adapter footprint under **340 MB**.

## Results

Evaluated on a held-out internal text-to-SQL test set (500 examples, schema-aware prompt):

| Model                          | Exact Set Match | Valid SQL |
| ------------------------------ | --------------- | --------- |
| `mistralai/Mistral-7B-v0.3` (base) | 9.2%        | 86.2%     |
| **SQLForge-Mistral-7B-QLoRA**  | **87.0%**       | **97.4%** |

A **9.5× improvement** in exact-match accuracy and a **+11.2 pp** lift in syntactic validity,
delivered by a 336 MB LoRA adapter.

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE  = "mistralai/Mistral-7B-v0.3"
ADAPT = "shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA"

tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPT)
model.eval()

schema   = "CREATE TABLE employees (id INT, name TEXT, dept TEXT, salary REAL);"
question = "List the names of employees in the Engineering department earning over 100000."

prompt = f"""[INST] You are an expert SQL assistant. Given the schema and question, write a single SQL query.

Schema:
{schema}

Question: {question} [/INST]"""

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

### 4-bit inference (single 12 GB GPU)

```python
from transformers import BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPT)
```

## Training Details

- **Base model:** `mistralai/Mistral-7B-v0.3`
- **Method:** QLoRA (4-bit NF4 base, BF16 LoRA adapters) via `trl.SFTTrainer`
- **Datasets:**
  - [`spider`](https://huggingface.co/datasets/spider) — cross-domain text-to-SQL
  - [`b-mc2/sql-create-context`](https://huggingface.co/datasets/b-mc2/sql-create-context) — schema-grounded SQL
  - Combined: **90,719 train / 3,929 val rows**
- **Prompt format:** `[INST] … [/INST]` (Mistral instruction template). Loss is masked to
  apply only on tokens **after** `[/INST]` (the SQL response).
- **Max sequence length:** 512 tokens

### LoRA configuration

| Setting        | Value                                                       |
| -------------- | ----------------------------------------------------------- |
| Rank `r`       | 64                                                          |
| Alpha          | 16 (scale = α/r = 0.25, conservative — preserves reasoning) |
| Dropout        | 0.05                                                        |
| RSLoRA         | enabled (1/√r scaling for r=64 stability)                   |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Bias           | none                                                        |

### Training hyperparameters

| Setting              | Value          |
| -------------------- | -------------- |
| Epochs               | 3              |
| Effective batch size | 8 (1 × 8 grad-accum) |
| Learning rate        | 2e-4           |
| LR scheduler         | cosine         |
| Warmup ratio         | 0.03           |
| Precision            | BF16           |
| Gradient checkpointing | enabled      |
| Optimizer            | AdamW (transformers default) |
| Checkpoint           | `checkpoint-34020` (final) |

### Hardware

- **GPU:** 1× NVIDIA RTX 3080 Ti (12 GB VRAM)
- **Peak VRAM:** ~8.25 GB (4-bit NF4 base + BF16 adapters + grad-ckpt)
- **Training time:** ~4–8 hours

## Limitations

- **Schema sensitivity:** trained with explicit `CREATE TABLE` schemas in the prompt;
  performance degrades without schema context.
- **Single-query bias:** optimized for single SELECT statements; multi-statement scripts,
  DDL, and procedural SQL are out of scope.
- **Dialect:** training data is primarily SQLite/ANSI; vendor-specific syntax
  (T-SQL, PL/pgSQL, BigQuery functions) may not generalize.
- **English-only:** prompts and questions are English; non-English NL queries untested.
- **No execution safety:** generated SQL should be sandboxed/parameterized before running
  against production databases.

## Evaluation

The metrics above use `exact_set_match` (gold and predicted normalized SQL match exactly
as token sets) and `valid_sql_pct` (predicted SQL parses with `sqlparse`/sqlite3).
See `evaluate_sql.py` in the [training repository](https://github.com/shreyash-Pandey-Katni/preTrainLLM)
for the full eval harness.

## Files

| File                          | Purpose                            |
| ----------------------------- | ---------------------------------- |
| `adapter_model.safetensors`   | LoRA weights (~336 MB)             |
| `adapter_config.json`         | PEFT/LoRA configuration            |
| `tokenizer.json` / `tokenizer_config.json` | Mistral tokenizer       |
| `optimizer.pt`, `scheduler.pt`, `rng_state.pth`, `training_args.bin`, `trainer_state.json` | Training-state snapshot (resume-capable) |

## License

Released under **Apache 2.0**, the same license as the base model
`mistralai/Mistral-7B-v0.3`.

## Citation

```bibtex
@misc{sqlforge_mistral7b_qlora_2026,
  title  = {SQLForge-Mistral-7B-QLoRA: A QLoRA Text-to-SQL Adapter for Mistral-7B-v0.3},
  author = {Shreyash Pandey Katni},
  year   = {2026},
  url    = {https://huggingface.co/shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA}
}
```