Qwen2.5-3B Teacher OCR Rebuilder
This repository publishes a LoRA adapter for a narrow OCR post-processing task in a teacher-assistant workflow.
It is not a solver and not a teaching model.
It is a conservative pre-processor that rewrites noisy OCR text into a JSON object that downstream services can safely consume.
Task
Input: raw OCR text from a math exam question
Output:
{
  "stem": "cleaned problem statement",
  "answer_raw": "raw answer if clearly visible, otherwise empty",
  "solution_raw": "",
  "ocr_notes": ["risk tag 1", "risk tag 2"]
}
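Downstream services may want to check this schema before consuming the output. The following is a minimal sketch of such a check, not code from this repository: the function name `validate_rebuild` and the sample risk tag are made up for illustration, and the type rules (three string fields, one list of strings) are inferred from the example object above.

```python
import json

# Field names taken from the output example above.
REQUIRED_FIELDS = ("stem", "answer_raw", "solution_raw", "ocr_notes")


def validate_rebuild(text: str) -> dict:
    """Parse model output and check it against the four-field schema."""
    obj = json.loads(text)  # raises json.JSONDecodeError (a ValueError) if malformed
    if not isinstance(obj, dict):
        raise ValueError("expected a single JSON object")
    missing = [k for k in REQUIRED_FIELDS if k not in obj]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for key in ("stem", "answer_raw", "solution_raw"):
        if not isinstance(obj[key], str):
            raise ValueError(f"{key} must be a string")
    notes = obj["ocr_notes"]
    if not isinstance(notes, list) or not all(isinstance(n, str) for n in notes):
        raise ValueError("ocr_notes must be a list of strings")
    return obj


# Illustrative sample only; the tag name is hypothetical.
sample = (
    '{"stem": "Solve x + 1 = 3.", "answer_raw": "", '
    '"solution_raw": "", "ocr_notes": ["answer_not_visible"]}'
)
checked = validate_rebuild(sample)
print(checked["stem"])
```

Rejecting malformed objects at this boundary keeps parse failures from propagating into the teaching chain.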
Intended boundary
This adapter is designed to sit as a separate stage in the pipeline:
OCR -> OCR rebuilder -> existing GPT teaching chain
It should:
- improve `stem`
- improve `answer_raw`
- reduce hallucinated answers
- add conservative OCR risk notes
It should not:
- replace your main GPT explanation model
- solve the math problem
- generate a polished `solution_raw`
At the current stage, solution_raw is intentionally kept empty.
Why this adapter exists
The base model can often emit valid JSON, but it tends to:
- hallucinate answers when the gold answer is empty
- drift away from the intended field semantics
- produce text beyond the strict OCR rebuild task
This adapter is optimized for more conservative behavior.
Main test comparison
Evaluation setting:
- base model: `Qwen/Qwen2.5-3B-Instruct`
- adapter: current best `stage-1` protocol-only LoRA
- prompt: conservative non-solver prompt
- generation: `max_new_tokens=192`
- test set: 30 held-out samples
Metric table
| Metric | Base model | Stage-1 adapter |
|---|---|---|
| JSON parse rate | 80.00% | 76.67% |
| `stem` exact match | 0.00% | 16.67% |
| `answer_raw` exact match | 16.67% | 60.00% |
| empty-answer hallucination | 23.33% | 0.00% |
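The evaluation script is not part of this card, so the exact metric definitions are not stated. As a rough sketch under assumptions, the four numbers could be computed as below: every rate is taken over all samples, an unparseable output counts as a miss for both exact-match metrics, and a hallucination means the gold `answer_raw` is empty while the prediction is not. The function names and the toy data are illustrative only.

```python
import json


def try_parse(text):
    """Return the parsed JSON object, or None if the output is not a JSON object."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def score(outputs, golds):
    """Compute the four table metrics as fractions of all samples (assumed definitions)."""
    n = len(golds)
    parsed = [try_parse(o) for o in outputs]
    pairs = list(zip(parsed, golds))
    parse_ok = sum(p is not None for p in parsed)
    stem_em = sum(p is not None and p.get("stem") == g["stem"] for p, g in pairs)
    answer_em = sum(
        p is not None and p.get("answer_raw") == g["answer_raw"] for p, g in pairs
    )
    # Hallucination: gold answer is empty but the model still produced one.
    halluc = sum(
        p is not None
        and g["answer_raw"] == ""
        and str(p.get("answer_raw") or "").strip() != ""
        for p, g in pairs
    )
    return {
        "json_parse_rate": parse_ok / n,
        "stem_exact_match": stem_em / n,
        "answer_raw_exact_match": answer_em / n,
        "empty_answer_hallucination": halluc / n,
    }


# Toy data, not taken from the real test set.
golds = [
    {"stem": "A", "answer_raw": ""},
    {"stem": "B", "answer_raw": "3"},
    {"stem": "C", "answer_raw": ""},
    {"stem": "D", "answer_raw": "7"},
]
outputs = [
    '{"stem": "A", "answer_raw": "", "solution_raw": "", "ocr_notes": []}',
    '{"stem": "B.", "answer_raw": "3", "solution_raw": "", "ocr_notes": []}',
    '{"stem": "C", "answer_raw": "5", "solution_raw": "", "ocr_notes": []}',
    "not json",
]
metrics = score(outputs, golds)
print(metrics)
```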
Visual comparison
JSON parse rate
Base model 80.00% ████████████████
Stage-1 adapter 76.67% ███████████████
answer_raw exact match
Base model 16.67% ███
Stage-1 adapter 60.00% ████████████
empty-answer hallucination (lower is better)
Base model 23.33% █████
Stage-1 adapter 0.00%
Semantic fidelity
We also measured average character-level similarity against gold labels on the same held-out test set.
| Field | Base model | Stage-1 adapter |
|---|---|---|
| `stem` avg similarity | 0.4898 | 0.7217 |
| `answer_raw` avg similarity | 0.4058 | 0.6667 |
| `ocr_notes` avg similarity | 0.1597 | 0.2391 |
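The card does not say which character-level similarity was used. One plausible stand-in is the ratio from Python's standard `difflib.SequenceMatcher`, sketched below as an assumption rather than the actual evaluation code; the helper names are hypothetical.

```python
from difflib import SequenceMatcher


def char_similarity(pred: str, gold: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings."""
    # autojunk=False keeps long stems from triggering difflib's popularity heuristic.
    return SequenceMatcher(None, pred, gold, autojunk=False).ratio()


def average_similarity(pairs) -> float:
    """Mean similarity over (prediction, gold) string pairs."""
    scores = [char_similarity(p, g) for p, g in pairs]
    return sum(scores) / len(scores)


# Toy pairs, not taken from the real test set.
toy_pairs = [("abc", "abc"), ("abcd", "abxd")]
avg = average_similarity(toy_pairs)
print(avg)
```

A different measure, such as normalized edit distance, would give different absolute values, so the table's numbers should only be compared against scores produced the same way.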
Visual comparison
stem average similarity
Base model 0.4898 ██████████
Stage-1 adapter 0.7217 ██████████████
answer_raw average similarity
Base model 0.4058 ████████
Stage-1 adapter 0.6667 █████████████
What this means
The adapter gives up a small amount of parse rate (80.00% to 76.67%, which is one sample on the 30-sample test set), but buys back the behaviors that matter most for this task:
- much better `answer_raw`
- much better `stem`
- zero hallucinated answers on gold-empty cases
For an OCR rebuilding module that feeds a larger teaching system, this tradeoff is usually worth it.
Dataset summary
The project used two task buckets during development:
- `single_problem_rebuild`: 204 synthetic/curated samples
- `multi_problem_fragment_rebuild`: 102 synthetic/curated samples
The released adapter comes from a stage-1 protocol-only training setup that focused on:
- one JSON object only
- fixed field schema
- conservative extraction
- no `solution_raw` generation
Stage-1 smoke subset:
- train: 32
- dev: 8
Known limitations
- `solution_raw` is intentionally weak and currently fixed to empty.
- `ocr_notes` is helpful but not yet fully normalized.
- Multi-problem mixed fragments are harder than single-problem OCR cleanup.
- This is a task adapter, not a general OCR foundation model.
Deployment
This repository includes a handler.py for Hugging Face Inference Endpoints custom deployment.
Recommended input:
{
  "inputs": "raw OCR text"
}
Recommended output:
{
  "stem": "...",
  "answer_raw": "...",
  "solution_raw": "",
  "ocr_notes": ["..."],
  "meta": {
    "raw_ocr_notes": ["model raw notes"]
  }
}
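The contents of `handler.py` are not shown in this card. Purely as an illustration of the recommended output shape, the sketch below wraps a parsed model object into that shape, keeps the model's original notes under `meta.raw_ocr_notes`, and forces `solution_raw` to empty. The normalization rule (trim, lowercase, join words with underscores, drop duplicates) and both function names are assumptions, not the repository's actual logic.

```python
def normalize_note(note: str) -> str:
    """Hypothetical normalization: trim, lowercase, join words with underscores."""
    return "_".join(note.strip().lower().split())


def to_response(rebuilt: dict) -> dict:
    """Shape a parsed model object into the recommended endpoint output."""
    raw_notes = [str(n) for n in (rebuilt.get("ocr_notes") or [])]
    seen, notes = set(), []
    for raw in raw_notes:
        tag = normalize_note(raw)
        if tag and tag not in seen:  # drop empty tags and duplicates, keep order
            seen.add(tag)
            notes.append(tag)
    return {
        "stem": str(rebuilt.get("stem", "")),
        "answer_raw": str(rebuilt.get("answer_raw", "")),
        "solution_raw": "",  # kept empty at the current stage
        "ocr_notes": notes,
        "meta": {"raw_ocr_notes": raw_notes},
    }


# Illustrative input; the note texts are made up.
response = to_response(
    {
        "stem": "S",
        "answer_raw": "3",
        "solution_raw": "some text",
        "ocr_notes": ["Blurry Digit", " blurry digit ", "missing option"],
    }
)
print(response["ocr_notes"])
```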
Local usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model = "Qwen/Qwen2.5-3B-Instruct"
adapter_model = "maru979/qwen2.5-3b-teacher-ocr-rebuilder"

# Load the base tokenizer and model, then attach the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()  # inference mode: disables dropout
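With `model` and `tokenizer` loaded as above, one rebuild could be generated roughly as follows. This is a sketch under assumptions: the conservative prompt used for training and evaluation is not given in this card, so `SYSTEM_PROMPT` is a placeholder, and the helper names are hypothetical. Only `max_new_tokens=192` comes from the evaluation setting.

```python
import json

# Placeholder prompt: the exact training/evaluation prompt is not given in this README.
SYSTEM_PROMPT = (
    "You rebuild noisy OCR text of a math exam question into one JSON object "
    "with keys stem, answer_raw, solution_raw, ocr_notes. Do not solve the problem."
)


def build_messages(ocr_text: str) -> list:
    """Chat messages for one OCR rebuild request."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": ocr_text},
    ]


def extract_json(text: str):
    """Parse a JSON object starting at the first '{'; return None on failure."""
    start = text.find("{")
    if start == -1:
        return None
    try:
        obj, _ = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def generate_rebuild(model, tokenizer, ocr_text: str, max_new_tokens: int = 192):
    """Greedy-decode one rebuild with an already loaded model and tokenizer."""
    prompt = tokenizer.apply_chat_template(
        build_messages(ocr_text), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return extract_json(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Calling `generate_rebuild(model, tokenizer, "raw OCR text")` returns a dict, or `None` when the output does not parse, which callers should handle given the parse rate reported above.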
If you deploy this adapter as an endpoint, prefer the included handler.py instead of directly exposing raw generation.