license: afl-3.0
pipeline_tag: image-to-text
tags:
  - math
  - ocr
  - handwriting
  - trocr
  - gguf
  - crispembed
base_model: fhswf/TrOCR_Math_handwritten

TrOCR Math Handwritten — GGUF

GGUF weights for handwritten math equation recognition, run through CrispEmbed's ggml inference engine.

Source model: fhswf/TrOCR_Math_handwritten (AFL-3.0)

Architecture: ViT-Large encoder (24 layers, hidden size 1024, 16 attention heads) + TrOCR-Large decoder (12 layers, hidden size 1024, 16 attention heads, vocabulary size 50265)

Models

| File | Size | Description |
|------|------|-------------|
| trocr-math-handwritten-fp16.gguf | 1.2 GB | FP16 weights |
| trocr-math-handwritten-q4_k.gguf | 355 MB | Q4_K quantized |
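
As a rough sanity check on the two file sizes, the sketch below estimates the effective bits per weight of the quantized file. It assumes decimal units (1 GB = 10^9 bytes), that the FP16 file is dominated by 16-bit weights, and that the rounded sizes listed above are close enough for a ballpark figure. The function name is hypothetical and not part of CrispEmbed.

```python
FP16_BITS_PER_WEIGHT = 16


def effective_bits_per_weight(quant_bytes: float, fp16_bytes: float) -> float:
    """Estimate bits per weight of a quantized file relative to its FP16 counterpart."""
    return FP16_BITS_PER_WEIGHT * quant_bytes / fp16_bytes


# Sizes as listed for the two files (rounded, decimal units assumed).
fp16_size = 1.2e9   # trocr-math-handwritten-fp16.gguf
q4k_size = 355e6    # trocr-math-handwritten-q4_k.gguf

ratio = q4k_size / fp16_size
bits = effective_bits_per_weight(q4k_size, fp16_size)
print(f"size ratio: {ratio:.2f}, effective bits per weight: {bits:.1f}")
```

The estimate comes out at about 4.7 bits per weight, slightly above the nominal 4.5 bits per weight of ggml's Q4_K block format. A gap like this would be expected if some tensors are stored at higher precision, though the card does not say which tensors are quantized.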

Verified Output

| Input | Recognized |
|-------|------------|
| x+y=5 (hand-drawn) | X + Y = |
| x²+1 (printed) | x^{2} |
| 3x+7=0 (printed) | 3x+7 |
| + (cross pattern) | + |

Usage with CrispEmbed

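// Load the quantized model; the second argument (4) is presumably the thread count.
math_ocr_context* ctx = math_ocr_init("trocr-math-handwritten-q4_k.gguf", 4);
// gray_pixels is a grayscale image buffer of width x height pixels; len receives the output length.
const char* latex = math_ocr_recognize(ctx, gray_pixels, width, height, &len);
// Release the context when done.
math_ocr_free(ctx);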

Conversion

Converted from the Hugging Face safetensors checkpoint using CrispEmbed's torch-free converter:

python models/convert-trocr-safetensors-to-gguf.py \
  --model-dir /path/to/fhswf/TrOCR_Math_handwritten \
  --output trocr-math-handwritten-fp16.gguf --fp16
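
After conversion, a quick way to confirm that the output is a GGUF file is to read its fixed-size header. The sketch below uses only the Python standard library and assumes a little-endian GGUF file of version 2 or later, where the header is the magic bytes `GGUF`, a 32-bit version, a 64-bit tensor count, and a 64-bit metadata key/value count. The helper name `read_gguf_header` is hypothetical and not part of the converter.

```python
import struct


def read_gguf_header(path: str) -> dict:
    """Read the magic, version, and counts from the start of a GGUF file."""
    with open(path, "rb") as f:
        raw = f.read(24)  # 4 bytes magic + uint32 + uint64 + uint64
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<" selects little-endian with no padding: uint32, uint64, uint64.
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Calling `read_gguf_header("trocr-math-handwritten-fp16.gguf")` should return the format version along with the number of tensors and metadata entries the converter wrote.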

Key converter features:

  • No PyTorch dependency (uses safetensors + gguf + numpy only)
  • Generates sinusoidal position embeddings (not stored in checkpoint)
  • Squeezes 3D batch dimensions, reshapes 4D conv weights
  • Preserves AFL-3.0 license in GGUF metadata
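
Two of the features above, generated sinusoidal position embeddings and shape normalization, can be illustrated with NumPy alone. This is a minimal sketch rather than the converter's actual code. It uses the textbook interleaved sin/cos layout, while the converter may use a different layout (for example, concatenated sin and cos halves). Flattening a 4D conv kernel to 2D is one plausible reading of "reshapes 4D conv weights". All function names and example shapes are hypothetical.

```python
import numpy as np


def sinusoidal_positions(num_positions: int, dim: int) -> np.ndarray:
    """Textbook sinusoidal table: even columns hold sin, odd columns hold cos."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    positions = np.arange(num_positions, dtype=np.float64)[:, None]
    # Inverse frequencies 1 / 10000^(2i/dim) for i = 0 .. dim/2 - 1.
    inv_freq = np.exp(-np.log(10000.0) * np.arange(0, dim, 2, dtype=np.float64) / dim)
    angles = positions * inv_freq[None, :]
    table = np.empty((num_positions, dim), dtype=np.float32)
    table[:, 0::2] = np.sin(angles)
    table[:, 1::2] = np.cos(angles)
    return table


def squeeze_batch_dim(tensor: np.ndarray) -> np.ndarray:
    """Drop a leading batch axis of size 1 from a 3D tensor: (1, N, D) -> (N, D)."""
    if tensor.ndim == 3 and tensor.shape[0] == 1:
        return tensor[0]
    return tensor


def flatten_conv_weight(weight: np.ndarray) -> np.ndarray:
    """Flatten a 4D conv kernel (out, in, kh, kw) to 2D (out, in * kh * kw)."""
    if weight.ndim != 4:
        raise ValueError("expected a 4D conv weight")
    return weight.reshape(weight.shape[0], -1)
```

For example, a position-embedding tensor of shape (1, 577, 1024) would become (577, 1024), and a patch-embedding kernel of shape (1024, 3, 16, 16) would become (1024, 768). These shapes are illustrative and were not read from the checkpoint.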

License

Academic Free License v3.0 (inherited from source model)