Open Vision, Layout & OCR Models by Loay
Collection
This collection hosts a series of Vision Language Models (VLMs) fine-tuned for Optical Character Recognition (OCR) and Document Processing. • 5 items • Updated • 1
How to use loay/Arabic-OCR-Qwen2.5-VL-7B-Vision with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# 'pip install "transformers<5.0.0'
from transformers import pipeline
pipe = pipeline("image-to-text", model="loay/Arabic-OCR-Qwen2.5-VL-7B-Vision") # Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText
processor = AutoProcessor.from_pretrained("loay/Arabic-OCR-Qwen2.5-VL-7B-Vision")
model = AutoModelForImageTextToText.from_pretrained("loay/Arabic-OCR-Qwen2.5-VL-7B-Vision")How to use loay/Arabic-OCR-Qwen2.5-VL-7B-Vision with Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for loay/Arabic-OCR-Qwen2.5-VL-7B-Vision to start chatting
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for loay/Arabic-OCR-Qwen2.5-VL-7B-Vision to start chatting
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for loay/Arabic-OCR-Qwen2.5-VL-7B-Vision to start chatting
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
model_name="loay/Arabic-OCR-Qwen2.5-VL-7B-Vision",
max_seq_length=2048,
)This repository contains the float16 merged version of a Vision-Language Model (VLM), fine-tuned by loay for the specific task of performing Optical Character Recognition (OCR) on Arabic text from images.
The model was created by fine-tuning the unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit model using LoRA adapters. The high-performance training was made possible by the Unsloth library, and the adapters were then merged back into the base model for easy deployment.
unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bitfloat16 precision model, ideal for inference on GPUs with sufficient VRAM (requires >14GB).Base model
Qwen/Qwen2.5-VL-7B-Instruct