---
language: pt
license: mit
tags:
- ocr
- optical-character-recognition
- text-recognition
- trocr
- vision
- vision-and-language
datasets:
- mazafard/portugues_ocr_dataset_full
model-index:
- name: trocr-finetuned-portugues
  results:
  - task:
      type: optical-character-recognition
      name: Optical Character Recognition
    dataset:
      type: mazafard/portugues_ocr_dataset_full
      name: portugues_ocr_dataset_full
      args: default
    metrics:
    - type: cer
      value: 0.01
      name: Character Error Rate
    - type: wer
      value: 0.05
      name: Word Error Rate
base_model:
- microsoft/trocr-base-printed
new_version: mazafard/trocr-finetuned_20250422_125947
---

# TrOCR Fine-tuned for Portuguese

This model is a fine-tuned version of [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed) for Optical Character Recognition (OCR) of Portuguese text. It was trained on the [mazafard/portugues_ocr_dataset_full](https://huggingface.co/datasets/mazafard/portugues_ocr_dataset_full) dataset, which contains images of Portuguese text paired with their transcriptions.


## Model Description

- **Architecture:** TrOCR (Transformer-based Optical Character Recognition)
- **Base Model:** microsoft/trocr-base-printed
- **Training Data:** mazafard/portugues_ocr_dataset_full
- **Language:** Portuguese (pt)


## Intended Uses & Limitations

This model is intended for OCR on printed Portuguese text. It may not perform well on handwritten text or on text in other languages. Although fine-tuning gave promising results on the evaluation set, the model can still make errors, especially on complex or low-quality images.

## Training and Evaluation Data

The model was trained on the `mazafard/portugues_ocr_dataset_full` dataset, which pairs images of Portuguese text with their transcriptions. The images were preprocessed and augmented to improve performance and generalization.

The model was evaluated on a held-out portion of this same dataset, achieving the following results:

- **Character Error Rate (CER):** 0.01
- **Word Error Rate (WER):** 0.05 (provisional; this value may vary and needs to be updated)
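
For readers unfamiliar with the two metrics, here is a minimal sketch of how CER and WER are conventionally defined: the Levenshtein edit distance between prediction and reference, divided by the reference length in characters or words. The function names `edit_distance`, `cer`, and `wer` are illustrative, and this is not necessarily the script used to produce the numbers above. Read as fractions, a CER of 0.01 would mean roughly one character error per 100 reference characters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical example: the prediction drops two diacritics
reference = "São Paulo é grande"
hypothesis = "Sao Paulo e grande"
print(f"CER: {cer(reference, hypothesis):.3f}")
print(f"WER: {wer(reference, hypothesis):.3f}")
```

Because a single wrong character makes the whole word count as an error, WER is usually higher than CER for the same prediction, which is consistent with the relationship between the two values reported above.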

## How to Use

```python
from transformers import VisionEncoderDecoderModel, TrOCRProcessor, pipeline

# Load the model and processor
processor = TrOCRProcessor.from_pretrained("mazafard/trocr-finetuned-portugues")
model = VisionEncoderDecoderModel.from_pretrained("mazafard/trocr-finetuned-portugues")

# Create an OCR pipeline; the processor is passed as its two components,
# which pipeline() accepts as separate arguments
ocr_pipeline = pipeline(
    "image-to-text",
    model=model,
    tokenizer=processor.tokenizer,
    image_processor=processor.image_processor,
)

# Perform OCR on an image
image_path = "path/to/your/image.jpg"
result = ocr_pipeline(image_path)

# The pipeline returns a list of dicts, one per generated sequence
predicted_text = result[0]["generated_text"]
print(predicted_text)
```

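
If you prefer not to go through `pipeline`, the same checkpoint can be called directly with the processor and `generate`, which is the usual pattern for TrOCR models. The sketch below is an illustration rather than the author's reference code: the helper names `recognize_text` and `run_example` are hypothetical, and it assumes `transformers`, `torch`, and `Pillow` are installed.

```python
MODEL_ID = "mazafard/trocr-finetuned-portugues"


def recognize_text(image, processor, model):
    """Run OCR on a single RGB image and return the decoded string."""
    # Resize and normalize the image into the tensor the encoder expects
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # Autoregressively generate token IDs from the image features
    generated_ids = model.generate(pixel_values)
    # Convert token IDs back to text, dropping special tokens
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


def run_example(image_path="path/to/your/image.jpg"):
    """Load the checkpoint and transcribe one image (downloads the model)."""
    # Imports are local so that recognize_text can be reused or tested
    # without loading the heavy dependencies
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    image = Image.open(image_path).convert("RGB")
    print(recognize_text(image, processor, model))
```

Calling `run_example("path/to/your/image.jpg")` transcribes one image. TrOCR models are typically applied to a single line of text at a time, so a full page would normally be split into line crops before each crop is passed to `recognize_text`.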
## Limitations and Biases

Performance depends on factors such as image quality, font type, and text layout, so evaluate the model on your own use case and data before relying on it. Like any machine learning model, it may also carry biases inherited from its training data.

## Further Information

For more details about the TrOCR architecture and the base model, please refer to the original model card: [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed)