---
language: pt
license: mit
tags:
- ocr
- optical-character-recognition
- text-recognition
- trocr
- vision
- vision-and-language
datasets:
- mazafard/portugues_ocr_dataset_full
model-index:
- name: trocr-finetuned-portugues
  results:
  - task:
      type: optical-character-recognition
      name: Optical Character Recognition
    dataset:
      type: mazafard/portugues_ocr_dataset_full
      name: portugues_ocr_dataset_full
      args: default
    metrics:
    - type: cer
      value: 0.01
      name: Character Error Rate
    - type: wer
      value: 0.05
      name: Word Error Rate
base_model:
- microsoft/trocr-base-printed
new_version: mazafard/trocr-finetuned_20250422_125947
---

# TrOCR Fine-tuned for Portuguese

This model is a fine-tuned version of the [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed) model for Optical Character Recognition (OCR) in Portuguese. It has been trained on the [mazafard/portugues_ocr_dataset_full](https://huggingface.co/datasets/mazafard/portugues_ocr_dataset_full) dataset, which contains images of Portuguese text and their corresponding transcriptions.

## Model Description

- **Architecture:** TrOCR (Transformer-based Optical Character Recognition)
- **Base Model:** microsoft/trocr-base-printed
- **Training Data:** mazafard/portugues_ocr_dataset_full
- **Language:** Portuguese (pt)

## Intended Uses & Limitations

This model is intended for OCR tasks on printed Portuguese text. It may not perform well on handwritten text or text in other languages. While the model has been fine-tuned and shows promising results, it's important to be aware that OCR models can still make errors, especially on complex or low-quality images.

## Training and Evaluation Data

The model was trained on the `mazafard/portugues_ocr_dataset_full` dataset. This dataset includes images and labels specific to Portuguese. The images were preprocessed and augmented to enhance the model's performance and generalization capabilities.

The model was evaluated on a held-out portion of this same dataset, achieving the following results:

- **Character Error Rate (CER):** 0.01
- **Word Error Rate (WER):** 0.05 (provisional; this value may vary and needs to be updated)

## How to Use

```python
from transformers import VisionEncoderDecoderModel, TrOCRProcessor, pipeline

# Load the model and processor
processor = TrOCRProcessor.from_pretrained("mazafard/trocr-finetuned-portugues")
model = VisionEncoderDecoderModel.from_pretrained("mazafard/trocr-finetuned-portugues")

# Create an OCR pipeline from the processor's tokenizer and image processor
ocr_pipeline = pipeline("image-to-text", model=model,
                        tokenizer=processor.tokenizer,
                        image_processor=processor.image_processor)

# Perform OCR on an image
image_path = "path/to/your/image.jpg"
predicted_text = ocr_pipeline(image_path)
print(predicted_text)  # a list of dicts, each with a "generated_text" key
```

## Limitations and Biases

The model's performance may be affected by factors such as image quality, font type, and text layout, so evaluate it on your own use case and data before relying on it. Like any machine learning model, it may also have inherited biases from its training data.

## Further Information

For more details about the TrOCR architecture and the base model, please refer to the original model card: [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed)
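
When evaluating the model on your own data, it can help to see how the two metrics reported in this card, CER and WER, are commonly defined: the Levenshtein (edit) distance between the predicted and reference text, divided by the reference length, measured in characters for CER and in whitespace-separated words for WER. The sketch below is a minimal pure-Python illustration of that common definition. The function names `edit_distance`, `cer`, and `wer` and the sample sentences are hypothetical, and this card does not state which tool, text normalization, or averaging scheme produced its reported figures, so results from this sketch may not match them exactly.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical example: the prediction drops one accent.
reference = "O gato está no telhado"
prediction = "O gato esta no telhado"
print(f"CER: {cer(reference, prediction):.4f}")
print(f"WER: {wer(reference, prediction):.4f}")
```

A single wrong character counts as one edit for CER but makes the whole word wrong for WER, which is why WER is typically the larger of the two, as in the figures reported in this card.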