bertimbau-large-lener_br-onnx

ONNX conversion of Luciano/bertimbau-large-lener_br, a Brazilian legal named entity recognition model (LeNER-Br, 6 classes) built on BERTimbau-large, for use with Transformers.js (v3+) and ONNX Runtime.

Files

file                        dtype                   size
onnx/model.onnx             fp32                    1.33 GB
onnx/model_quantized.onnx   int8 (QUInt8 dynamic)   335 MB
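
The card does not say how the quantized file was produced. As a rough sketch, ONNX Runtime's dynamic quantization with QUInt8 weights can generate a file of this kind from the fp32 export. The function name quantize_to_int8 and its default paths are illustrative assumptions, not the author's script.

```python
def quantize_to_int8(src="onnx/model.onnx", dst="onnx/model_quantized.onnx"):
    """Write a dynamically quantized copy of `src` to `dst`.

    Hypothetical helper: the paths mirror the file names listed above,
    but the exact settings used for this repository are not documented.
    """
    # Imported lazily so this module loads even without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=src,
        model_output=dst,
        weight_type=QuantType.QUInt8,  # unsigned 8-bit weights
    )
```

Dynamic quantization needs no calibration data, because activation ranges are computed at run time. That makes it a common choice for BERT-style encoders.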

Verified parity (vs PyTorch reference, 10 PT-BR legal sentences)

metric                                                       fp32             q8
entity parity (Optimum/ORT, aggregation_strategy="simple")   26/26 (100%)     25/26 (96.2%)
token-level argmax parity (PyTorch↔ONNX)                     196/196 (100%)   193/196 (98.5%)
max |Δlogit| vs PyTorch                                      2.1e-5
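
The measurement script is not part of this card. As a minimal sketch, the two token-level metrics can be computed from per-token logits of the reference and ONNX models as below. The function name parity_metrics and the nested-list input format are assumptions made for illustration. The caller is expected to have removed special and padding tokens already.

```python
def parity_metrics(ref_logits, test_logits):
    """Compare two sets of per-token logits of identical shape.

    Both arguments are lists with one entry per token, each entry being
    the list of class logits for that token.
    Returns (matching_tokens, total_tokens, max_abs_logit_difference).
    """
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)

    pairs = list(zip(ref_logits, test_logits))
    # Token-level argmax parity: how often both models pick the same label.
    matches = sum(argmax(ref) == argmax(test) for ref, test in pairs)
    # Largest absolute difference over every individual logit.
    max_delta = max(
        abs(a - b) for ref, test in pairs for a, b in zip(ref, test)
    )
    return matches, len(pairs), max_delta
```

With real models, the inputs would be the logits tensors of the PyTorch and ONNX runs on the same tokenized sentences, converted to nested lists.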

Usage

Python (Optimum / ONNX Runtime) — recommended

from optimum.onnxruntime import ORTModelForTokenClassification
from transformers import AutoTokenizer, pipeline

model = ORTModelForTokenClassification.from_pretrained(
    "rchuluc/bertimbau-large-lener_br-onnx",
    subfolder="onnx",
    file_name="model_quantized.onnx",  # or "model.onnx"
)
tok = AutoTokenizer.from_pretrained("rchuluc/bertimbau-large-lener_br-onnx")

pipe = pipeline("ner", model=model, tokenizer=tok, aggregation_strategy="simple")
print(pipe("Conforme o art. 5º da Constituição Federal, todos são iguais perante a lei."))

Transformers.js

import { pipeline } from '@huggingface/transformers';

const ner = await pipeline(
  'token-classification',
  'rchuluc/bertimbau-large-lener_br-onnx',
  { dtype: 'q8' }, // or 'fp32'
);
const out = await ner('O Supremo Tribunal Federal julgou a ação em Brasília.', {
  ignore_labels: ['O'],
});
// The output is token-level: merge B-/I- tags and WordPiece continuations (##) into entity spans yourself.

⚠️ For legal text, prefer the Python/Optimum path. Transformers.js v3's TokenClassificationPipeline drops [UNK] tokens and has no aggregation_strategy option. Legal entities often contain out-of-vocabulary ordinals and symbols that tokenize to [UNK], so spans such as "art. 5º da Constituição Federal" lose pieces in JS. The ONNX model itself labels those tokens correctly (see the token-level parity above); the loss happens in the JS pipeline, not in the model.
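
The manual aggregation needed for token-level output can be sketched as follows. It is written in Python to match the recommended path, and the logic ports directly to JavaScript. The function name aggregate_entities and the token keys (entity, word, score, index) are assumptions about the token-level output, so adjust them to what your pipeline actually returns. This is a simplified merge, not a reimplementation of aggregation_strategy="simple".

```python
from statistics import mean


def aggregate_entities(tokens):
    """Merge token-level BIO predictions into entity spans.

    Each token is a dict with "entity" (e.g. "B-LOCAL"), "word",
    "score" and "index" (position in the tokenized input).
    """
    spans = []
    current = None      # span currently being extended
    prev_index = None   # index of the previously seen token

    for tok in tokens:
        label = tok["entity"]
        if label == "O":
            current, prev_index = None, tok["index"]
            continue

        prefix, _, kind = label.partition("-")
        kind = kind or label
        # A gap in "index" means tokens were skipped (ignored or dropped),
        # so the running span must not be extended across it.
        adjacent = prev_index is not None and tok["index"] == prev_index + 1
        is_piece = tok["word"].startswith("##")
        word = tok["word"][2:] if is_piece else tok["word"]

        if current is not None and adjacent and is_piece:
            current["word"] += word            # glue WordPiece continuation
            current["scores"].append(tok["score"])
        elif (current is not None and adjacent and prefix == "I"
              and kind == current["entity_group"]):
            current["word"] += " " + word      # next word of the same entity
            current["scores"].append(tok["score"])
        else:
            current = {"entity_group": kind, "word": word,
                       "scores": [tok["score"]]}
            spans.append(current)
        prev_index = tok["index"]

    return [
        {"entity_group": s["entity_group"], "word": s["word"],
         "score": mean(s["scores"])}
        for s in spans
    ]
```

Words are joined with a single space, so punctuation inside an entity comes out space-separated. Because a gap in index ends the current span, an entity whose [UNK] token was dropped upstream comes back as two fragments rather than one silently shortened span.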

Classes (LeNER-Br, 6 types)

ORGANIZACAO, PESSOA, TEMPO, LOCAL, LEGISLACAO, JURISPRUDENCIA

Attribution

Not affiliated with the original authors. Cite the original work in any publication.
