bertimbau-large-inferbr-nli-onnx Model Card

Overview

This is an ONNX conversion of felipesfpaula/bertimbau-large-InferBr-NLI for use with Transformers.js (v3+) and ONNX Runtime.

The encoder weights are unchanged. The config.json is patched so the id2label map uses the canonical NLI names (contradiction / entailment / neutral), which the zero-shot-classification pipeline in Transformers.js requires. The meaning of each class index is preserved from the upstream checkpoint.

Conversion pipeline: rchuluc/bertimbau-large-inferbr-nli-onnx on GitHub (scripts, parity check, reproduction steps).

Attribution

Original model: felipesfpaula/bertimbau-large-InferBr-NLI — BERTimbau-large fine-tuned on the InferBR Brazilian-Portuguese NLI dataset (3 classes). Reports 93.95% accuracy / 0.7596 macro-F1 on the InferBR test set.
Base model: neuralmind/bert-large-portuguese-cased (BERTimbau-large).
Training dataset: InferBR (Premise–Hypothesis pairs in PT-BR, splits: train / validation / test).
License: MIT (inherited from upstream).

Not affiliated with the original authors. Please cite the original work in any publication.

Classes (NLI, 3 categories)

contradiction, entailment, neutral

Index mapping is preserved from the upstream checkpoint:

index	upstream label	this repo label
0	`LABEL_0` (Contradiction)	`contradiction`
1	`LABEL_1` (Entailment)	`entailment`
2	`LABEL_2` (Neutral)	`neutral`
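
The relabeling in the table above amounts to a small edit of config.json. The following is a minimal sketch, assuming the config follows the usual Hugging Face layout with `id2label` and `label2id` keys. The function name `patch_labels` and the `upstream` fragment are hypothetical illustrations, not the conversion script used for this repo.

```python
import json

# Canonical NLI names in the upstream index order (0, 1, 2) from the table above.
CANONICAL = ["contradiction", "entailment", "neutral"]


def patch_labels(config: dict) -> dict:
    """Return a copy of `config` whose label maps use the canonical NLI names."""
    patched = dict(config)  # shallow copy; the label maps are replaced, not mutated
    patched["id2label"] = {str(i): name for i, name in enumerate(CANONICAL)}
    patched["label2id"] = {name: i for i, name in enumerate(CANONICAL)}
    return patched


# Hypothetical fragment of an upstream config.json with generic labels.
upstream = {
    "id2label": {"0": "LABEL_0", "1": "LABEL_1", "2": "LABEL_2"},
    "label2id": {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2},
}

patched = patch_labels(upstream)
print(json.dumps(patched, indent=2))
```

Only the names change. Index 1 still points at the same output logit as upstream `LABEL_1`, so the class semantics are untouched.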

Files

config.json                 # id2label patched: contradiction / entailment / neutral
tokenizer.json
tokenizer_config.json
special_tokens_map.json
vocab.txt
onnx/
  model.onnx                # fp32 — 1.34 GB
  model_quantized.onnx      # dynamic int8 (QUInt8) — 336 MB

Usage — Transformers.js (JavaScript/TypeScript)

Zero-shot classification (recommended)

The zero-shot-classification pipeline handles the NLI premise/hypothesis pair internally via the hypothesis_template.

import { pipeline } from '@huggingface/transformers';

const clf = await pipeline(
  'zero-shot-classification',
  'rchuluc/bertimbau-large-inferbr-nli-onnx',
  { dtype: 'q8' }, // or 'fp32'
);

const result = await clf(
  'Os bailes funks nas favelas paulistanas são fundamentais para a juventude local.',
  [
    'anglicismo',
    'neologismo',
    'gíria de nicho',
    'termo técnico apropriado',
    'marcador identitário',
    'estrangeirismo não-inglês',
  ],
  { hypothesis_template: 'Este texto é sobre {}.' },
);

console.log(result);
// { sequence: '...', labels: [...], scores: [...] }  — sorted desc
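
To make the scores above easier to interpret, here is a rough Python sketch of how a zero-shot pipeline typically turns NLI logits into label scores in single-label mode: each candidate label is slotted into the hypothesis template, the entailment logit of each premise/hypothesis pair is collected, and a softmax is taken across candidates. The logit values are invented for illustration, and the helper names are hypothetical. This is not the Transformers.js source.

```python
import math

ENTAILMENT = 1  # index of `entailment` in this repo's id2label


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def build_hypotheses(labels, template="Este texto é sobre {}."):
    """Fill the hypothesis template once per candidate label."""
    return [template.replace("{}", label) for label in labels]


def zero_shot_scores(labels, logits_per_label):
    """Rank labels by softmax over their entailment logits.

    logits_per_label[i] is the 3-way NLI logit vector obtained for labels[i].
    """
    entail = [logits[ENTAILMENT] for logits in logits_per_label]
    scores = softmax(entail)
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


labels = ["gíria de nicho", "termo técnico apropriado"]
# Invented logits in (contradiction, entailment, neutral) order, one row per label.
fake_logits = [[-1.0, 2.0, 0.5], [1.5, -0.5, 0.2]]

print(build_hypotheses(labels))
ranked = zero_shot_scores(labels, fake_logits)
print(ranked)
```

Because the softmax runs across candidates, the scores sum to 1 and are only meaningful relative to the other labels supplied in the same call.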

Direct NLI (premise + hypothesis)

⚠️ pipeline('text-classification') in Transformers.js v3 does not implement text_pair — to run NLI directly, drop to the tokenizer + model API:

import {
  AutoTokenizer,
  AutoModelForSequenceClassification,
  softmax,
} from '@huggingface/transformers';

const tokenizer = await AutoTokenizer.from_pretrained(
  'rchuluc/bertimbau-large-inferbr-nli-onnx',
);
const model = await AutoModelForSequenceClassification.from_pretrained(
  'rchuluc/bertimbau-large-inferbr-nli-onnx',
  { dtype: 'q8' },
);

const enc = await tokenizer(
  'O baile da DZ7 reunia milhares de pessoas em Paraisópolis.',
  {
    text_pair: 'Existia um baile funk grande em Paraisópolis.',
    truncation: true,
    padding: 'max_length',
    max_length: 128,
    return_tensor: true, // Transformers.js option name (default); 'pt' is a Python-only value
  },
);
const out = await model(enc);
const probs = softmax(Array.from(out.logits.data));
const id2label = model.config.id2label;
// probs[0] = contradiction, probs[1] = entailment, probs[2] = neutral
console.log(probs.map((score, i) => ({ label: id2label[i], score })));

Usage — Python (Optimum / ONNX Runtime)

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
import torch

model = ORTModelForSequenceClassification.from_pretrained(
    "rchuluc/bertimbau-large-inferbr-nli-onnx",
    file_name="onnx/model_quantized.onnx",  # or "onnx/model.onnx"
)
tok = AutoTokenizer.from_pretrained("rchuluc/bertimbau-large-inferbr-nli-onnx")

inputs = tok(
    "O baile da DZ7 reunia milhares de pessoas em Paraisópolis.",
    "Existia um baile funk grande em Paraisópolis.",
    return_tensors="pt", truncation=True, padding="max_length", max_length=128,
)
logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]
print({model.config.id2label[i]: float(probs[i]) for i in range(3)})

Verified Parity

Measured on 12 PT-BR NLI control pairs (cultural / civic domain) against a PyTorch fp32 BertForSequenceClassification reference:

dtype	parity vs PyTorch fp32	accuracy vs ground truth	latency (onnxruntime-node, CPU)	size
fp32	(not benchmarked)	(not benchmarked)	(not benchmarked)	1.34 GB
q8	12/12 (100%)	12/12 (100%)	48 ms/pair	336 MB

Maximum absolute logit divergence reported by Optimum during export: 0.00011 (well within practical tolerance).
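
The two figures used in this section (label agreement and maximum absolute logit difference) can be computed with a few lines of Python. This is a minimal sketch under stated assumptions: `parity_report` is a hypothetical helper, and the logits are invented values, not the 12 control pairs or the repo's parity script.

```python
def argmax(xs):
    """Index of the largest value in a list."""
    return max(range(len(xs)), key=lambda i: xs[i])


def parity_report(reference, candidate):
    """Compare two lists of per-pair logit vectors.

    Returns how many pairs get the same predicted class and the largest
    absolute element-wise logit difference.
    """
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must cover the same pairs")
    agreement = sum(argmax(r) == argmax(c) for r, c in zip(reference, candidate))
    max_abs_diff = max(
        abs(a - b) for r, c in zip(reference, candidate) for a, b in zip(r, c)
    )
    return {
        "agreement": agreement,
        "total": len(reference),
        "max_abs_diff": max_abs_diff,
    }


# Invented logits for two pairs: reference (e.g. PyTorch fp32) vs candidate (e.g. ONNX q8).
reference_logits = [[2.0, -1.0, 0.1], [-0.5, 3.0, 0.2]]
candidate_logits = [[1.9, -0.8, 0.1], [-0.4, 2.7, 0.3]]

report = parity_report(reference_logits, candidate_logits)
print(report)
```

Label agreement is the more forgiving of the two metrics: a quantized model can drift noticeably in raw logits and still predict the same class on every pair.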

Technical Notes

id2label patch. Upstream ships labels as LABEL_0/LABEL_1/LABEL_2. The Transformers.js zero-shot-classification pipeline needs to find the entailment class by name (matched case-insensitively), so this repo patches config.json to use contradiction/entailment/neutral, keeping the index order documented by the upstream model card.
No CRF / no aggregation needed. Unlike the BERTimbau NER variant, the NLI head is a plain sequence-classification head (one label per input pair), so no BIO/WordPiece post-processing is required.
text_pair on the JS pipeline. Transformers.js v3 pipeline('text-classification') ignores text_pair. For direct NLI, use AutoTokenizer + AutoModelForSequenceClassification. The zero-shot-classification pipeline is unaffected because it builds the pair internally from the hypothesis_template.
Quantization. onnxruntime.quantization.quantize_dynamic with weight_type=QUInt8. No static calibration was used; q8 matched the PyTorch fp32 predictions on all 12 pairs of the control set.
Opset. ONNX opset 14 (compatible with onnxruntime-node ≥ 1.21 / @huggingface/transformers ≥ 3.8).
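
For readers who want to reproduce the quantization note above, the call looks roughly as follows. This is a sketch, assuming `onnxruntime` is installed and that the fp32 export lives at `onnx/model.onnx` as in the Files section. The wrapper name `quantize_to_int8` is hypothetical, and any other options the actual conversion script may pass are not shown.

```python
from pathlib import Path


def quantize_to_int8(
    src: str = "onnx/model.onnx",
    dst: str = "onnx/model_quantized.onnx",
) -> Path:
    """Dynamically quantize the fp32 ONNX weights to unsigned int8."""
    # Imported lazily so this module can be loaded without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    # Dynamic quantization: weights are converted ahead of time, activations are
    # quantized on the fly at inference, so no calibration dataset is required.
    quantize_dynamic(
        model_input=src,
        model_output=dst,
        weight_type=QuantType.QUInt8,
    )
    return Path(dst)


# Usage (run from the repository root, needs the fp32 file on disk):
#   quantize_to_int8()
```

Dynamic quantization was a reasonable fit here because it needs no representative input data, which is consistent with the "no static calibration" note above.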

Citation

@misc{felipesfpaula2025bertimbau-inferbr-nli,
  author = {Paula, Felipe S. F.},
  title  = {bertimbau-large-InferBr-NLI},
  year   = {2025},
  url    = {https://huggingface.co/felipesfpaula/bertimbau-large-InferBr-NLI}
}

@inproceedings{souza2020bertimbau,
  author    = {Fábio Souza and Rodrigo Nogueira and Roberto Lotufo},
  title     = {{BERT}imbau: Pretrained {BERT} Models for {B}razilian {P}ortuguese},
  booktitle = {9th Brazilian Conference on Intelligent Systems (BRACIS)},
  year      = {2020}
}
