---
license: apache-2.0
language:
- en
- ru
- es
- de
- fr
---
# HRM-Grammar-Light ONNX (int8)

Lightweight multilingual grammar correction model, exported to ONNX and quantized to int8 for efficient inference.

## What does this model do?
It corrects text in multiple languages. Given a prompt like:

```
corregir español: el casa es grande
```

it generates the corrected version:

```
La casa es grande.
```
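
The prompt above follows a fixed pattern: the word `corregir`, a language name, a colon, and then the text to fix. A minimal sketch of a helper that builds such prompts is shown here. `build_prompt` is a hypothetical name, not part of the model's tooling, and only the `español` and `ingles` labels appear in this README, so any other language label is an assumption to check against the original model card.

```python
def build_prompt(language: str, text: str) -> str:
    """Format a correction request as 'corregir <language>: <text>'."""
    # Strip stray whitespace so the prompt keeps a single, predictable shape.
    return f"corregir {language.strip()}: {text.strip()}"


prompt = build_prompt("español", " el casa es grande ")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.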

## Included files
- `hrm_grammar_light.onnx`: standard precision (fp32) ONNX export (ACT 4 or ACT 5)
- `hrm_grammar_light_int8.onnx`: int8 quantized ONNX (smaller, faster)

## How to use (Python + ONNX Runtime)
```python
import onnxruntime as ort
import numpy as np

# Load model
session = ort.InferenceSession("hrm_grammar_light_int8.onnx")

# Prepare your input sequence (tokenized, see below)
input_ids = np.array([[...]], dtype=np.int64)  # shape (1, seq_len)
attention_mask = np.ones_like(input_ids)

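# Run inference. ONNX Runtime does not accept None as an input value, so
# leave out `labels` and `language_ids` rather than passing None, and feed
# only the inputs that this export actually declares.
feeds = {"input_ids": input_ids, "attention_mask": attention_mask}
declared = {inp.name for inp in session.get_inputs()}
feeds = {name: value for name, value in feeds.items() if name in declared}
outputs = session.run(["logits"], feeds)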
logits = outputs[0]  # (1, seq_len, vocab_size)
```

## Tokenization
This model uses the `t5-small` vocabulary (Hugging Face Transformers). To tokenize:
```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
prompt = "corregir español: el casa es grande"
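encoded = tokenizer(prompt, return_tensors="np", padding="max_length", max_length=256, truncation=True)
input_ids = encoded["input_ids"].astype("int64")  # match the int64 dtype used for inference
attention_mask = encoded["attention_mask"].astype("int64")  # 0 marks padded positions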
```

## Notes
- The model expects prompts of the form `corregir <language>: <text>`, for example `corregir ingles: ...` for English.
- Supports Spanish, English, French, German, Russian, and more (see original README).
- The output is logits; to decode, take the argmax over the vocabulary axis and pass the resulting token IDs to `tokenizer.decode`.
- For maximum speed, use the int8 model.
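
The decoding note above can be sketched as follows. This is a minimal illustration, not the author's reference implementation: `greedy_token_ids` is a hypothetical helper, the tiny `logits` array stands in for real model output, and dropping padded positions with the input attention mask assumes the logits are aligned position by position with the input.

```python
import numpy as np


def greedy_token_ids(logits, attention_mask=None):
    """Pick the highest-scoring token id at each position of the first sequence."""
    ids = np.argmax(logits, axis=-1)[0]  # shape (seq_len,)
    if attention_mask is not None:
        # Assumption: output positions line up with input positions,
        # so padded input positions can be dropped from the output.
        ids = ids[np.asarray(attention_mask)[0].astype(bool)]
    return ids.tolist()


# Stand-in for real model output: batch=1, seq_len=3, vocab_size=4
logits = np.array([[[0.1, 2.0, 0.3, 0.0],
                    [1.5, 0.2, 0.1, 0.0],
                    [0.0, 0.1, 0.2, 3.0]]])
mask = np.array([[1, 1, 0]])  # last position is padding

token_ids = greedy_token_ids(logits, mask)
print(token_ids)
```

With real outputs, the returned list can be turned into text with `tokenizer.decode(token_ids, skip_special_tokens=True)`.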

## License
Apache-2.0. Original model: https://huggingface.co/dreuxx26/HRM-Grammar-Light