---
license: apache-2.0
language:
- en
- ru
- es
- de
- fr
---

# HRM-Grammar-Light ONNX (int8)

Lightweight multilingual grammar correction model, exported to ONNX and quantized to int8 for efficient inference.

## What does this model do?

It corrects grammar in text written in several languages. Given a prompt like:

```
corregir español: el casa es grande
```

it generates the corrected version:

```
La casa es grande.
```

## Included files

- `hrm_grammar_light.onnx`: standard-precision (fp32) ONNX export (ACT 4 or ACT 5)
- `hrm_grammar_light_int8.onnx`: int8-quantized ONNX export (smaller, faster)

## How to use (Python + ONNX Runtime)

```python
import onnxruntime as ort
import numpy as np

# Load the model
session = ort.InferenceSession("hrm_grammar_light_int8.onnx")

# Prepare your input sequence (tokenized, see below)
input_ids = np.array([[...]], dtype=np.int64)  # shape (1, seq_len)
# If the input is padded, use the tokenizer's attention mask instead of all ones
attention_mask = np.ones_like(input_ids)

# Run inference.
# `labels` and `language_ids` are left unset (None); if your ONNX Runtime
# version rejects None values in the feed, drop those two keys.
outputs = session.run(["logits"], {
    "input_ids": input_ids,
    "labels": None,
    "attention_mask": attention_mask,
    "language_ids": None
})
logits = outputs[0]  # (1, seq_len, vocab_size)
```

## Tokenization

This model uses the `t5-small` vocabulary (Hugging Face Transformers). To tokenize:

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
prompt = "corregir español: el casa es grande"

# Pad/truncate to 256 tokens and return NumPy arrays
encoded = tokenizer(prompt, return_tensors="np", padding="max_length", max_length=256, truncation=True)
input_ids = encoded["input_ids"]
attention_mask = encoded["attention_mask"]  # 1 for real tokens, 0 for padding
```

## Notes

- The model expects prompts of the form `corregir <language>: <text>`, for example `corregir ingles: .....`
- Supports Spanish, English, French, German, Russian, and more (see original README).
- The output is logits; to decode, apply argmax over the vocabulary axis and then use `tokenizer.decode`.
- For maximum speed, use the int8 model.

## License

Apache-2.0. See the original model: https://huggingface.co/dreuxx26/HRM-Grammar-Light
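
## Decoding sketch

The Notes say the output is logits and that decoding means applying argmax and then calling `tokenizer.decode`. Below is a minimal sketch of the argmax step using only NumPy. The helper name `greedy_token_ids` and the toy logits are hypothetical and only for illustration; `pad_token_id=0` assumes the `t5-small` vocabulary, where the padding token has id 0. Whether a single argmax pass reproduces the original model's intended decoding is an assumption taken from the Notes, not something verified here.

```python
import numpy as np

def greedy_token_ids(logits, attention_mask=None, pad_token_id=0):
    """Pick the highest-scoring token at each position (hypothetical helper).

    logits: array of shape (batch, seq_len, vocab_size)
    attention_mask: optional array of shape (batch, seq_len); positions with 0
        are replaced by pad_token_id so padding is not decoded as text.
    """
    logits = np.asarray(logits)
    token_ids = logits.argmax(axis=-1)  # (batch, seq_len)
    if attention_mask is not None:
        keep = np.asarray(attention_mask).astype(bool)
        token_ids = np.where(keep, token_ids, pad_token_id)
    return token_ids

# Toy example: batch of 1, three positions, vocabulary of three tokens
toy_logits = np.array([[[0.1, 2.0, 0.3],
                        [0.2, 0.1, 1.5],
                        [0.0, 0.1, 3.0]]])
toy_mask = np.array([[1, 1, 0]])  # last position is padding

print(greedy_token_ids(toy_logits, toy_mask))  # [[1 2 0]]
```

With real model outputs, something like `tokenizer.decode(greedy_token_ids(logits, attention_mask)[0], skip_special_tokens=True)` should turn the ids back into text, where `skip_special_tokens=True` drops the padding tokens.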