How to use from the MLX library
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"

huggingface-cli download --local-dir openmed-persian-pii-google-mbert-mlx-int4 Reza2kn/openmed-persian-pii-google-mbert-mlx-int4

OpenMed Persian PII Google mBERT MLX INT4

Verified 4-bit MLX INT4 export of google-bert/bert-base-multilingual-cased fine-tuned for Persian/Iranian PII token classification.

Metrics

Dense held-out test F1: 0.9708

Runtime held-out slice eval (test, first 2,000 rows, max_length=256):

{
  "model": "artifacts/google-mbert-pii-4bit/mlx-custom",
  "dataset": "data/final_splits_audited/combined_clean",
  "split": "test",
  "rows": 2000,
  "max_length": 256,
  "batch_size": 16,
  "precision": 0.9730430274753759,
  "recall": 0.976142494961971,
  "f1": 0.9745902969333118,
  "accuracy": 0.9946134593879415
}
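As a quick sanity check on the report above, the F1 value should be the harmonic mean of the reported precision and recall. The sketch below recomputes it from the two published numbers; the helper name `f1_from_pr` is illustrative and not part of the source project.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the runtime held-out slice report.
precision = 0.9730430274753759
recall = 0.976142494961971
reported_f1 = 0.9745902969333118

recomputed = f1_from_pr(precision, recall)
print(f"recomputed F1 = {recomputed:.6f}, reported F1 = {reported_f1:.6f}")
```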

Fixture/runtime verification:

{
  "status": "converted_mlx_int4",
  "weights": "artifacts/google-mbert-pii-4bit/mlx-custom/weights.safetensors",
  "bits": 4,
  "group_size": 64,
  "max_length": 256,
  "verification": {
    "name": "mlx_int4",
    "shape": [
      2,
      256,
      39
    ],
    "argmax_match_rate_vs_unquantized_mlx": 0.966796875,
    "max_abs_diff_vs_unquantized_mlx": 9.43897533416748,
    "mean_abs_diff_vs_unquantized_mlx": 0.10117268562316895
  }
}
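The three verification figures compare logits from the INT4 model against the unquantized MLX model on the same fixture batch. A minimal sketch of how such figures can be computed is shown below, in plain Python over nested lists shaped `[batch][sequence][labels]`. The function name `compare_logits` is hypothetical, and the source project's actual verification script may differ (for example in how padding positions are treated). Assuming the rate is taken over all 2 × 256 = 512 positions, 0.966796875 corresponds to 495 matching positions and 17 disagreements.

```python
from typing import Dict, List

Logits = List[List[List[float]]]  # [batch][sequence][labels]


def compare_logits(reference: Logits, quantized: Logits) -> Dict[str, float]:
    """Compare per-token logits from two models on the same inputs."""
    positions = matches = values = 0
    max_abs = total_abs = 0.0
    for ref_seq, q_seq in zip(reference, quantized):
        for ref_tok, q_tok in zip(ref_seq, q_seq):
            positions += 1
            # Predicted label = index of the largest logit at this position.
            if ref_tok.index(max(ref_tok)) == q_tok.index(max(q_tok)):
                matches += 1
            for a, b in zip(ref_tok, q_tok):
                diff = abs(a - b)
                max_abs = max(max_abs, diff)
                total_abs += diff
                values += 1
    return {
        "argmax_match_rate": matches / positions,
        "max_abs_diff": max_abs,
        "mean_abs_diff": total_abs / values,
    }


# Toy example: one sequence, two positions, two labels.
reference = [[[0.0, 2.0], [1.0, 0.0]]]
quantized = [[[0.5, 1.5], [0.0, 1.0]]]
print(compare_logits(reference, quantized))
```

A large maximum difference alongside a small mean difference, as in the report above, indicates that most logits barely move while a few shift substantially; the argmax match rate shows how often those shifts change the predicted label.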

Runtime Contract

Use this model behind the same production wrapper as the ONNX/CoreML releases:

  1. sliding-window inference, usually max_length=256 and stride around 96;
  2. offset-based span reconstruction;
  3. whitespace trimming and overlap de-duplication;
  4. deterministic regex/rule assists for email, phone, national ID, postal code, date, card number, and IMEI exclusion;
  5. cue-word correction around Persian labels such as کد ملی, شماره تماس, کدپستی, and ایمیل.
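Steps 1 to 3 can be sketched as follows. This is an illustration under stated assumptions, not the wrapper from the source project: it assumes "stride" means the number of tokens shared by consecutive windows (as with the `stride` argument in Hugging Face tokenizers), that labels follow a BIO scheme, and that special tokens carry empty `(0, 0)` offsets. The function names and the `EMAIL` label are hypothetical.

```python
from typing import Iterator, List, Tuple

Span = Tuple[int, int, str]  # (char_start, char_end, label)


def window_ranges(n_tokens: int, max_length: int = 256,
                  stride: int = 96) -> Iterator[Tuple[int, int]]:
    """Yield [start, end) token ranges; neighbours share `stride` tokens."""
    step = max_length - stride
    if step <= 0:
        raise ValueError("stride must be smaller than max_length")
    start = 0
    while True:
        end = min(start + max_length, n_tokens)
        yield start, end
        if end >= n_tokens:
            break
        start += step


def spans_from_tokens(text: str, offsets: List[Tuple[int, int]],
                      labels: List[str]) -> List[Span]:
    """Merge BIO-tagged tokens into character spans and trim whitespace."""
    raw: List[List] = []
    current = None
    for (start, end), tag in zip(offsets, labels):
        prefix, _, label = tag.partition("-")
        if tag == "O" or start == end:  # outside tag or special token
            if current:
                raw.append(current)
            current = None
        elif prefix == "I" and current and current[2] == label:
            current[1] = end  # extend the open span
        else:
            if current:
                raw.append(current)
            current = [start, end, label]
    if current:
        raw.append(current)
    spans: List[Span] = []
    for start, end, label in raw:
        while start < end and text[start].isspace():
            start += 1
        while end > start and text[end - 1].isspace():
            end -= 1
        if start < end:
            spans.append((start, end, label))
    return spans


def dedupe(spans: List[Span]) -> List[Span]:
    """Drop duplicates and merge a span into the previous one when both
    carry the same label and overlap (as happens across windows)."""
    merged: List[Span] = []
    for start, end, label in sorted(set(spans)):
        if merged and merged[-1][2] == label and start < merged[-1][1]:
            prev_start, prev_end, _ = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), label)
        else:
            merged.append((start, end, label))
    return merged


text = "email: ali@example.com ok"
offsets = [(0, 5), (5, 6), (7, 10), (10, 11), (11, 18), (18, 19), (19, 23), (23, 25)]
labels = ["O", "O", "B-EMAIL", "I-EMAIL", "I-EMAIL", "I-EMAIL", "I-EMAIL", "O"]
print(list(window_ranges(500)))
print(spans_from_tokens(text, offsets, labels))
```

Because the spans are expressed in character offsets of the original text, predictions from different windows can be pooled and passed through `dedupe` without any token-level alignment.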

"""Minimal MLX wrapper contract.

This repo includes a custom BERT token-classification MLX runtime script in the source project. Load weights.safetensors into the same module shape, tokenize with the bundled tokenizer, run sliding windows, then reconstruct spans from offsets and apply the same regex/rule postprocessing used by the ONNX/CoreML packages. """
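The regex/rule postprocessing itself is not reproduced in this card. To give a sense of what a deterministic assist can look like, here is a minimal sketch covering two of the listed categories: an email pattern and a check-digit validation for 10-digit national IDs, using the commonly described weighted-sum-modulo-11 rule. The patterns, the function names, and the `EMAIL` label are assumptions for illustration and may differ from the rules shipped with the ONNX/CoreML packages.

```python
import re
from typing import List, Tuple

# Map Persian (U+06F0..U+06F9) and Arabic-Indic (U+0660..U+0669) digits to ASCII.
_DIGIT_MAP = {0x06F0 + i: str(i) for i in range(10)}
_DIGIT_MAP.update({0x0660 + i: str(i) for i in range(10)})

# Deliberately simple email pattern; a production rule would likely be stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def normalize_digits(text: str) -> str:
    """Replace Persian and Arabic-Indic digits with ASCII digits."""
    return text.translate(_DIGIT_MAP)


def is_valid_national_id(code: str) -> bool:
    """Check the control digit of a 10-digit national ID candidate."""
    code = normalize_digits(code)
    if not re.fullmatch(r"[0-9]{10}", code) or len(set(code)) == 1:
        return False
    digits = [int(c) for c in code]
    # Weights run from 10 down to 2 over the first nine digits.
    remainder = sum(d * (10 - i) for i, d in enumerate(digits[:9])) % 11
    expected = remainder if remainder < 2 else 11 - remainder
    return digits[9] == expected


def find_emails(text: str) -> List[Tuple[int, int, str]]:
    """Return (char_start, char_end, label) spans for email-like strings."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL_RE.finditer(text)]


sample = "mail ali@example.com now"
print(find_emails(sample))
print(is_valid_national_id("0012345679"))
```

Rules like these can confirm or reject model-predicted spans, or add spans the model missed, since they return the same character-offset format as the span reconstruction step.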

Compact and cleaner on mixed Persian/Latin/email text.
