indira: LFM2-8B-A1B citation/voice LoRA (q4 ONNX, WebGPU)

A LoRA fine-tune of LiquidAI/LFM2-8B-A1B, exported to q4 ONNX for fully in-browser inference via transformers.js on WebGPU.

It is the optional "Best" generation tier of the indira project, a citation-grounded Q&A and discovery site over Indira Gandhi's primary-source corpus (parliamentary speeches, Government-of-India volumes). The model answers in a first-person, [N]-cited style grounded in retrieved passages, and declines when the sources don't support an answer.
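Because answers carry [N] markers that point back at retrieved passages, a caller may want to confirm that every marker refers to a passage that was actually supplied. The following is a minimal sketch of such a check, assuming passages are numbered from 1 in the order they were given; `checkCitations` is a hypothetical helper and the sample answer string is a made-up placeholder, not model output.

```javascript
// Hypothetical helper: collect the [N] markers in an answer and flag any
// that fall outside the range of supplied passages (assumed to be 1..passageCount).
function checkCitations(answer, passageCount) {
  const cited = new Set();
  // Match bracketed integers such as [1] or [12].
  for (const match of answer.matchAll(/\[(\d+)\]/g)) {
    cited.add(Number(match[1]));
  }
  const sorted = [...cited].sort((a, b) => a - b);
  const invalid = sorted.filter((n) => n < 1 || n > passageCount);
  return { cited: sorted, invalid };
}

// Placeholder answer text, for illustration only.
const sample = 'I said as much in the House [1], and repeated it later [3].';
console.log(checkCitations(sample, 2));
```

An answer with a non-empty `invalid` list cites something the retriever never provided, which the calling code could treat as a reason to discard or regenerate it.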

Verified in-browser on Apple Metal WebGPU: ~1.2 s to first token, ~48 tok/s.

Lineage

Base: LiquidAI/LFM2-8B-A1B (8.3B params, ~1B active MoE)
Training: LoRA via mlx_lm on mlx-community/LFM2-8B-A1B-4bit (Apple Silicon)
SFT teacher: Claude (synthetic cited Q&A; teacher ≫ trained model)
Export: fuse → remap (MLX→HF) → Liquid4All/onnx-export --q4 with symmetric QMoE
Format: q4 ONNX, ~5.3 GB (model_q4.onnx + 3 external-data chunks)

Note on the export: the MoE experts are quantized symmetrically (no zero-points). onnxruntime-web's WebGPU backend only runs symmetric QMoE; an asymmetric export loads but errors at the first QMoE node. transformers.js_config.use_external_data_format is set to {"model_q4.onnx": 3} so all three weight chunks are preloaded.
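Written out as a JavaScript object, the per-file setting named in the note might look like the sketch below. Only the one key from the note is reproduced; the rest of the model's configuration is omitted. `chunksFor` is a hypothetical helper for illustration, not part of transformers.js, and it handles only the per-file map form shown here.

```javascript
// Sketch of the relevant configuration fragment as a JS object.
// Only the key named in the note is shown; everything else is omitted.
const config = {
  'transformers.js_config': {
    use_external_data_format: { 'model_q4.onnx': 3 },
  },
};

// Hypothetical helper: number of external-data chunks declared for a file.
// Returns 0 when the file is not listed or the setting is absent.
function chunksFor(cfg, fileName) {
  const setting = cfg['transformers.js_config']?.use_external_data_format ?? {};
  const n = typeof setting === 'object' ? setting[fileName] : undefined;
  return Number.isInteger(n) ? n : 0;
}

console.log(chunksFor(config, 'model_q4.onnx'));
```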

Usage (transformers.js, WebGPU)

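import { pipeline } from '@huggingface/transformers'; // v4.2.0+
// Load the q4 ONNX weights and run them on the WebGPU backend.
const gen = await pipeline('text-generation',
  'naklitechie/indira-lfm2-8b-a1b-onnx', { dtype: 'q4', device: 'webgpu' });
// Chat-style input: a list of { role, content } messages.
const out = await gen(
  [{ role: 'user', content: 'What did Indira Gandhi say about national unity?' }],
  { max_new_tokens: 256 });
// With chat input, generated_text is the message list; the last entry is the model's reply.
console.log(out[0].generated_text.at(-1).content);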
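Since the snippet above requests `device: 'webgpu'`, it can help to check for WebGPU before starting a multi-gigabyte download. The following is a minimal sketch using the standard `navigator.gpu.requestAdapter()` call. It only confirms that some adapter is available; it does not reproduce whatever capability gating the app itself applies, which is not described here. `hasWebGPU` is a hypothetical name.

```javascript
// Returns true when a WebGPU adapter can be obtained.
// `nav` defaults to the global navigator; passing it in keeps the function testable.
async function hasWebGPU(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) return false; // WebGPU is not exposed in this environment
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    // Treat any failure while requesting an adapter as "not available".
    return false;
  }
}

hasWebGPU().then((ok) => console.log('WebGPU available:', ok));
```

A page could call this first and fall back to a lighter generation tier when it resolves to `false`.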

Requires hardware WebGPU (gated to capable GPUs in the app). Designed to run after retrieval: given relevant passages it produces a grounded, cited answer; with no supporting passages it declines.
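One way to hand retrieved passages to the model is to number them so the answer's [N] markers have something to point at. The sketch below illustrates that idea only: the prompt layout the model was actually trained on is not given here, so the "Passages:" / "Question:" framing, the `buildGroundedMessages` name, and the `{ text }` passage shape are all assumptions.

```javascript
// Hypothetical prompt layout; the real training format is not documented here.
// Each passage is assumed to be an object with a `text` field.
function buildGroundedMessages(question, passages) {
  // Number passages from 1 so they line up with [N] citation markers.
  const numbered = passages
    .map((p, i) => `[${i + 1}] ${p.text}`)
    .join('\n\n');
  const context = passages.length > 0 ? numbered : '(none)';
  const content = `Passages:\n${context}\n\nQuestion: ${question}`;
  return [{ role: 'user', content }];
}

// Placeholder passages, for illustration only.
const messages = buildGroundedMessages(
  'What did Indira Gandhi say about national unity?',
  [{ text: 'First retrieved passage.' }, { text: 'Second retrieved passage.' }],
);
console.log(messages[0].content);
```

The returned array has the same `{ role, content }` shape as the chat input in the usage snippet, so it could be passed to the generator in its place.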

Notes

  • Sibling models: naklitechie/nehru-…, naklitechie/gandhi-….
  • The base model's license (LFM Open License) governs use; see the base repo.
  • Built primarily from public-domain / archival primary sources.