indira: LFM2-8B-A1B citation/voice LoRA (q4 ONNX, WebGPU)
A LoRA fine-tune of LiquidAI/LFM2-8B-A1B,
exported to q4 ONNX for fully in-browser inference via
transformers.js on WebGPU.
It is the optional "Best" generation tier of the indira project, a
citation-grounded Q&A + discovery site over Indira Gandhi's primary-source corpus
(parliamentary speeches, Government-of-India volumes). The model answers in a
first-person, [N]-cited style grounded in retrieved passages, and declines
when the sources don't support an answer.
Verified in-browser on Apple Metal WebGPU: ~1.2 s to first token, ~48 tok/s.
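Because answers carry [N] markers that refer to retrieved passages, a caller may want to check that every marker points at a passage that was actually supplied. The following is a minimal sketch of such a check, not part of the project: the helper names `citedIndices` and `invalidCitations` are hypothetical, and passages are assumed to be numbered from 1.

```javascript
// Hypothetical helper: collect the distinct [N] markers used in an answer.
function citedIndices(answer) {
  const found = new Set();
  for (const match of answer.matchAll(/\[(\d+)\]/g)) {
    found.add(Number(match[1]));
  }
  return [...found].sort((a, b) => a - b);
}

// Hypothetical helper: report markers that point outside the retrieved passages.
// Assumption: passages are numbered 1..passageCount.
function invalidCitations(answer, passageCount) {
  return citedIndices(answer).filter((n) => n < 1 || n > passageCount);
}

// Placeholder text, not real model output.
const sampleAnswer = 'First claim [1]. Second claim [3].';
console.log(citedIndices(sampleAnswer));        // [ 1, 3 ]
console.log(invalidCitations(sampleAnswer, 2)); // [ 3 ]
```

An empty result from `invalidCitations` only shows that the markers are in range; it says nothing about whether the cited passage supports the claim.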
Lineage
| | |
|---|---|
| Base | LiquidAI/LFM2-8B-A1B (8.3B params, ~1B active MoE) |
| Training | LoRA via mlx_lm on mlx-community/LFM2-8B-A1B-4bit (Apple Silicon) |
| SFT teacher | Claude (synthetic cited Q&A; teacher ≫ trained model) |
| Export | fuse → remap (MLX→HF) → Liquid4All/onnx-export --q4 with symmetric QMoE |
| Format | q4 ONNX, ~5.3 GB (model_q4.onnx + 3 external-data chunks) |
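The Format row implies four files: the graph plus three weight chunks. As a rough sketch, the file names can be enumerated like this. It assumes the chunk naming convention `<model>_data`, `<model>_data_1`, `<model>_data_2`, which is an assumption here and not stated in this card; the repository's file listing is authoritative. The function name `externalDataFiles` is hypothetical.

```javascript
// Sketch: list the files a loader would need for a chunked ONNX model.
// Assumed convention (not confirmed by this card):
//   first chunk  -> "<model>_data"
//   later chunks -> "<model>_data_1", "<model>_data_2", ...
function externalDataFiles(modelFile, chunkCount) {
  const files = [modelFile];
  for (let i = 0; i < chunkCount; i += 1) {
    files.push(`${modelFile}_data${i === 0 ? '' : `_${i}`}`);
  }
  return files;
}

console.log(externalDataFiles('model_q4.onnx', 3));
```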
Note on the export: the MoE experts are quantized symmetrically (no zero-points).
onnxruntime-web's WebGPU backend only runs symmetric QMoE; an asymmetric export loads
but errors at the first QMoE node. transformers.js_config.use_external_data_format is set
to {"model_q4.onnx": 3} so all three weight chunks are preloaded.
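To make the symmetric/asymmetric distinction concrete, here is a generic textbook sketch of 4-bit quantization for one group of weights. It is not the exporter's code and ignores block sizes and bit packing; the function names are illustrative. The point is that the symmetric form stores only a scale, while the asymmetric form also stores a zero-point.

```javascript
// Symmetric 4-bit: one scale per group, no zero-point. Codes lie in [-8, 7].
function quantizeSymmetric(weights) {
  const maxAbs = Math.max(...weights.map((w) => Math.abs(w)));
  const scale = maxAbs === 0 ? 1 : maxAbs / 7;
  const codes = weights.map((w) => Math.max(-8, Math.min(7, Math.round(w / scale))));
  return { codes, scale };
}

function dequantizeSymmetric({ codes, scale }) {
  return codes.map((q) => q * scale);
}

// Asymmetric 4-bit: a scale plus a zero-point. Codes lie in [0, 15].
// The range is widened to include 0 so the zero-point stays within [0, 15].
function quantizeAsymmetric(weights) {
  const min = Math.min(0, ...weights);
  const max = Math.max(0, ...weights);
  const scale = max === min ? 1 : (max - min) / 15;
  const zeroPoint = Math.round(-min / scale);
  const codes = weights.map((w) =>
    Math.max(0, Math.min(15, Math.round(w / scale) + zeroPoint)));
  return { codes, scale, zeroPoint };
}

function dequantizeAsymmetric({ codes, scale, zeroPoint }) {
  return codes.map((q) => (q - zeroPoint) * scale);
}

console.log(quantizeSymmetric([-7, 0, 3, 7]));
console.log(quantizeAsymmetric([-5, 0, 10]));
```

A kernel that supports only the symmetric form has no input for `zeroPoint`, which is consistent with an asymmetric export failing at the first QMoE node.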
Usage (transformers.js, WebGPU)
import { pipeline } from '@huggingface/transformers'; // v4.2.0+
const gen = await pipeline('text-generation',
  'naklitechie/indira-lfm2-8b-a1b-onnx', { dtype: 'q4', device: 'webgpu' });
const out = await gen(
  [{ role: 'user', content: 'What did Indira Gandhi say about national unity?' }],
  { max_new_tokens: 256 });
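The example above requests `device: 'webgpu'`, so a capability check before downloading ~5.3 GB of weights is a reasonable precaution. The sketch below uses the standard `navigator.gpu.requestAdapter()` call; the function name `hasWebGPU` and its injectable `nav` parameter are illustrative. It does not distinguish hardware adapters from software fallbacks, and the app's actual gating logic is not shown in this card.

```javascript
// Returns true only when WebGPU is exposed and an adapter can be obtained.
// `nav` defaults to the global navigator; it is a parameter so the check
// can be exercised outside a browser.
async function hasWebGPU(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) return false;
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    // Treat any failure to obtain an adapter as "not available".
    return false;
  }
}

hasWebGPU().then((ok) => {
  console.log(ok ? 'WebGPU available' : 'WebGPU not available');
});
```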
Requires hardware WebGPU (gated to capable GPUs in the app). Designed to run after retrieval: given relevant passages it produces a grounded, cited answer; with no supporting passages it declines.
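One way to hand retrieved passages to the model is to number them in the user message so that [N] markers in the answer can refer back to them. The layout below is purely hypothetical: this card does not document the prompt format used in training, and `buildMessages` is an illustrative name. Treat it as a starting point to adapt once the project's real format is known.

```javascript
// Hypothetical prompt layout; the format the model was trained on is not
// documented in this card.
function buildMessages(question, passages) {
  // Number passages from 1 so answer markers like [2] map back to them.
  const sources = passages.map((text, i) => `[${i + 1}] ${text}`).join('\n');
  const content = passages.length > 0
    ? `Sources:\n${sources}\n\nQuestion: ${question}`
    : `Question: ${question}`;
  return [{ role: 'user', content }];
}

// Placeholder passages, not real corpus text.
const messages = buildMessages('What was said about national unity?', [
  'Placeholder passage one.',
  'Placeholder passage two.',
]);
console.log(messages[0].content);
```

The resulting array has the same shape as the messages passed to `gen` in the usage example, so it can be supplied in their place.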
Notes
- Sibling models: naklitechie/nehru-…, naklitechie/gandhi-….
- The base model's license (LFM Open License) governs use; see the base repo.
- Built primarily from public-domain / archival primary sources.