routangseng-qwen35-0.8b-abliterated-lora-onnx

ONNX export of bobber/routangseng-qwen35-0.8b-abliterated with the identity-override LoRA merged into the weights.

Build Process

  1. Base model: huihui-ai/Huihui-Qwen3.5-0.8B-abliterated
  2. LoRA fine-tuning: Identity-override LoRA (rank 64, alpha 128, 251 training rows focused on identity responses)
  3. Merge: LoRA adapter merged into the base model weights
  4. ONNX: Merged weights transplanted into the reference graph from onnx-community/Qwen3.5-0.8B-ONNX
  5. Quantization: q8 (MatMul-only for the decoder, full dynamic quantization for the embedding and vision encoder graphs)
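
The merge in step 3 can be illustrated with the standard LoRA update, W' = W + (alpha / rank) * B * A. The sketch below is only a toy illustration in plain JavaScript: the function name mergeLora and the tiny matrices are hypothetical, and it assumes the adapter uses the conventional alpha / rank scaling (128 / 64 = 2 for the settings listed above). The tooling actually used for the merge is not stated here.

```javascript
// Illustrative only: fold a LoRA update into a dense weight matrix.
// W is (out x in), B is (out x r), A is (r x in), all as nested arrays.
function mergeLora(W, A, B, alpha, rank) {
  const scale = alpha / rank; // 128 / 64 = 2 for the adapter described above
  return W.map((row, i) =>
    row.map((w, j) => {
      let delta = 0;
      for (let k = 0; k < A.length; k++) {
        delta += B[i][k] * A[k][j]; // (B * A)[i][j]
      }
      return w + scale * delta;
    })
  );
}

// Toy rank-1 example; alpha = 2 gives the same scale factor (2) as 128 / 64.
const W = [[1, 0], [0, 1]];
const B = [[1], [2]];      // out x r
const A = [[0.5, 0.25]];   // r x in
const merged = mergeLora(W, A, B, 2, 1);
console.log(merged);
```

After merging, the adapter matrices are no longer needed at inference time, which is why the exported ONNX graph keeps the same shape as the reference graph.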

⚠️ Known Issues

  • Identity-override LoRA degrades analytical depth: The LoRA was trained heavily on short identity responses (1-2 sentences), which biases the merged model toward brief answers. For deep, structured analysis, use the base abliterated ONNX at bobber/routangseng-qwen35-0.8b-abliterated-onnx instead.

Usage with transformers.js

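import { Qwen3_5ForConditionalGeneration, AutoProcessor } from '@huggingface/transformers';

const model_id = 'bobber/routangseng-qwen35-0.8b-abliterated-lora-onnx';

// Load the processor (assumes the repo ships the tokenizer/processor config).
const processor = await AutoProcessor.from_pretrained(model_id);

// Load the three q8 ONNX graphs and run them on WebGPU.
const model = await Qwen3_5ForConditionalGeneration.from_pretrained(model_id, {
  dtype: { embed_tokens: 'q8', vision_encoder: 'q8', decoder_model_merged: 'q8' },
  device: 'webgpu',
});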
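
The snippet above hard-codes device: 'webgpu', which fails in browsers that do not expose WebGPU. A minimal sketch of a fallback follows. The helper name pickDevice is hypothetical, and the choice of 'wasm' as the fallback assumes a browser environment.

```javascript
// Hypothetical helper: choose a transformers.js device string.
// `nav` is the browser's `navigator` object (or undefined outside a browser).
function pickDevice(nav) {
  // `navigator.gpu` only signals that the WebGPU API is exposed;
  // requesting an adapter can still fail on unsupported hardware.
  const hasWebGPU = typeof nav === 'object' && nav !== null && 'gpu' in nav;
  return hasWebGPU ? 'webgpu' : 'wasm';
}

const device = pickDevice(typeof navigator !== 'undefined' ? navigator : undefined);
console.log(device);
```

The result can be passed as the device option in the from_pretrained call instead of the literal 'webgpu'.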

Files

  • onnx/decoder_model_merged_quantized.onnx + .onnx_data — Decoder (q8, MatMul-only quantization)
  • onnx/embed_tokens_quantized.onnx + .onnx_data — Embeddings (q8)
  • onnx/vision_encoder_quantized.onnx + .onnx_data — Vision encoder (q8, from reference, not fine-tuned)
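
The file names above line up with the dtype values in the usage snippet: requesting 'q8' for a component corresponds to the _quantized.onnx file for that component. The sketch below spells out that mapping. The table DTYPE_SUFFIX and the helper onnxFileFor are hypothetical; only the 'q8' entry is confirmed by the file list, and the fp32 and fp16 entries are assumptions about the usual naming convention (this repository may not ship those variants).

```javascript
// Assumed suffix convention; only the 'q8' entry is confirmed by the file list.
const DTYPE_SUFFIX = {
  q8: '_quantized',
  fp32: '',       // assumption
  fp16: '_fp16',  // assumption
};

// Hypothetical helper: build the relative path of a component's ONNX graph.
function onnxFileFor(component, dtype) {
  if (!Object.hasOwn(DTYPE_SUFFIX, dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return `onnx/${component}${DTYPE_SUFFIX[dtype]}.onnx`;
}

for (const name of ['decoder_model_merged', 'embed_tokens', 'vision_encoder']) {
  console.log(onnxFileFor(name, 'q8'));
}
```

Each graph also has a matching .onnx_data file holding the external weight data, as listed above.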