[8th March 2026 Update]: Uploaded a new model (overwriting the original one) without a LoRA adapter on lm_head. A LoRA adapter on lm_head triggers a subtle issue in PEFT that leads to a device mismatch when running on multiple GPUs.
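
To confirm that a downloaded copy of the adapter is the updated one, the adapter's configuration can be inspected for lm_head. The following is a minimal sketch, assuming the standard PEFT layout in which `adapter_config.json` lists the LoRA targets under `target_modules`; the helper names `lora_targets_lm_head` and `check_adapter_dir` are hypothetical and are not part of PEFT or of the LatentQA repo.

```python
import json
from pathlib import Path


def lora_targets_lm_head(adapter_config: dict) -> bool:
    """Return True if the LoRA config lists lm_head among its target modules."""
    targets = adapter_config.get("target_modules") or []
    if isinstance(targets, str):
        # PEFT also accepts a single regex string here; a substring check
        # is only a heuristic for that form.
        return "lm_head" in targets
    return any(name == "lm_head" or name.endswith(".lm_head") for name in targets)


def check_adapter_dir(adapter_dir: str) -> bool:
    """Read adapter_config.json from a local adapter directory and check it."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    with config_path.open() as f:
        return lora_targets_lm_head(json.load(f))
```

A return value of False from `check_adapter_dir` on a local snapshot suggests the copy has no LoRA adapter on lm_head, which is the configuration described in the update above.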

Model Card for LatentQA-Llama-3.1-8B-Instruct-Lora-14000steps

This model replicates the LatentQA activation decoder for Llama-3.1-8B-Instruct. The original LatentQA decoder release is based on Llama-3-8B-Instruct. This model produces the same explanation results on the "promote_veganism" example documented in the repository of the original work.

Model Details

An activation decoder that interprets Llama-3.1-8B-Instruct's internal activations and describes them in natural language.

  • Developed by: Tony Wu
  • License: MIT
  • Finetuned from model: Llama-3.1-8B-Instruct

Uses

Please refer to https://github.com/aypan17/latentqa
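
For orientation, the adapter weights can be attached to the base model with PEFT before handing the model to the LatentQA code. This is a minimal sketch and not the repo's own loading path: the base model ID `meta-llama/Llama-3.1-8B-Instruct` is an assumption (this card only names "Llama-3.1-8B-Instruct"), `load_decoder` is a hypothetical helper, and `device_map="auto"` assumes the `accelerate` package is installed. Reading activations from a target model and feeding them to the decoder is handled by the repository linked above.

```python
ADAPTER_ID = "tony10101105/LatentQA-Llama-3.1-8B-Instruct-Lora-14000steps"
BASE_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumption: Hub ID of the base model


def load_decoder(adapter_id: str = ADAPTER_ID, base_id: str = BASE_ID):
    """Load the base model and attach the LoRA decoder adapter (hypothetical helper)."""
    # Heavy dependencies are imported lazily so that importing this module stays cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # spreads layers across the available GPUs
    )
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, decoder = load_decoder()` downloads both the base model and the adapter, so it needs access to the gated Llama weights and enough GPU memory for an 8B model in bfloat16.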

Bias, Risks, and Limitations

The LatentQA fine-tuning is not extensive, so the decoder performs well only on questions similar to those in the fine-tuning dataset.

Framework versions

  • PEFT 0.18.0