RetMask

Trained checkpoints for the paper "From Interpretability to Performance: Optimizing Retrieval Heads for Long-Context Language Models"

maym15/Llama-3.1-8B-Instruct-RetMask • Text Generation • 8B • Updated Apr 20 • 6
maym15/Qwen3-8B-RetMask • Text Generation • 8B • Updated Apr 20 • 9
maym15/Olmo-3-7B-Instruct-RetMask • Text Generation • 7B • Updated Apr 21 • 18
maym15/Olmo-3-7B-Think-RetMask • Text Generation • 7B • Updated Apr 21 • 3
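
The checkpoints listed above are tagged as text-generation models, so they can presumably be loaded like any other causal language model. The following is a minimal sketch, not an official usage guide: it assumes the Hugging Face `transformers` library (with a version recent enough to support each base architecture) and assumes the repositories ship standard tokenizer and model files. The helper names `load_checkpoint` and `complete` are hypothetical and introduced only for illustration.

```python
# Repository ids copied from the checkpoint list above.
RETMASK_CHECKPOINTS = [
    "maym15/Llama-3.1-8B-Instruct-RetMask",
    "maym15/Qwen3-8B-RetMask",
    "maym15/Olmo-3-7B-Instruct-RetMask",
    "maym15/Olmo-3-7B-Think-RetMask",
]


def load_checkpoint(repo_id):
    """Download and load one checkpoint (hypothetical helper, not from the paper)."""
    # Imported lazily so the id list can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # Assumption: the checkpoint loads as a standard causal language model.
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


def complete(tokenizer, model, prompt, max_new_tokens=32):
    """Generate a short continuation of `prompt` and return it as text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_checkpoint(RETMASK_CHECKPOINTS[0])` downloads several gigabytes of weights, so nothing in the block runs at import time. The returned pair can then be passed to `complete` along with a prompt.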