---
language:
- multilingual
base_model:
- google/t5gemma-2-270m-270m
pipeline_tag: text-ranking
datasets:
- KaLM-Embedding/KaLM-embedding-finetuning-data
- Shitao/bge-m3-data
tags:
- reranker
- encoder-decoder
- FBNL
license: mit
---

# KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking

We present `KaLM-Reranker-V1`, a fast but not late-interaction (FBNL) reranker that decouples query and passage computation while retaining expressive relevance modeling. Built on an encoder-decoder architecture, KaLM-Reranker-V1 uses the encoder to pre-encode passages with Matryoshka embedding pooling (MEP), while the decoder models the system instruction, user instruction, and query intent; cross-attention then captures relevance between the query context and the passage representations. The design is fast because passage encoding is decoupled from the query, yet it is not late interaction because cross-attention preserves rich relevance modeling. We instantiate KaLM-Reranker-V1 in three sizes, `Nano`, `Small`, and `Large`, with `0.27B`, `1B`, and `4B` activated parameters, respectively.

![kalm-reranker-v1 architecture](./assets/framework.jpg)

Extensive experiments on BEIR, MIRACL, and LMEB show that the KaLM-Reranker-V1 series achieves competitive reranking performance compared with strong industrial rerankers while significantly reducing online overhead.

# Model Details

| Models | Activated Params. | Non-Embedding Params. | Embedding Params. | #Layers | Sequence Length | Document Token Dim. | MEP Support | Instruction Aware |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| [KaLM-Reranker-V1-Nano](https://huggingface.co/KaLM-Embedding/KaLM-Reranker-V1-Nano) | 0.27B | 100M | 168M | 18 | 128K | 640 | 1x-32x | Yes |
| [KaLM-Reranker-V1-Small](https://huggingface.co/KaLM-Embedding/KaLM-Reranker-V1-Small) | 1B | 698M | 302M | 26 | 128K | 1152 | 1x-32x | Yes |
| [KaLM-Reranker-V1-Large](https://huggingface.co/KaLM-Embedding/KaLM-Reranker-V1-Large) | 4B | 3209M | 675M | 34 | 128K | 2560 | 1x-32x | Yes |

# Prompt Template

```python
f": {document}"
```

```python
(
    f"user\n"
    f"Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".\n\n"
    f": {task_instruction}\n"
    f": {query}\n"
    f"model\n\n\n\n"
)
```

![kalm-reranker-v1 template](./assets/template.jpg)

# Evaluation

## BEIR

## MIRACL

## LMEB
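
# Passage Cache Footprint (Sketch)

The table under Model Details lists a document token dimension per model and MEP support from 1x to 32x. As a rough back-of-the-envelope sketch, the snippet below estimates how much storage one pre-encoded passage might need. It assumes, without confirmation from this card, that the cache holds one vector per passage token, that an MEP ratio of `r` keeps about `1/r` of those vectors, and that values are stored in fp16. The names `DOC_TOKEN_DIM` and `cached_passage_bytes` are hypothetical and used only for illustration.

```python
import math

# Document token dimensions copied from the Model Details table.
DOC_TOKEN_DIM = {"Nano": 640, "Small": 1152, "Large": 2560}


def cached_passage_bytes(model: str, num_tokens: int, compression: int = 1,
                         bytes_per_value: int = 2) -> int:
    """Estimate the cache size in bytes of one pre-encoded passage.

    Assumptions (not stated in the card): one cached vector per passage
    token, MEP at ratio `compression` keeps ceil(num_tokens / compression)
    vectors, and each value takes `bytes_per_value` bytes (2 for fp16).
    """
    if not 1 <= compression <= 32:
        raise ValueError("the card lists MEP support for 1x-32x only")
    if num_tokens < 1:
        raise ValueError("num_tokens must be positive")
    kept_vectors = math.ceil(num_tokens / compression)
    return kept_vectors * DOC_TOKEN_DIM[model] * bytes_per_value


if __name__ == "__main__":
    # Compare the uncompressed and the most compressed setting per model.
    for name in DOC_TOKEN_DIM:
        full = cached_passage_bytes(name, num_tokens=512)
        packed = cached_passage_bytes(name, num_tokens=512, compression=32)
        print(f"{name}: {full / 1024:.0f} KiB at 1x, {packed / 1024:.0f} KiB at 32x")
```

Under these assumptions, a 512-token passage cached for `Nano` shrinks from 640 KiB at 1x to 20 KiB at 32x. The real footprint depends on how the released checkpoints store and pool passage representations.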