---
language:
  - multilingual
base_model:
  - google/t5gemma-2-270m-270m
pipeline_tag: text-ranking
datasets:
  - KaLM-Embedding/KaLM-embedding-finetuning-data
  - Shitao/bge-m3-data
tags:
  - reranker
  - encoder-decoder
  - FBNL
license: mit
---

KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking


We present KaLM-Reranker-V1, a fast but not late-interaction (FBNL) reranker that decouples query and passage computation while retaining expressive relevance modeling.

Built on an encoder-decoder architecture, KaLM-Reranker-V1 uses the encoder to pre-encode passages with Matryoshka embedding pooling (MEP), while the decoder models the system instruction, user instruction, and query intent. Cross-attention then captures relevance between the query context and the passage representations. The design is fast because passage encoding is decoupled from the query, yet it is not late interaction because cross-attention preserves rich relevance modeling.
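The split between passage-side and query-side computation can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `pool_tokens` treats Matryoshka embedding pooling as mean-pooling consecutive groups of token vectors, and `cross_attend` is a single-head dot-product attention without learned projections. The released model may pool and attend differently.

```python
import math

def pool_tokens(token_embs, factor):
    """Mean-pool consecutive groups of `factor` token vectors.

    Assumption: this stands in for Matryoshka embedding pooling.
    """
    pooled = []
    for start in range(0, len(token_embs), factor):
        group = token_embs[start:start + factor]
        dim = len(group[0])
        pooled.append([sum(vec[i] for vec in group) / len(group) for i in range(dim)])
    return pooled

def cross_attend(query_state, memory):
    """Single-head dot-product attention of one decoder state over the document memory."""
    dim = len(query_state)
    scores = [
        sum(q * m for q, m in zip(query_state, vec)) / math.sqrt(dim)
        for vec in memory
    ]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * vec[i] for w, vec in zip(weights, memory)) for i in range(dim)]

# Passage side: encode once, compress, and store (toy stand-in for encoder outputs).
doc_tokens = [[float(t), float(t % 2)] for t in range(8)]
memory = pool_tokens(doc_tokens, factor=4)  # 8 token vectors -> 2 pooled vectors

# Query side: only this part depends on the query (toy stand-in for a decoder state).
query_state = [1.0, 0.0]
context = cross_attend(query_state, memory)
```

Because `memory` does not depend on the query, it can be computed ahead of time and reused for every incoming query; only `cross_attend` runs per query.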

We instantiate KaLM-Reranker-V1 in three sizes, Nano, Small, and Large, with 0.27B, 1B, and 4B activated parameters, respectively.

Figure: KaLM-Reranker-V1 architecture.

Extensive experiments on BEIR, MIRACL, and LMEB show that the KaLM-Reranker-V1 series achieves competitive reranking performance compared with strong industrial rerankers while significantly reducing online overhead.

Model Details

| Models | Activated Params. | Non-Embedding Params. | Embedding Params. | #Layers | Sequence Length | Document Token Dim. | MEP Support | Instruction Aware |
|---|---|---|---|---|---|---|---|---|
| KaLM-Reranker-V1-Nano | 0.27B | 100M | 168M | 18 | 128K | 640 | 1x-32x | Yes |
| KaLM-Reranker-V1-Small | 1B | 698M | 302M | 26 | 128K | 1152 | 1x-32x | Yes |
| KaLM-Reranker-V1-Large | 4B | 3209M | 675M | 34 | 128K | 2560 | 1x-32x | Yes |
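As a rough illustration of what the table implies for storage, the sketch below estimates the size of one pre-encoded document. It assumes that an MEP factor of `k` reduces a document of `n` tokens to `ceil(n / k)` vectors of the listed Document Token Dim., stored in 16-bit floats. Both points are assumptions, not statements from the model card, and the helper names are hypothetical.

```python
import math

# Values copied from the Model Details table (parameter counts in millions).
MODELS = {
    "KaLM-Reranker-V1-Nano": {"non_emb_m": 100, "emb_m": 168, "doc_dim": 640},
    "KaLM-Reranker-V1-Small": {"non_emb_m": 698, "emb_m": 302, "doc_dim": 1152},
    "KaLM-Reranker-V1-Large": {"non_emb_m": 3209, "emb_m": 675, "doc_dim": 2560},
}

def total_params_m(name):
    """Sum of non-embedding and embedding parameters, in millions."""
    spec = MODELS[name]
    return spec["non_emb_m"] + spec["emb_m"]

def cached_doc_bytes(name, num_tokens, compression, bytes_per_value=2):
    """Estimated bytes for one cached document (assumed layout, fp16 by default)."""
    vectors = math.ceil(num_tokens / compression)
    return vectors * MODELS[name]["doc_dim"] * bytes_per_value

print(total_params_m("KaLM-Reranker-V1-Nano"))             # 268
print(cached_doc_bytes("KaLM-Reranker-V1-Nano", 512, 1))   # 655360
print(cached_doc_bytes("KaLM-Reranker-V1-Nano", 512, 32))  # 20480
```

Under these assumptions, a 512-token document cached for the Nano model shrinks from about 640 KiB at 1x to 20 KiB at 32x. The parameter sums (268M, 1000M, 3884M) are roughly in line with the Activated Params. column.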

Prompt Template

Document (encoder input):

f"<Document>: {document}"

Query prompt (decoder input):

(
    f"<bos><start_of_turn>user\n"
    f"Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".\n\n"
    f"<Instruct>: {task_instruction}\n"
    f"<Query>: {query}<end_of_turn>\n"
    f"<start_of_turn>model\n\n\n\n"
)
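The two templates above can be wrapped in small helpers, sketched below. The function names are hypothetical. `yes_probability` is also an assumption: it shows one common way to turn the logits of the "yes" and "no" answer tokens into a relevance score, which the model card does not specify.

```python
import math

def build_document_input(document: str) -> str:
    """Passage-side text, following the document template above."""
    return f"<Document>: {document}"

def build_query_prompt(task_instruction: str, query: str) -> str:
    """Query-side prompt, following the instruction and query template above."""
    return (
        "<bos><start_of_turn>user\n"
        "Judge whether the Document meets the requirements based on the Query "
        "and the Instruct provided. Note that the answer can only be "
        "\"yes\" or \"no\".\n\n"
        f"<Instruct>: {task_instruction}\n"
        f"<Query>: {query}<end_of_turn>\n"
        "<start_of_turn>model\n\n\n\n"
    )

def yes_probability(yes_logit: float, no_logit: float) -> float:
    """Softmax over the two answer logits (assumed scoring rule, not from the card)."""
    diff = yes_logit - no_logit
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    scaled = math.exp(diff)  # avoids overflow when diff is very negative
    return scaled / (1.0 + scaled)

doc_text = build_document_input("Paris is the capital of France.")
prompt = build_query_prompt(
    "Given a web search query, retrieve relevant passages.",
    "capital of France",
)
```

Keeping the document text and the query prompt in separate functions mirrors the decoupling described earlier: document inputs can be built and encoded ahead of time, while the prompt is built per query.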

Figure: KaLM-Reranker-V1 prompt template.

Evaluation

BEIR

MIRACL

LMEB