Model Details

Model Description

This repository contains the full merged model (fine-tuned weights merged into the base checkpoint) from the best Qwen3 reranker training run in QAnchor, a finance-oriented RAG and reranking pipeline for question answering over Chinese A-share annual reports.

This model is a Qwen3-based cross-encoder reranker fine-tuned to score and rank candidate document chunks for Chinese financial question answering. It is intended for same-document reranking over chunks extracted from annual reports and related financial filings.

Model type: Cross-encoder reranker implemented as a sequence classification model
Language(s) (NLP): Chinese (primary fine-tuning domain)
License: Apache-2.0
Finetuned from model: tomaarsen/Qwen3-Reranker-0.6B-seq-cls

Results

Compared with the same retrieval pipeline without fine-tuning, the fine-tuned Qwen3 reranker achieved:

Metric	Base	Finetuned	Absolute Gain	Relative Gain
MRR@10	0.6115	0.7758	+0.1643	+26.9%
NDCG@10	0.7572	0.8761	+0.1189	+15.7%
P@10	0.1920	0.2280	+0.0360	+18.8%
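
The gain columns follow directly from the Base and Finetuned columns. As a quick arithmetic check, the sketch below recomputes them; the helper name `gains` is illustrative and not part of QAnchor.

```python
from typing import Dict, Tuple


def gains(base: float, finetuned: float) -> Tuple[float, float]:
    """Return (absolute gain, relative gain in percent)."""
    absolute = finetuned - base
    relative_pct = 100.0 * absolute / base
    return absolute, relative_pct


# (Base, Finetuned) values copied from the table above.
rows: Dict[str, Tuple[float, float]] = {
    "MRR@10": (0.6115, 0.7758),
    "NDCG@10": (0.7572, 0.8761),
    "P@10": (0.1920, 0.2280),
}

for name, (base, finetuned) in rows.items():
    absolute, relative_pct = gains(base, finetuned)
    print(f"{name}: absolute {absolute:+.4f}, relative {relative_pct:+.2f}%")
```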

Why this base model

This reranker is fine-tuned from tomaarsen/Qwen3-Reranker-0.6B-seq-cls, a sequence-classification adaptation of Qwen/Qwen3-Reranker-0.6B.

We selected this base because the QAnchor training pipeline optimizes grouped candidate scores with a custom listwise ranking loss built on top of outputs.logits. Using the seq-cls variant allows the model to fit naturally into a standard AutoModelForSequenceClassification + LoRA training workflow.

By contrast, the original Qwen3 reranker uses a different scoring path based on the final-token "yes" / "no" logits of a causal language model, which would require a different training interface.
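
To make the difference concrete, the sketch below shows why a single relevance logit can stand in for a two-way "yes" / "no" comparison: a softmax over the two logits equals a sigmoid of their difference. It assumes the upstream score is the softmax probability of "yes", and the logit values are made up. Whether the seq-cls checkpoint encodes exactly this difference is an assumption, not something this card states.

```python
import math


def yes_probability(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax over the "yes"/"no" logits, returning P("yes")."""
    m = max(yes_logit, no_logit)  # subtract the max for numerical stability
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Made-up logits, for illustration only.
yes_logit, no_logit = 2.3, -0.7

p_softmax = yes_probability(yes_logit, no_logit)
p_single_logit = sigmoid(yes_logit - no_logit)
print(p_softmax, p_single_logit)
```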

Model Sources

Repository: QAnchor
Base model: tomaarsen/Qwen3-Reranker-0.6B-seq-cls
Upstream reranker: Qwen/Qwen3-Reranker-0.6B

Uses

Direct Use

This model is intended to rerank candidate document chunks returned by a first-stage retriever.

Recommended use cases include:

Chinese financial-document reranking
same-document candidate reranking
question answering pipelines over A-share annual reports and related filings

Downstream Use

Typical downstream usage is:

Retrieve a candidate set with a first-stage retriever
Format each (query, document) pair using the training-time template
Score each candidate with this reranker
Sort by score and keep top-k
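
The last three steps can be sketched as a small function with a pluggable scorer. The names `rerank` and `toy_overlap_score` are hypothetical. The toy scorer only counts shared words so that the example runs without the model; in practice the scorer would format the pair with the training-time template and return the reranker's score.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Score every candidate, sort by descending score, keep the top_k."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, document: str) -> float:
    """Stand-in scorer (NOT the reranker): number of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(document.lower().split())))


# Placeholder candidates, as if returned by a first-stage retriever.
candidates = [
    "board of directors list",
    "2023 revenue and profit summary",
    "net profit rose in 2023",
]
top = rerank("2023 net profit", candidates, toy_overlap_score, top_k=2)
print(top)
```

Keeping the scorer as a parameter means the same sorting logic works with any scoring backend.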

Out-of-Scope Use

This model is not intended for:

standalone generative QA
direct answer generation without retrieval
cross-document retrieval without a first-stage candidate generator
legal, accounting, or investment advice
settings where training-time formatting is not preserved

Bias, Risks, and Limitations

This model inherits limitations from both the upstream Qwen3 reranker family and the QAnchor training setup.

Key limitations include:

The fine-tuning domain is Chinese financial QA, especially A-share annual reports
The training and evaluation setup assumes same-document reranking rather than open-domain retrieval
The model is sensitive to input formatting and was trained with pair_format=qwen3_template
Training data is weakly supervised and not publicly released
No official hosted inference SLA or latency benchmark is provided in this repository

Recommendations

Preserve the qwen3_template formatting logic at inference time
Use this model only after a first-stage retriever has produced a candidate set
Validate quality on your own financial-document distribution before production use
Do not treat reranker scores as calibrated probabilities or final answers

How to Get Started with the Model

Python

from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "souflex56/qanchor-reranker-qwen3-0.6b-merged"

# Load the merged checkpoint as a standard sequence-classification model.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()  # inference mode: disables dropout

Training-time formatting

This model was trained with pair_format=qwen3_template, not a plain raw (query, document) pair.

Conceptually, each pair is formatted as:

<|im_start|>system
Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>
<|im_start|>user
<Instruct>: Given a web search query, retrieve relevant passages that answer the query
<Query>: {query}
<Document>: {document}<|im_end|>
<|im_start|>assistant
<think>

</think>

A minimal helper is:

def format_qwen3_template(query: str, document: str) -> str:
    return (
        '<|im_start|>system\n'
        'Judge whether the Document meets the requirements based on the Query and the Instruct provided. '
        'Note that the answer can only be "yes" or "no".<|im_end|>\n'
        '<|im_start|>user\n'
        '<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n'
        f'<Query>: {query}\n'
        f'<Document>: {document}<|im_end|>\n'
        '<|im_start|>assistant\n'
        '<think>\n\n</think>\n\n'
    )
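
Once pairs are formatted, scoring could look like the sketch below. It makes several assumptions that this card does not confirm: the classification head emits a single logit per pair, a higher logit means more relevant, and scoring one pair at a time is acceptable (this avoids padding configuration). The name `score_candidates` is hypothetical.

```python
from typing import List


def score_candidates(model, tokenizer, formatted_pairs: List[str], max_length: int = 768) -> List[float]:
    """Return one raw relevance score per already-formatted pair string."""
    import torch  # lazy import, so the helper can be defined without torch installed

    model.eval()
    scores: List[float] = []
    with torch.no_grad():
        for text in formatted_pairs:
            # Caveat: truncation cuts from the end of the string, so a very long
            # document would lose the trailing assistant/think tokens. Trimming
            # the document text before formatting may be preferable.
            inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
            logits = model(**inputs).logits
            # Assumption: a single relevance logit per pair, shape (1, 1).
            scores.append(float(logits.reshape(-1)[0]))
    return scores


# Usage, given the tokenizer, model, and format_qwen3_template from this card:
# texts = [format_qwen3_template(query, doc) for doc in candidate_docs]
# scores = score_candidates(model, tokenizer, texts)
```

The returned values are raw scores meant for sorting within one candidate set, not calibrated probabilities.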

Training Details

Training Data

This model was fine-tuned on weakly supervised reranker data constructed in the QAnchor pipeline.

The release does not include the training dataset. The public release covers model artifacts, metadata, and documentation only.

Training data characteristics:

Chinese financial QA domain
A-share annual reports and related financial filings
Sample structure: query + pos_text + neg_texts
Reverse-mined weak supervision with blacklist-based isolation from gold evaluation data
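
As an illustration of the sample structure, one training record could be represented as below. The field names follow the query + pos_text + neg_texts structure listed above. The class name, the helper, and the placeholder strings are hypothetical, and the cap of 7 negatives mirrors the "Max negatives" setting reported under Training Hyperparameters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RerankSample:
    query: str
    pos_text: str
    neg_texts: List[str]


def to_candidate_group(sample: RerankSample, max_negatives: int = 7) -> List[str]:
    """Positive chunk first, followed by at most max_negatives negative chunks."""
    return [sample.pos_text] + sample.neg_texts[:max_negatives]


# Placeholder content only; the real training data is not publicly released.
sample = RerankSample(
    query="<question about an annual report>",
    pos_text="<chunk that answers the question>",
    neg_texts=[f"<hard negative chunk {i}>" for i in range(10)],
)
group = to_candidate_group(sample)
print(len(group))
```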

Dataset statistics for the best released run:

Train queries: 179
Dev queries: 20
Train samples: 1274
Dev samples: 247

Training Procedure

Preprocessing

Key preprocessing steps in the QAnchor pipeline:

PDF chunking into parent/child hierarchical chunks
first-stage retrieval with embedding + BM25 + RRF
reverse mining to construct positive / hard-negative triplets
blacklist filtering and query-level train/dev splitting
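
For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows the standard formulation: each candidate receives 1 / (k + rank) from every ranking it appears in, and the sums decide the fused order. The constant k = 60 is a common default and an assumption here, since the value used in QAnchor is not stated. The chunk ids are placeholders.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked id lists into one list, best candidate first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Placeholder chunk ids from two hypothetical first-stage retrievers.
embedding_ranking = ["c3", "c1", "c2"]
bm25_ranking = ["c1", "c4", "c3"]
fused = reciprocal_rank_fusion([embedding_ranking, bm25_ranking])
print(fused)
```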

Training Hyperparameters

Training regime: full precision (fp32); no mixed-precision flags enabled (fp16=false, bf16=false)
Pair format: qwen3_template
Max length: 768
Max negatives: 7
Learning rate: 2e-5
Epochs: 3
Batch size: 1
Gradient accumulation steps: 8
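
The settings above can be collected into a small config object, which also makes two derived quantities explicit. The class and property names are hypothetical and not the QAnchor config schema. The group size assumes one positive per sample, as suggested by the singular pos_text field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RerankerTrainConfig:
    pair_format: str = "qwen3_template"
    max_length: int = 768
    max_negatives: int = 7
    learning_rate: float = 2e-5
    epochs: int = 3
    batch_size: int = 1
    gradient_accumulation_steps: int = 8
    fp16: bool = False
    bf16: bool = False

    @property
    def effective_batch_size(self) -> int:
        """Batches contributing to each optimizer step."""
        return self.batch_size * self.gradient_accumulation_steps

    @property
    def max_candidates_per_group(self) -> int:
        """Assumes one positive plus up to max_negatives negatives."""
        return 1 + self.max_negatives


config = RerankerTrainConfig()
print(config.effective_batch_size, config.max_candidates_per_group)
```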

Speeds, Sizes, Times

Hardware type: NVIDIA GeForce RTX 4090
Training runtime (best released run): approximately 1752.8s (29.2 min)
Deployment note: no official hosted inference benchmark is published in this repository

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluation was performed on the QAnchor Stage 1 gold evaluation setting:

Gold eval queries: 50
Candidates per query: 20
Candidate source: Hybrid RRF retrieval output

Factors

The main evaluation setting measures reranking quality for:

Chinese financial QA
same-document reranking
candidate chunks from annual reports and related filings
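
The reported metrics (MRR@10, NDCG@10, P@10) can be computed per query and then averaged over the evaluation queries. The sketch below uses common textbook definitions with binary relevance labels. That choice is an assumption, since the exact definitions used in QAnchor are not given here. The example labels are made up.

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def precision_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Fraction of the top k positions holding a relevant item."""
    return sum(1 for rel in relevance[:k] if rel) / k


def ndcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """DCG of the given order divided by the DCG of the ideal order."""
    def dcg(rels: Sequence[int]) -> float:
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal > 0 else 0.0


# Made-up labels for one query: 20 reranked candidates, relevant at ranks 2 and 4.
labels = [0, 1, 0, 1] + [0] * 16
print(mrr_at_k(labels), precision_at_k(labels), ndcg_at_k(labels))
```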

Technical Specifications

Model Architecture and Objective

Base architecture: Qwen3 reranker family
Fine-tuning interface: sequence classification
Objective: candidate reranking for query-document pairs
Training objective in QAnchor: listwise softmax cross-entropy over grouped candidates
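
As a rough illustration of the listwise objective, the sketch below computes a softmax cross-entropy over the scores of one candidate group, treating a single candidate as the positive. This is the textbook form, written in plain Python for readability. The custom loss in QAnchor may differ, for example in temperature scaling or in how groups are batched, and the function name is hypothetical.

```python
import math
from typing import Sequence


def listwise_softmax_ce(scores: Sequence[float], positive_index: int = 0) -> float:
    """Negative log-softmax of the positive candidate's score within its group."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[positive_index]


# One group: the positive's score first, then two negatives (made-up values).
loss_separated = listwise_softmax_ce([2.0, 0.0, 0.0])
# If all 8 candidates score the same, the loss equals log(8).
loss_uniform = listwise_softmax_ce([0.0] * 8)
print(loss_separated, loss_uniform)
```

The loss falls as the positive's score rises relative to the negatives in the same group, which is what drives the ranking behavior.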

Compute Infrastructure

Hardware

Single NVIDIA GeForce RTX 4090 GPU

Software

Transformers
PEFT
Accelerate
PyTorch

Citation

If you use this model, please cite the QAnchor repository and the upstream Qwen3 reranker family.

@misc{qanchor_reranker_2026,
  title={QAnchor Qwen3 Reranker Release},
  author={souflex56},
  year={2026},
  howpublished={\url{https://github.com/souflex56/QAnchor}}
}

Model Card Contact

For questions about this release, please open an issue in the QAnchor repository.
