metadata
base_model: google/gemma-3-12b-it
library_name: peft
datasets:
- AmirMohseni/CurveBench-Easy
- AmirMohseni/CurveBench
arxiv: 2605.14068
tags:
- grpo
- trl
- lora
- vision-language-model
- topological-reasoning
- curvebench
curvebench-gemma-3-12b
This is a LoRA adapter for google/gemma-3-12b-it, fine-tuned with GRPO on CurveBench-Easy using verifiable rewards for topological tree prediction.
It corresponds to model-c in the CurveBench paper (reward: tree isomorphism (0.7) + node count (0.3)).
- Paper: CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves
- Training dataset: AmirMohseni/CurveBench-Easy
- Evaluation dataset: AmirMohseni/CurveBench
- Collection: AmirMohseni/curvebench
- GitHub: Amir-Mohseni/CurveBench
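The reward above combines a tree-isomorphism term (weight 0.7) with a node-count term (weight 0.3). The following is a minimal sketch of how such a reward could be computed, not the code used in training. It assumes both terms are binary (1.0 on an exact match, 0.0 otherwise) and that trees are given as `(parent, child)` edge lists rooted at region 0; the function names are hypothetical.

```python
from collections import defaultdict


def build_children(edges, root=0):
    """Return a parent -> children map, or None if edges are not a tree rooted at `root`."""
    children = defaultdict(list)
    parent_of = {}
    for parent, child in edges:
        if child == root or child in parent_of:
            return None  # the root has a parent, or a node has two parents
        parent_of[child] = parent
        children[parent].append(child)
    # Every node must reach the root by following parents, without cycles.
    for node in parent_of:
        seen = set()
        while node != root:
            if node in seen or node not in parent_of:
                return None
            seen.add(node)
            node = parent_of[node]
    return children


def canonical(children, node=0):
    """AHU-style encoding: sorting child encodings makes node labels irrelevant."""
    parts = sorted(canonical(children, c) for c in children.get(node, []))
    return "(" + "".join(parts) + ")"


def node_count(edges, root=0):
    """Number of distinct regions mentioned in the edge list, including the root."""
    nodes = {root}
    for parent, child in edges:
        nodes.update((parent, child))
    return len(nodes)


def reward(pred_edges, true_edges, w_iso=0.7, w_count=0.3):
    """Assumed binary scoring: 0.7 for an isomorphic tree plus 0.3 for a matching node count."""
    pred = build_children(pred_edges)
    true = build_children(true_edges)
    iso = 1.0 if pred is not None and true is not None and canonical(pred) == canonical(true) else 0.0
    count = 1.0 if node_count(pred_edges) == node_count(true_edges) else 0.0
    return w_iso * iso + w_count * count
```

Under these assumptions, a prediction with the right number of regions but the wrong nesting scores 0.3, and a prediction that matches the ground truth up to relabeling scores 1.0.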
Usage
Option 1 — vLLM (recommended for serving)
Start the server with the LoRA adapter loaded on top of the base model:
vllm serve google/gemma-3-12b-it \
  --enable-lora \
  --lora-modules grpo-region-tree=AmirMohseni/curvebench-gemma-3-12b \
  --max-lora-rank 4 \
  --max-model-len 32768 \
  --gpu-memory-utilization 0.90 \
  --dtype bfloat16 \
  --trust-remote-code
Then query it with the OpenAI-compatible API:
import base64
from io import BytesIO

from datasets import load_dataset
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="token")

# Load the first test image from the benchmark
ds = load_dataset("AmirMohseni/CurveBench-Easy", split="total_test")
image = ds[0]["image"]

# Encode the image as a base64 PNG data URL
buf = BytesIO()
image.save(buf, format="PNG")
image_b64 = base64.b64encode(buf.getvalue()).decode()

response = client.chat.completions.create(
    model="grpo-region-tree",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{image_b64}"},
            },
            {
                "type": "text",
                "text": (
                    "The image shows a set of pairwise non-intersecting closed curves drawn on a plane. "
                    "Each curve creates a boundary between an interior region and its surroundings. "
                    "Output the containment tree of the regions as a list of edges in the format: "
                    "[(parent, child), ...] where 0 is the outermost (unbounded) region."
                ),
            },
        ],
    }],
    max_tokens=2048,
)

print(response.choices[0].message.content)
print("Ground truth:", ds[0]["tree"])
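The prompt asks for the tree as `[(parent, child), ...]`, but the response may wrap that list in extra text. A minimal sketch of pulling the edge list out of a response string is shown below. It assumes the model follows the prompt's format with integer region labels; `parse_edges` is a hypothetical helper, not part of this repository.

```python
import ast
import re


def parse_edges(text):
    """Return the last well-formed [(parent, child), ...] list in `text`, or None."""
    # Bracketed spans without nested brackets; the final answer is assumed to come last.
    candidates = re.findall(r"\[[^\[\]]*\]", text)
    for candidate in reversed(candidates):
        try:
            value = ast.literal_eval(candidate)  # parses literals only, never executes code
        except (ValueError, SyntaxError):
            continue
        is_edge_list = isinstance(value, list) and all(
            isinstance(edge, tuple)
            and len(edge) == 2
            and all(isinstance(node, int) for node in edge)
            for edge in value
        )
        if is_edge_list:
            return value
    return None
```

For example, `parse_edges(response.choices[0].message.content)` would return a list of integer pairs when the response contains one, and `None` otherwise.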
Option 2 — PEFT + Transformers (offline)
Load the base model and apply the LoRA adapter directly:
import torch
from datasets import load_dataset
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoProcessor

base_id = "google/gemma-3-12b-it"
adapter_id = "AmirMohseni/curvebench-gemma-3-12b"

processor = AutoProcessor.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)
model = PeftModel.from_pretrained(model, adapter_id)

# Load the first test image from the benchmark
ds = load_dataset("AmirMohseni/CurveBench-Easy", split="total_test")
image = ds[0]["image"]

prompt = (
    "The image shows a set of pairwise non-intersecting closed curves drawn on a plane. "
    "Each curve creates a boundary between an interior region and its surroundings. "
    "Output the containment tree of the regions as a list of edges in the format: "
    "[(parent, child), ...] where 0 is the outermost (unbounded) region."
)

inputs = processor(
    text=processor.apply_chat_template(
        [{"role": "user", "content": [{"type": "image"}, {"type": "text", "text": prompt}]}],
        add_generation_prompt=True,
    ),
    images=[image],
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=2048)

# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
print("Ground truth:", ds[0]["tree"])
Training curves
[Figures: training reward curve, eval reward curve]
Training procedure
Trained with GRPO using a fork of TRL with multimodal support: AmirTuring/trl @ curvebench.
- Method: GRPO (Group Relative Policy Optimization)
- Base model: google/gemma-3-12b-it
- Training split: total_train (210 images) from CurveBench-Easy
- Reward: tree isomorphism (0.7) + node count (0.3)
- LoRA rank (r): 4 | LoRA alpha: 8
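For readers unfamiliar with GRPO, its core step is to score each sampled completion relative to the other completions generated for the same prompt. The sketch below illustrates that group-relative normalization in plain Python. It is a simplified illustration and not the TRL fork's implementation: the population standard deviation and the `eps` value are assumptions, and the example rewards assume the binary 0.7 / 0.3 scoring.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one prompt's group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations differ
    # eps keeps the division finite when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of four completions for one image:
# exact tree (1.0), correct node count only (0.3), neither (0.0), node count only (0.3)
advantages = group_relative_advantages([1.0, 0.3, 0.0, 0.3])
```

Completions that beat the group average get a positive advantage and the rest a negative one; when all completions in a group score the same, every advantage is zero and that group contributes no learning signal.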
Framework versions
- TRL: 0.1.0
- Transformers: 4.57.1
- Pytorch: 2.8.0
- Datasets: 4.3.0
- Tokenizers: 0.22.1
Citation
@misc{mohseni2026curvebench,
title={CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves},
author={Amirreza Mohseni and Mona Mohammadi and Morteza Saghafian and Naser Talebizadeh Sardari},
year={2026},
eprint={2605.14068},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.14068},
}
Cite GRPO as:
@article{shao2024deepseekmath,
title = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
year = 2024,
eprint = {arXiv:2402.03300},
}
Cite TRL as:
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}

