How to use AmirMohseni/curvebench-gemma-3-12b with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-3-12b-it")
model = PeftModel.from_pretrained(base_model, "AmirMohseni/curvebench-gemma-3-12b")
This is a LoRA adapter for google/gemma-3-12b-it, fine-tuned with GRPO on CurveBench-Easy using verifiable rewards for topological tree prediction.
It corresponds to model-c in the CurveBench paper, trained with a weighted reward: 0.7 for tree isomorphism plus 0.3 for node count.
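To make the reward weighting concrete, here is a minimal sketch of how such a score could be computed. It is not the training code from the paper: the function names are hypothetical, both terms are assumed to be binary (match or no match), and trees are assumed to be given as (parent, child) edges rooted at region 0. Isomorphism is checked with an AHU-style canonical encoding of the rooted tree.

```python
from collections import defaultdict


def canonical_form(edges, root=0):
    """AHU-style canonical string for the rooted tree given by (parent, child) edges."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    on_path = set()

    def encode(node):
        if node in on_path:
            raise ValueError("cycle detected; edges do not form a tree")
        on_path.add(node)
        # Sorting child encodings makes the result independent of labels and child order.
        encoded = "(" + "".join(sorted(encode(c) for c in children[node])) + ")"
        on_path.discard(node)
        return encoded

    return encode(root)


def node_count(edges, root=0):
    """Number of distinct regions mentioned in the edge list, including the root."""
    nodes = {root}
    for parent, child in edges:
        nodes.update((parent, child))
    return len(nodes)


def sketch_reward(pred_edges, true_edges, w_iso=0.7, w_nodes=0.3):
    """Assumed form of the reward: weighted sum of two binary checks."""
    try:
        iso = float(canonical_form(pred_edges) == canonical_form(true_edges))
    except ValueError:
        iso = 0.0  # a malformed prediction earns no isomorphism credit
    nodes = float(node_count(pred_edges) == node_count(true_edges))
    return w_iso * iso + w_nodes * nodes
```

Under these assumptions, a prediction that relabels the regions but keeps the nesting structure earns the full score, while a prediction with the right number of regions but the wrong nesting earns only the node-count share.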
Start the server with the LoRA adapter loaded on top of the base model:
vllm serve google/gemma-3-12b-it \
    --enable-lora \
    --lora-modules grpo-region-tree=AmirMohseni/curvebench-gemma-3-12b \
    --max-lora-rank 4 \
    --max-model-len 32768 \
    --gpu-memory-utilization 0.90 \
    --dtype bfloat16 \
    --trust-remote-code
Then query it with the OpenAI-compatible API:
from openai import OpenAI
from datasets import load_dataset
import base64
from io import BytesIO
client = OpenAI(base_url="http://localhost:8000/v1", api_key="token")
# Load the first test image from the benchmark
ds = load_dataset("AmirMohseni/CurveBench-Easy", split="total_test")
image = ds[0]["image"]
buf = BytesIO()
image.save(buf, format="PNG")
image_b64 = base64.b64encode(buf.getvalue()).decode()
response = client.chat.completions.create(
    model="grpo-region-tree",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{image_b64}"},
            },
            {
                "type": "text",
                "text": (
                    "The image shows a set of pairwise non-intersecting closed curves drawn on a plane. "
                    "Each curve creates a boundary between an interior region and its surroundings. "
                    "Output the containment tree of the regions as a list of edges in the format: "
                    "[(parent, child), ...] where 0 is the outermost (unbounded) region."
                ),
            },
        ],
    }],
    max_tokens=2048,
)
print(response.choices[0].message.content)
print("Ground truth:", ds[0]["tree"])
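The response is free text, so comparing it with the ground truth usually needs a parsing step. The helper below is a sketch under assumptions: `parse_edges` is a hypothetical name, it assumes the model follows the `[(parent, child), ...]` format requested in the prompt, and it takes the last well-formed list in the text as the answer. The format of the dataset's `tree` field is not shown here, so any conversion of the ground truth is left out.

```python
import ast
import re


def parse_edges(text):
    """Return the last '[(a, b), ...]' list of integer pairs found in text, or None."""
    # Candidate spans: bracketed lists made only of digits, commas, parentheses, whitespace.
    candidates = re.findall(r"\[[\d\s,()]*\]", text)
    for span in reversed(candidates):
        try:
            value = ast.literal_eval(span)
        except (ValueError, SyntaxError):
            continue
        if isinstance(value, list) and all(
            isinstance(edge, tuple)
            and len(edge) == 2
            and all(isinstance(x, int) for x in edge)
            for edge in value
        ):
            return value
    return None
```

For example, `parse_edges(response.choices[0].message.content)` would yield a list of tuples when the output is well formed and `None` otherwise, which makes malformed answers easy to count separately.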
Load the base model and apply the LoRA adapter directly:
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoProcessor
from datasets import load_dataset
import torch
base_id = "google/gemma-3-12b-it"
adapter_id = "AmirMohseni/curvebench-gemma-3-12b"
processor = AutoProcessor.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)
model = PeftModel.from_pretrained(model, adapter_id)
# Load the first test image from the benchmark
ds = load_dataset("AmirMohseni/CurveBench-Easy", split="total_test")
image = ds[0]["image"]
prompt = (
    "The image shows a set of pairwise non-intersecting closed curves drawn on a plane. "
    "Each curve creates a boundary between an interior region and its surroundings. "
    "Output the containment tree of the regions as a list of edges in the format: "
    "[(parent, child), ...] where 0 is the outermost (unbounded) region."
)
inputs = processor(
    text=processor.apply_chat_template(
        [{"role": "user", "content": [{"type": "image"}, {"type": "text", "text": prompt}]}],
        add_generation_prompt=True,
    ),
    images=[image],
    return_tensors="pt",
).to(model.device)
output = model.generate(**inputs, max_new_tokens=2048)
print(processor.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
print("Ground truth:", ds[0]["tree"])
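Before scoring a prediction, it can help to check that the edge list really describes a containment tree: region 0 has no parent, every other region has exactly one parent, and every region is reachable from 0. The function below is an illustrative sketch of that check; `is_containment_tree` is a hypothetical helper, not part of the benchmark code, and it expects edges already parsed into (parent, child) integer pairs.

```python
def is_containment_tree(edges, root=0):
    """Check that (parent, child) edges form a tree rooted at `root`."""
    parent_of = {}
    children = {}
    for parent, child in edges:
        if child == root or child in parent_of:
            # The root has no parent; every other region has exactly one.
            return False
        parent_of[child] = parent
        children.setdefault(parent, []).append(child)

    # Walk down from the root; a valid tree reaches every child region.
    stack, seen = [root], {root}
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            seen.add(child)
            stack.append(child)
    return seen == set(parent_of) | {root}
```

Because each region is allowed only one parent, no cycle can be reached from the root, so the walk always terminates; cycles and detached subtrees show up as unreachable regions.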
[Figures: Training reward; Eval reward]
Trained with GRPO using a fork of TRL with multimodal support: AmirTuring/trl @ curvebench.
Training data: total_train (210 images) from CurveBench-Easy.
Cite CurveBench as:
@misc{mohseni2026curvebench,
title={CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves},
author={Amirreza Mohseni and Mona Mohammadi and Morteza Saghafian and Naser Talebizadeh Sardari},
year={2026},
eprint={2605.14068},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.14068},
}
Cite GRPO as:
@article{shao2024deepseekmath,
title = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
year = 2024,
eprint = {arXiv:2402.03300},
}
Cite TRL as:
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}