QA Stage 2 — COT Reasoning (NL + FOL → Answer + Explanation)
This is a LoRA adapter for Qwen/Qwen2.5-3B-Instruct, fine-tuned on the Logic-Based Educational Queries dataset. Given natural-language (NL) premises, their first-order logic (FOL) translations, and a question, the model reasons step by step and outputs a JSON answer with an explanation.
Pipeline
NL premises + FOL premises + Question
|
QA COT Model (this adapter)
|
{"answer": "B", "explanation": "Premise 1 states..."}
This model is Stage 2 of a two-stage pipeline:
- Stage 1 (FOL Model): NL → FOL (fol-v05-cot-augmented)
- Stage 2 (This Model): NL + FOL + Question → Answer + Explanation
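The Stage 2 input can be assembled from the Stage 1 output with a small helper. The sketch below is illustrative only: `build_user_message` is a hypothetical name, and the section labels are copied from the user message shown in the Usage example, on the assumption that the adapter was trained on that layout.

```python
def build_user_message(nl_premises, fol_premises, question):
    """Format numbered NL premises, FOL premises, and a question as one user turn."""
    nl_block = "\n".join(f"{i}. {p}" for i, p in enumerate(nl_premises, start=1))
    fol_block = "\n".join(f"{i}. {p}" for i, p in enumerate(fol_premises, start=1))
    return (
        f"Premises (NL):\n{nl_block}\n\n"
        f"Premises (FOL):\n{fol_block}\n\n"
        f"Question:\n{question}"
    )


# Placeholder content mirroring the truncated example in the Usage section.
example = build_user_message(
    ["If a student attends lectures..."],
    ["∀x (AttendsLectures(x) → UnderstandsMaterial(x))"],
    "Which conclusion is best supported?\nA. ...\nB. ...",
)
print(example)
```

The returned string would go into the `content` field of the user message before the chat template is applied.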
Accuracy (Dev set, 40 samples)
| Epoch | Raw Accuracy | Avg Latency |
|---|---|---|
| 1 | 20.0% | 20.6s |
| 5 | 52.5% | 13.2s |
| 10 | 47.5% | 17.0s |
| 15 | 50.0% | 11.3s |
| 19 | 55.0% | 10.3s |
| 20 | 55.0% | 10.3s |
| 25 | 55.0% | 11.4s |
| 29 | 57.5% | 11.2s |
| 30 | 52.5% | 11.7s |
Best raw accuracy: 57.5% (23/40) at epoch 29
Adjusted Accuracy (corrected gold labels)
Five samples in the dev set have gold-label errors: in each, the gold explanation contradicts the gold answer label. After manual verification, the model's prediction matches the corrected label on all five.
| Sample | Gold (wrong) | Corrected | Pred | Evidence |
|---|---|---|---|---|
| 4 | Unknown | A | A | Explanation: "Option A is most effective because..." |
| 20 | No | Yes | Yes | Explanation: "So such a programmer exists" |
| 25 | No | Yes | Yes | Explanation: "Therefore, JavaScript supports..." |
| 28 | No | Yes | Yes | Explanation: "Thus, all committee members approve" |
| 29 | No | Yes | Yes | Explanation: "Thus, all faculty members think..." |
Adjusted accuracy: 70.0% (28/40)
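The adjusted figure is consistent with taking the best raw result and counting the five relabeled samples as correct. The arithmetic below is a sketch under that assumption (that all five were scored as errors in the raw count at epoch 29); the card does not state which epoch the adjustment was applied to.

```python
DEV_SAMPLES = 40
RAW_CORRECT = 23        # best raw result (epoch 29)
RELABELED_CORRECT = 5   # samples 4, 20, 25, 28, 29, assumed wrong under the original labels

raw_accuracy = RAW_CORRECT / DEV_SAMPLES
adjusted_accuracy = (RAW_CORRECT + RELABELED_CORRECT) / DEV_SAMPLES

print(f"raw: {raw_accuracy:.1%}, adjusted: {adjusted_accuracy:.1%}")
# raw: 57.5%, adjusted: 70.0%
```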
Training Details
Hyperparameters
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-3B-Instruct |
| Method | LoRA (PEFT) |
| LoRA r | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Trainable params | 3,686,400 (0.12%) |
| Epochs | 30 (early stop patience=7) |
| Batch size | 1 (gradient accumulation=8, effective=8) |
| Learning rate | 2e-5 |
| Warmup ratio | 0.05 |
| Weight decay | 0.01 |
| Precision | INT8 (bitsandbytes) |
| Max seq length | 3500 |
| Seed | 42 |
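The trainable parameter count in the table can be reproduced from the LoRA rank and the attention projection shapes. The shapes below are assumptions taken from the public Qwen2.5-3B configuration (hidden size 2048, 36 layers, 2 key/value heads with head dimension 128) and are not stated in this card; the sketch also assumes no bias terms are trained.

```python
# Assumed Qwen2.5-3B attention shapes (not given in this card).
HIDDEN = 2048
KV_DIM = 2 * 128   # 2 key/value heads x head dimension 128
LAYERS = 36
RANK = 8           # LoRA r from the table above


def lora_params(in_features, out_features, rank=RANK):
    """Parameters added by one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


per_layer = (
    lora_params(HIDDEN, HIDDEN)    # q_proj
    + lora_params(HIDDEN, KV_DIM)  # k_proj
    + lora_params(HIDDEN, KV_DIM)  # v_proj
    + lora_params(HIDDEN, HIDDEN)  # o_proj
)
total = per_layer * LAYERS

print(f"{total:,}")
# 3,686,400
```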
Training Loss Curve
| Epoch | Train Loss | Eval Loss | Token Accuracy |
|---|---|---|---|
| 1 | 1.487 | 1.468 | 69.8% |
| 3 | 0.854 | 0.420 | 89.9% |
| 5 | 0.370 | 0.380 | 90.4% |
| 10 | 0.345 | 0.347 | 91.1% |
| 15 | 0.333 | 0.328 | 91.5% |
| 20 | 0.328 | 0.319 | 91.7% |
| 24 | 0.321 | 0.316 | 91.8% |
| 25 | 0.315 | 0.316 | 91.9% |
Best eval_loss: 0.3155 at step 1944 (epoch 24)
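The reported step number lines up with the batch settings and the training set size. This is a rough check, assuming all 647 training samples are used on a single device and that a partial final accumulation group still counts as one optimizer step.

```python
import math

TRAIN_SAMPLES = 647    # training QA samples listed under Dataset
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 8
BEST_STEP = 1944       # step with the best eval_loss

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
# Assumption: the last, partially filled accumulation group is rounded up to a full step.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
best_epoch = BEST_STEP / steps_per_epoch

print(effective_batch, steps_per_epoch, best_epoch)
# 8 81 24.0
```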
Dataset
- Task: Logic-Based Educational Queries (MCQ + Yes/No)
- Train: 647 QA samples (328 records)
- Dev: 79 QA samples (41 records)
- Test: 81 QA samples (41 records)
- Input: NL premises + FOL premises + Question
- Output: JSON of the form {"answer": "<label>", "explanation": "<reasoning>"}
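Because the output is generated text, downstream code may want to parse it defensively. The helper below is a minimal sketch, not part of the released pipeline: `parse_answer` is a hypothetical name, and it assumes the model emits at most one JSON object, possibly surrounded by extra text.

```python
import json
import re


def parse_answer(raw_text):
    """Return (answer, explanation), or (None, None) if no valid JSON object is found."""
    # Grab everything from the first "{" to the last "}" in the generated text.
    match = re.search(r"\{.*\}", raw_text, flags=re.DOTALL)
    if match is None:
        return None, None
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None, None
    return obj.get("answer"), obj.get("explanation")


answer, explanation = parse_answer('{"answer": "B", "explanation": "Premise 1 states..."}')
print(answer)
# B
```

Returning `(None, None)` on malformed output lets an evaluation script count the sample as unanswered instead of raising.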
Training Infrastructure
- Hardware: NVIDIA L4 (24GB VRAM)
- Platform: Modal Cloud
- Training time: ~6 hours (30 epochs)
Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter from this repository.
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct", device_map="auto")
model = PeftModel.from_pretrained(base_model, "Laplaces-Red-Devils/v01-QA-CoT")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")

messages = [
    {"role": "system", "content": "You are a logic-based educational QA system..."},
    {"role": "user", "content": "Premises (NL):\n1. If a student attends lectures...\n\nPremises (FOL):\n1. ∀x (AttendsLectures(x) → UnderstandsMaterial(x))\n\nQuestion:\nWhich conclusion is best supported?\nA. ...\nB. ..."},
]

# Render the chat template as text, then tokenize it for generation.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Greedy decoding; print only the newly generated tokens, not the prompt.
output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# Example output shape: {"answer": "B", "explanation": "Premise 1 states..."}
```
Framework Versions
- PEFT: 0.19.1
- Transformers: 4.52.4
- TRL: 0.18.1
- PyTorch: 2.10.0+cu128
- BitsAndBytes: 0.46.0
Team
Laplace's Red Devils — EXACT 2026 Competition