---
license: apache-2.0
language:
- en
base_model: openai-community/gpt2
tags:
- claritas
- data-selection
- mathematical-reasoning
- training-dynamics
- gpt2
datasets:
- DKYoon/SlimPajama-6B
- openbmb/UltraData-Math
---
# Claritas-GPT2: Trajectory-to-Trait Optimized Reasoning Model
**[[Paper](https://arxiv.org/abs/your-arxiv-id)]**
> **Claritas** (Latin for "clarity") is a training dynamics framework that optimizes Large Language Models by analyzing gradient trajectories. This repository contains the **model weights** for the lightweight implementation described in the paper *"Claritas: Trajectory-to-Trait Framework for Complex Reasoning Training Optimization in Large Language Models"*.
## 📝 Model Overview
This model serves as a reproducible validation of the **Claritas Dynamic Data Selection Algorithm**. Unlike standard Supervised Fine-Tuning (SFT), which treats all training samples equally, Claritas dynamically selects high-value samples whose gradients do not conflict with one another.
**Key Implementation Details:**
1.  **Gradient Spectral Fingerprint (GSF)**: High-dimensional gradients were compressed into 128-dimensional signatures to track sample influence.
2.  **Counterfactual Trajectory Contrast (CTC)**: Samples were scored based on their contribution to mathematical reasoning capabilities.
3.  **Dynamic Conflict Graph**: The training process identified and excluded samples with opposing gradient directions to maximize learning efficiency.
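The three components above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: a Gaussian random projection stands in for the GSF compression, cosine similarity between fingerprints stands in for the gradient-direction comparison, and a greedy pass stands in for conflict-graph pruning. The function names and the `scores` array (a placeholder for CTC scores) are hypothetical; the values 128 and -0.5 are taken from the training specifications in this card.

```python
import numpy as np

def spectral_fingerprint(grads, dim=128, seed=0):
    """Compress per-sample gradients of shape (n, d) to (n, dim).

    Assumption: a Gaussian random projection is used here as a stand-in
    for the paper's GSF compression, which this card does not specify.
    """
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((grads.shape[1], dim)) / np.sqrt(dim)
    return grads @ proj

def conflict_edges(fingerprints, threshold=-0.5):
    """Boolean adjacency matrix: True where cosine similarity < threshold."""
    norms = np.linalg.norm(fingerprints, axis=1, keepdims=True)
    unit = fingerprints / np.clip(norms, 1e-12, None)
    adj = (unit @ unit.T) < threshold
    np.fill_diagonal(adj, False)  # a sample never conflicts with itself
    return adj

def select_conflict_free(scores, adj):
    """Greedy selection: visit samples by descending score and keep a
    sample only if it conflicts with none of the samples already kept."""
    kept = np.zeros(len(scores), dtype=bool)
    for i in np.argsort(-np.asarray(scores)):
        if not (adj[i] & kept).any():
            kept[i] = True
    return [int(i) for i in np.flatnonzero(kept)]

# Toy example: sample 1 has exactly the opposite gradient of sample 0,
# and sample 2 is orthogonal to both.
rng = np.random.default_rng(1)
v = rng.standard_normal(256)
w = rng.standard_normal(256)
w -= (w @ v) / (v @ v) * v
fingerprints = spectral_fingerprint(np.stack([v, -v, w]))
adjacency = conflict_edges(fingerprints)
selected = select_conflict_free([0.9, 0.5, 0.7], adjacency)
print(selected)
```

In this toy setup the opposing pair cannot both be selected, so the lower-scored of the two is dropped while the orthogonal sample is kept.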
## 📊 Performance Highlights
While this release uses a GPT-2 backbone for accessibility, the paper validates the underlying methodology on LLaMA-2-7B/70B. The results reported there are:
| Method | Data Usage | GSM8K | MATH | BBH |
| :--- | :---: | :---: | :---: | :---: |
| Standard SFT | 100% | 52.4 | 12.1 | 38.5 |
| Random Selection | 60% | 48.1 | 10.3 | 35.2 |
| **Claritas (Ours)** | **60%** | **55.8** | **14.5** | **41.2** |
> **Key Result**: The Claritas framework cuts training data by **40%** while boosting accuracy over full-data SFT by about **+2.8 points** on average across GSM8K, MATH, and BBH (from +2.4 on MATH to +3.4 on GSM8K).
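
The per-benchmark gains over full-data SFT can be recomputed directly from the table rows. The snippet below is only an arithmetic check on the numbers shown above; the variable names are illustrative.

```python
# Scores copied from the table above
standard_sft = {"GSM8K": 52.4, "MATH": 12.1, "BBH": 38.5}
claritas = {"GSM8K": 55.8, "MATH": 14.5, "BBH": 41.2}

# Absolute gain of Claritas (60% of the data) over Standard SFT (100%)
gains = {name: claritas[name] - standard_sft[name] for name in standard_sft}
mean_gain = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}")
print(f"mean: {mean_gain:+.2f}")  # prints "mean: +2.83"
```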
## 🚀 How to Use
Load the model for inference with the Hugging Face `transformers` library.
### Installation
```bash
pip install transformers torch
```
### Inference Example
```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# Load the Claritas-optimized model
model_name = "Muki182/claritas-gpt2" # Replace with your repo ID
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
# Set the pad token only if the tokenizer does not already define one
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
# Prepare input
prompt = "Question: If I have 3 apples and eat 1, how many are left?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
# Generate greedily; **inputs forwards the attention mask along with input_ids
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## ⚙️ Training Specifications
This model was trained using the Claritas algorithm with the following configuration:
*   **Base Model**: `gpt2`
*   **Training Data**: GSM8K (Mathematical Reasoning)
*   **Algorithm**: Claritas (GSF + CTC + Dynamic Conflict Graph)
*   **Spectral Dimension**: 128
*   **Conflict Threshold**: -0.5
*   **CTC Threshold**: 0.2
*   **Optimizer**: AdamW
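For readers who want to carry these settings into their own experiments, they can be collected in a small configuration object. This is a sketch only: the `ClaritasConfig` class and the `keep_sample` rule are hypothetical and not part of any released code. In particular, treating the CTC threshold as a minimum score and the conflict threshold as a minimum allowed cosine similarity is an assumption about how the two values are applied.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClaritasConfig:
    """Hyperparameters listed in this model card (class name is hypothetical)."""
    base_model: str = "gpt2"
    spectral_dim: int = 128
    conflict_threshold: float = -0.5
    ctc_threshold: float = 0.2
    optimizer: str = "AdamW"

def keep_sample(ctc_score: float, min_cosine: float, cfg: ClaritasConfig) -> bool:
    """Assumed selection rule: keep a sample when its CTC score reaches the
    CTC threshold and its lowest cosine similarity to the already selected
    samples does not fall below the conflict threshold."""
    return ctc_score >= cfg.ctc_threshold and min_cosine >= cfg.conflict_threshold

cfg = ClaritasConfig()
print(keep_sample(0.3, -0.1, cfg))  # high score, no strong conflict
print(keep_sample(0.1, 0.5, cfg))   # score below the CTC threshold
print(keep_sample(0.3, -0.8, cfg))  # strongly opposed to a selected sample
```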
## 📖 Citation
If you use this model or reference the Claritas framework in your research, please cite our paper:
```bibtex
@article{feng2026claritas,
  title={Claritas: Trajectory-to-Trait Framework for Complex Reasoning Training Optimization in Large Language Models},
  author={Feng, Junjie},
  journal={arXiv preprint arXiv:your-arxiv-id},
  year={2026}
}
```
## License
This model is licensed under the **Apache License 2.0**.