---
license: other
library_name: transformers
tags:
- reasoning
- context-learning
- pretraining
- synthetic-data
- transformers
---

# 0.99zoo_op2-20+0.01teacher_op2

99% zoo op2-20, 1% teacher op2 pretraining mixture.

This directory contains the final op2-only pretraining checkpoint and corresponding final RL checkpoints.

This is a context B pretraining checkpoint where the teacher component uses only op2.

## Citation

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
      title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
      author={Charlie Zhang and Graham Neubig and Xiang Yue},
      year={2025},
      eprint={2512.07783},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.07783},
}
```
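
## Mixture ratio sketch

To make the 99% / 1% split concrete, here is a minimal Python sketch of what such a mixture could look like when sampling data sources. It is an illustration only: the source labels `zoo_op2-20` and `teacher_op2` are taken from the title, while per-example independent sampling, the function names, and the seed are assumptions and may not match how the actual pretraining data was built.

```python
import random
from collections import Counter

# Weights read off the model card title; the labels are used here as
# hypothetical source names, not as real dataset identifiers.
MIXTURE = {"zoo_op2-20": 0.99, "teacher_op2": 0.01}


def expected_counts(n, mixture=MIXTURE):
    """Expected number of examples per source for a budget of n examples."""
    return {name: n * weight for name, weight in mixture.items()}


def sample_sources(n, mixture=MIXTURE, seed=0):
    """Draw n source labels independently according to the mixture weights.

    Assumption: sampling is per example and independent; the real pipeline
    may mix by tokens or by fixed shards instead.
    """
    rng = random.Random(seed)
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    print(expected_counts(100_000))
    print(Counter(sample_sources(100_000)))
```

Under these assumptions, a budget of 100,000 examples would contain about 1,000 teacher op2 examples on average, with the remainder drawn from zoo op2-20.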