ssu-project
/

OLMo-2-1124-7B-Instruct-ig-random

Model card Files Files and versions

OLMo-2-1124-7B-Instruct-ig-random / README.md

atsuki-yamaguchi's picture

atsuki-yamaguchi

Upload README.md with huggingface_hub

6051e13 verified 7 months ago

|

History Blame Contribute Delete

1.51 kB


	---
	license: apache-2.0
	datasets:
	- allenai/MADLAD-400
	language:
	- ig
	base_model:
	- allenai/OLMo-2-1124-7B-Instruct
	---
	# OLMo 2 1124 7B Instruct for Igbo: SSU-Rand

	This model is built on top of OLMo 2 1124 7B Instruct adapted for Igbo using 200M target language tokens sampled from MADLAD-400. The model is adapted using the SSU-Rand approach (i.e., randomly selecting parameters to update by column).

	## Model Description

	- Language: Igbo
	- License: Apache 2.0
	- Fine-tuned from model: [allenai/OLMo-2-1124-7B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct)


	## Model Sources

	- Repository: https://github.com/gucci-j/ssu
	- Paper: https://arxiv.org/abs/2512.04844


	## How to Get Started with the Model
	Use the code below to get started with the model.
	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model = AutoModelForCausalLM.from_pretrained(
	"ssu-project/OLMo-2-1124-7B-Instruct-ig-random"
	)
	tokenizer = AutoTokenizer.from_pretrained(
	"ssu-project/OLMo-2-1124-7B-Instruct-ig-random"
	)
	```


	## Citation
	```
	@misc{yamaguchi2025mitigatingcatastrophicforgettingtarget,
	title={Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates},
	author={Atsuki Yamaguchi and Terufumi Morishita and Aline Villavicencio and Nikolaos Aletras},
	year={2025},
	eprint={2512.04844},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2512.04844},
	}
	```