Model Card: NLLB-200-distilled-600M Fine-tuned for Kikuyu-English Translation
Model Overview
Model Name: nickdee96/nllb-200-600M-kikuyu-english
Base Model: facebook/nllb-200-distilled-600M
Task: Neural Machine Translation (Kikuyu → English)
Language Pair: Kikuyu (kik_Latn) → English (eng_Latn)
Model Type: Sequence-to-Sequence Transformer
License: Same as base model (CC-BY-NC)
Model Description
This model is a fine-tuned version of Facebook's NLLB-200-distilled-600M (No Language Left Behind) specifically optimized for Kikuyu to English translation. The model leverages the multilingual capabilities of NLLB-200 and has been fine-tuned on a curated dataset of Kikuyu-English parallel text pairs to improve translation quality for this under-resourced language pair.
Key Features
- Specialized for Kikuyu: Fine-tuned specifically for Kikuyu (Gĩkũyũ) language translation
- High-quality training data: Trained on 17,514 carefully filtered translation pairs
- Optimized performance: Achieved low validation loss (0.0375) after fine-tuning
- Production ready: Includes both HuggingFace Transformers format and optimized inference setup
Training Details
Dataset
- Source: Kikuyu translation pairs dataset from LDRI language project
- Total pairs: 17,514 valid translation pairs
- Training split: 15,762 pairs (90%)
- Validation split: 1,752 pairs (10%)
- Data filtering: Removed short texts (<3 characters) and common non-linguistic responses
- Domain: Conversational and agricultural content
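
The filtering and 90/10 split described above can be sketched as follows. This is a minimal illustration, not the author's preprocessing script: the `NON_LINGUISTIC` set, the `clean_pairs` and `split_pairs` helper names, and the fixed shuffle seed are all assumptions made for the example.

```python
import random

# Hypothetical examples of "common non-linguistic responses"; the actual
# list used for this model is not published in the card.
NON_LINGUISTIC = {"ok", "mm", "hmm", "n/a"}

MIN_CHARS = 3  # the card states that texts shorter than 3 characters were removed


def clean_pairs(pairs):
    """Keep (kikuyu, english) pairs where both sides pass the filters."""
    kept = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if len(src) < MIN_CHARS or len(tgt) < MIN_CHARS:
            continue
        if src.lower() in NON_LINGUISTIC or tgt.lower() in NON_LINGUISTIC:
            continue
        kept.append((src, tgt))
    return kept


def split_pairs(pairs, train_fraction=0.9, seed=42):
    """Shuffle deterministically, then cut into train and validation lists."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


raw = [
    ("Ũrendia atĩa", "How you are selling"),
    ("ok", "ok"),    # dropped: shorter than 3 characters
    ("hmm", "hmm"),  # dropped: listed as a non-linguistic response
]
cleaned = clean_pairs(raw)
train, val = split_pairs(cleaned)
```

With 17,514 pairs, `int(17514 * 0.9)` gives 15,762 training pairs and leaves 1,752 for validation, which matches the split reported above.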
Training Configuration
- Base model: facebook/nllb-200-distilled-600M
- Training epochs: 3
- Batch size: 4 per device
- Gradient accumulation steps: 8 (effective batch size: 32)
- Learning rate: 5e-5
- Scheduler: Cosine annealing
- Warmup steps: 100
- Max sequence length: 512 tokens
- Optimization: FP16 precision for memory efficiency
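
The configuration above fixes the number of optimizer updates and the shape of the learning-rate curve. The sketch below works through that arithmetic in plain Python. It assumes a single GPU, assumes the last partial batch of each epoch still triggers an update, and reimplements a linear warmup followed by a half-cycle cosine decay to zero, which is the shape of the Transformers cosine schedule. The `lr_at` helper is a hypothetical name, and the exact step count a trainer reports may differ by a few steps depending on how it rounds.

```python
import math

TRAIN_PAIRS = 15_762
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 8
EPOCHS = 3
BASE_LR = 5e-5
WARMUP_STEPS = 100

# Assumes a single GPU, so the effective batch is 4 * 8 = 32.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
# Assumes the final partial batch of each epoch still triggers an update.
steps_per_epoch = math.ceil(TRAIN_PAIRS / effective_batch)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step):
    """Learning rate after `step` optimizer updates (warmup, then cosine decay)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (total_steps - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps)
for step in (0, 50, 100, total_steps):
    print(step, lr_at(step))
```

Under these assumptions the run takes about 493 updates per epoch and about 1,479 in total, so the 100 warmup steps cover roughly the first 7% of training.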
Training Infrastructure
- Hardware: CUDA-enabled GPU
- Framework: HuggingFace Transformers 4.x
- Training time: ~1 hour 14 minutes (4,454.86 seconds)
- Memory optimization: FP16, gradient accumulation, device mapping
Performance
Training Metrics
- Final training loss: 1.1482
- Final validation loss: 0.0375
- Training progression: Consistent loss reduction from 3.64 to 0.036 over training steps
- Convergence: Model showed good convergence with stable validation loss
Translation Examples
| Kikuyu | Model Translation | Reference |
|---|---|---|
| Ũrendia atĩa | How do you sell? | How you are selling |
| Nĩgetha ũkamenya kana hĩndĩ ĩrĩa ndahanda kĩmera gĩkĩ rĩ | So that you can know when I planted this summer | So that you can know when I plant in this season |
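
One crude way to put a number on how close these outputs are to their references is unigram-overlap F1. The sketch below is only an illustration: the card reports losses rather than translation metrics, the `tokens` and `unigram_f1` helpers are hypothetical names, and an established metric such as BLEU or chrF would be a better choice for a real evaluation.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase and split into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def unigram_f1(hypothesis, reference):
    """Harmonic mean of unigram precision and recall, with clipped counts."""
    hyp, ref = Counter(tokens(hypothesis)), Counter(tokens(reference))
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


examples = [
    ("How do you sell?", "How you are selling"),
    (
        "So that you can know when I planted this summer",
        "So that you can know when I plant in this season",
    ),
]
for hypothesis, reference in examples:
    print(f"{unigram_f1(hypothesis, reference):.2f}  {hypothesis}")
```

The two rows score 0.50 and 0.76. The first score is low even though the model output is fluent English, which shows how little a word-overlap score says about adequacy on short sentences.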
Model Capabilities
- Strong semantic understanding: Captures meaning accurately even with complex sentence structures
- Context awareness: Handles conversational contexts and agricultural terminology
- Code-switching handling: Can process mixed language inputs with [cs] markers
- Robust to variations: Performs well on different sentence lengths and structures
Usage
HuggingFace Transformers
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("nickdee96/nllb-200-600M-kikuyu-english")
model = AutoModelForSeq2SeqLM.from_pretrained("nickdee96/nllb-200-600M-kikuyu-english")

# Create translation pipeline
translator = pipeline(
    "translation",
    model=model,
    tokenizer=tokenizer,
    src_lang="kik_Latn",
    tgt_lang="eng_Latn",
)

# Translate Kikuyu to English
result = translator("Ũrendia atĩa")
print(result[0]['translation_text'])
```
Direct Model Usage
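```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "nickdee96/nllb-200-600M-kikuyu-english"
# Set the source language so the tokenizer adds the kik_Latn prefix token
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="kik_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Tokenize input
inputs = tokenizer("Ũrendia atĩa", return_tensors="pt")

# Generate translation, forcing English as the first decoder token
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=512,
)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
```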
Limitations and Considerations
Known Limitations
- Domain specificity: Primarily trained on conversational and agricultural content
- Data size: Limited by the size of available Kikuyu-English parallel data
- Dialectal variations: May not capture all Kikuyu dialectal variations
- Formal register: May be less effective on highly formal or technical texts
- One-way optimization: Primarily optimized for Kikuyu→English direction
Ethical Considerations
- Language preservation: Contributes to digital language preservation for Kikuyu
- Cultural sensitivity: Trained on authentic conversational data respecting cultural context
- Bias mitigation: Filtered training data to remove inappropriate content
- Fair use: Intended for research and educational purposes
Recommended Use Cases
- ✅ Educational tools: Language learning applications
- ✅ Research: Computational linguistics and MT research
- ✅ Documentation: Helping preserve and translate Kikuyu texts
- ✅ Communication: Assisting with basic Kikuyu-English communication
Not Recommended For
- ❌ Critical translations: Legal, medical, or safety-critical contexts without human review
- ❌ Real-time interpretation: High-stakes real-time communication
- ❌ Commercial applications: Without proper validation and testing
- ❌ Formal documents: Official government or business documents without review
Technical Specifications
Model Architecture
- Architecture: M2M100ForConditionalGeneration (NLLB variant)
- Parameters: ~600M (distilled version)
- Encoder layers: 12
- Decoder layers: 12
- Hidden size: 1024
- Attention heads: 16
- Vocabulary size: Model-specific NLLB vocabulary
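
The parameter count gives a rough lower bound on inference memory. The back-of-the-envelope sketch below takes "~600M" at face value and counts only the weights, so activations, the generation cache, and framework overhead are excluded. The `weight_gb` helper is a hypothetical name.

```python
PARAMS = 600_000_000  # "~600M" from the card; the exact count may differ

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_gb(dtype):
    """Memory for the weights alone, in decimal gigabytes."""
    return PARAMS * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_gb(dtype):.1f} GB")
```

The fp32 estimate of about 2.4 GB is consistent with the ~2.5GB inference requirement quoted in this card once overhead is added. Loading the weights in half precision would roughly halve that footprint.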
Input/Output Specifications
- Input language: Kikuyu (kik_Latn)
- Output language: English (eng_Latn)
- Max input length: 512 tokens
- Tokenization: SentencePiece-based (NLLB tokenizer)
- Special tokens: Language-specific prefix tokens
Performance Benchmarks
- Inference speed: GPU-optimized for real-time translation
- Memory requirements: ~2.5GB GPU memory for inference
- Batch processing: Supports batched inference for efficiency
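
Batched inference can be as simple as feeding the pipeline fixed-size chunks. The helper below is a sketch: `translate_in_batches` and `fake_translator` are hypothetical names, and the stand-in translator only mimics the pipeline's output shape (a list of dicts with a `translation_text` key) so that the example runs without downloading the model.

```python
def translate_in_batches(translator, texts, batch_size=8):
    """Call `translator` on chunks of `texts` and collect the translations.

    `translator` is any callable that maps a list of strings to a list of
    dicts with a "translation_text" key, as a Transformers translation
    pipeline does.
    """
    translations = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        for item in translator(chunk):
            translations.append(item["translation_text"])
    return translations


def fake_translator(batch):
    """Stand-in for the real pipeline: echoes each input in upper case."""
    return [{"translation_text": text.upper()} for text in batch]


sentences = ["a", "b", "c", "d", "e"]
print(translate_in_batches(fake_translator, sentences, batch_size=2))
```

With the real model, pass the `translator` pipeline object from the Usage section in place of `fake_translator`. A Transformers pipeline also accepts a `batch_size` argument directly when it is called with a list of inputs.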
Citation and Acknowledgments
Citation
If you use this model in your research, please cite:
```bibtex
@misc{kikuyu-english-nllb-2024,
  title={Fine-tuned NLLB-200 for Kikuyu-English Translation},
  author={nickdee96},
  year={2024},
  url={https://huggingface.co/nickdee96/nllb-200-600M-kikuyu-english},
  note={Fine-tuned from facebook/nllb-200-distilled-600M}
}
```
Acknowledgments
- Base model: Facebook AI Research - NLLB-200 team
- Training framework: HuggingFace Transformers
- Dataset: LDRI Language Project
- Community: Kikuyu language speakers and researchers
Related Work
- NLLB-200: No Language Left Behind: Scaling Human-Centered Machine Translation
- Original NLLB models: Available on HuggingFace Model Hub
- Multilingual NMT: Research in low-resource language translation
Model Versions and Updates
Version History
- v1.0 (Current): Initial fine-tuned release
  - Training dataset: 17,514 pairs
  - Base model: facebook/nllb-200-distilled-600M
  - Training epochs: 3
  - Validation loss: 0.0375
Future Updates
- Potential expansion with additional training data
- Bidirectional translation capability (English→Kikuyu)
- Domain-specific fine-tuning variants
- Performance optimizations and model compression
Contact and Support
For questions, issues, or collaboration opportunities:
- Model creator: nickdee96
- Repository: ldri-language
- Issues: Please report issues through the repository's issue tracker
This model card was generated on September 26, 2025, and reflects the current state of the model at the time of publication.