---
language:
- ja
library_name: transformers
tags:
- myllm
- causal-lm
- custom-code
- safetensors
pipeline_tag: text-generation
---

# lambda-160m

lambda-160m is an experimental Japanese causal language model created with a custom `myllm` decoder-only Transformer implementation.

All training code is publicly available at [KeisukeMiyamoto1324/myllm](https://github.com/KeisukeMiyamoto1324/myllm).

## Model Details

| Item | Value |
|---|---:|
| Parameters | 164.5M |
| Architecture | Decoder-only Transformer |
| Model type | `myllm` |
| Context length | 1024 tokens |
| Tokenizer | Byte-level BPE |
| Vocabulary size | 65,536 |
| Layers | 16 |
| Hidden size | 768 |
| Attention heads | 12 |
| FFN size | 3,072 |

## Training Data

The model was pretrained on a Japanese text mixture.

| Dataset | Notes |
|---|---|
| `hotchpotch/fineweb-2-edu-japanese` | Japanese web text, Wikipedia domains excluded |
| `MK0727/CleanedWiki-jp` | Japanese Wikipedia-style text, ramped from 50% training progress |

## Training Setup

This model was trained on a single RTX PRO 6000.

| Item | Value |
|---|---:|
| Optimizer | AdamW |
| Learning rate | 2e-4 |
| LR schedule | Warmup cosine |
| Warmup steps | 2,000 |
| Minimum LR ratio | 0.1 |
| Batch size | 96 |
| Max steps | 40,960 |

## Usage

This repository uses custom Transformers code, so `trust_remote_code=True` is required.

```python
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

repo_id = "MK0727/lambda-160m"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("日本の首都は、", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Limitations

This model is not instruction-tuned or safety-aligned. It may generate incorrect, biased, unsafe, or low-quality text.
The model was trained on a limited Japanese corpus mixture and has not been evaluated on standard benchmarks.
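
## Appendix: Learning Rate Schedule Sketch

The Training Setup values (peak learning rate 2e-4, 2,000 warmup steps, minimum LR ratio 0.1, 40,960 max steps) describe a warmup-cosine schedule. The sketch below shows one common way such a schedule is computed. It is an illustration only: the linear warmup from zero, the exact cosine form, and the names `lr_at`, `PEAK_LR`, `WARMUP_STEPS`, `MAX_STEPS`, and `MIN_LR_RATIO` are assumptions, not taken from the `myllm` training code, which may differ in detail.

```python
import math

# Values from the Training Setup table.
PEAK_LR = 2e-4
WARMUP_STEPS = 2_000
MAX_STEPS = 40_960
MIN_LR_RATIO = 0.1


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        # Assumption: linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to at most 1.
    progress = min(1.0, (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS))
    # Cosine factor goes from 1 (start of decay) down to 0 (end of training).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    min_lr = PEAK_LR * MIN_LR_RATIO
    return min_lr + (PEAK_LR - min_lr) * cosine


if __name__ == "__main__":
    # Print the schedule at a few representative steps.
    for step in (0, 1_000, 2_000, 20_000, 40_960):
        print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```

Under these assumptions, the rate reaches half of the peak midway through warmup, hits the peak at step 2,000, and decays to 10% of the peak by step 40,960.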