Text Generation
Transformers
PyTorch
Japanese
japanese
llama
from-scratch
solo-developer
keigo
family-conversation
Initial upload of HinoMoto-100M-v4
- README.md +86 -0
- config.json +13 -0
- generation_config.json +6 -0
- manifest.json +12 -0
- pytorch_model.bin +3 -0
- tokenizer.json +0 -0
README.md
ADDED
@@ -0,0 +1,86 @@
# Model Card: HinoMoto-100M-v4

## Model Details
- **Name**: HinoMoto-100M-smoke-v4
- **Architecture**: Llama-style decoder-only transformer
  - 12 layers, 8 heads, d_model 512, d_ff 1408, RoPE base 10000
  - Tied input/output embeddings
  - SwiGLU FFN, RMSNorm, no bias
- **Parameters**: 43.4M (excl. embed) / 100M total
- **Vocabulary**: 9,506 (byte-level BPE, custom-trained on balanced corpus)
- **Context length**: 512 (training) / extensible via RoPE
- **License**: Apache-2.0
- **Date**: 2026-04-26
- **Developed by**: HinoMoto Project (solo dev)
- **Repository**: https://github.com/your-org/hinomoto (TBA)

## Intended Use
- **Primary**: Research on Japanese family-conversation LMs
- **Secondary**: Validation of the tokenizer, corpus, and training infrastructure
- **Out of scope**: Production deployment, factual QA, code generation

## Training
- **Hardware**: 1× RTX 3090 (24 GB)
- **Compute**: ~13 min for 10k steps
- **Throughput**: 16k tok/s
- **Optimizer**: AdamW (β₁=0.9, β₂=0.95), weight_decay 0.1, grad_clip 1.0
- **LR schedule**: 200 warmup steps, then cosine decay (min_lr_ratio 0.1), peak 3e-4
- **Effective batch**: micro-batch 2 × grad_accum 4 = 8 sequences × 512 tokens = 4,096 tokens/step
- **Total tokens seen**: ~41M
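The LR schedule listed above can be sketched as a function of the step index. This is a minimal sketch, assuming linear warmup (the card only says "warmup 200") and a decay horizon of 10,000 steps taken from the Compute line; `lr_at` and its argument names are illustrative and are not taken from the project's code.

```python
import math

def lr_at(step, peak_lr=3e-4, warmup_steps=200, total_steps=10_000, min_lr_ratio=0.1):
    """Learning rate at a 0-indexed optimizer step.

    Assumes linear warmup to peak_lr, then cosine decay down to
    peak_lr * min_lr_ratio at total_steps.
    """
    if step < warmup_steps:
        # Linear ramp; step 0 already gets a small non-zero rate.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    min_lr = peak_lr * min_lr_ratio
    return min_lr + (peak_lr - min_lr) * cosine

for s in (0, 199, 200, 5_100, 10_000):
    print(s, f"{lr_at(s):.2e}")
```

With these assumptions the rate peaks at 3e-4 at the end of warmup and bottoms out at 3e-5, which is the 0.1 floor named in the card.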

## Training Data
- **Corpus**: balanced v6 (75 MB)
  - Diet records 53% (国会会議録, public domain)
  - Aozora Bunko 25% (青空文庫, copyright expired)
  - SFT data 21% (Dolly-15k-ja + own family conversations)
- **Tokenizer**: tokenizer_v3_32k_clean.json (vocab 9,506)
- **Decontamination**: 8-gram overlap with benchmark questions (mean 12-16%, max 81%; see `bench/HinoMotoBench-ja/decontamination_report.json`)
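The decontamination figure above is an n-gram overlap rate. The card does not say whether the 8-grams are over characters or tokens, so the sketch below assumes character 8-grams with whitespace removed, and defines the rate as the fraction of a benchmark question's 8-grams that also occur in the training corpus. The function names and the toy sentences are hypothetical.

```python
def char_ngrams(text, n=8):
    """Set of all length-n character windows in text (whitespace removed)."""
    compact = "".join(text.split())
    return {compact[i:i + n] for i in range(len(compact) - n + 1)}

def overlap_rate(question, corpus_ngrams, n=8):
    """Fraction of the question's n-grams that also occur in the corpus."""
    grams = char_ngrams(question, n)
    if not grams:
        return 0.0  # question shorter than n characters
    return len(grams & corpus_ngrams) / len(grams)

# Toy corpus and questions, for illustration only.
corpus_docs = [
    "今日は家族みんなで夕ご飯を食べました。",
    "お疲れ様です。本日の会議は終了しました。",
]
corpus_ngrams = set()
for doc in corpus_docs:
    corpus_ngrams |= char_ngrams(doc)

questions = ["今日は家族みんなで何を食べましたか。", "明日の天気を教えてください。"]
rates = [overlap_rate(q, corpus_ngrams) for q in questions]
print(f"mean={sum(rates) / len(rates):.2%} max={max(rates):.2%}")
```

Reporting both the mean and the maximum, as the card does, separates broad low-level overlap from a few heavily contaminated questions.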

## Evaluation (HinoMotoBench-ja v0.5)
| Axis | Score | Note |
|---|---|---|
| family (n=110) | 6.36/12 (53%) | mean total |
| keigo (n=70) | 17% | pass rate |
| silence (n=50) | 16% | pass rate |
| degenerate (family) | 5.5% | rate of broken outputs |

With stop-token cleaning (evaluator v3): family 6.75/12 (56%), degenerate 0%.

## Limitations
- **Scale**: 100M is a research smoke-test model, not a production model
- **Domain bias**: Diet record corpus introduces formal/political vocabulary bias
- **Overfit risk**: 41M tokens for a nominal 100M parameters is well below the Chinchilla-optimal budget (roughly 20 tokens per parameter)
- **No alignment**: No DPO/RLHF; raw pretraining outputs only
- **Not safety-tuned**: Outputs may include unfiltered Diet language
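The token budget behind the overfit-risk item can be checked with a few lines of arithmetic. This sketch uses the step count and batch shape from the Training section. The 20 tokens-per-parameter figure is the commonly cited rule of thumb from the Chinchilla scaling results and is not a number from this card. The variable names are illustrative.

```python
# Figures taken from the Training section of the card.
steps = 10_000
micro_batch, grad_accum, seq_len = 2, 4, 512

tokens_per_step = micro_batch * grad_accum * seq_len  # 4,096
tokens_seen = steps * tokens_per_step                 # 40,960,000

# Nominal size from the model name; rule-of-thumb ratio is an assumption.
nominal_params = 100_000_000
tokens_per_param = tokens_seen / nominal_params
chinchilla_tokens = 20 * nominal_params

print(f"tokens seen:        {tokens_seen:,}")
print(f"tokens per param:   {tokens_per_param:.2f}")
print(f"rule-of-thumb goal: {chinchilla_tokens:,}")
```

At roughly 0.4 tokens per nominal parameter, the run sits about 50× short of the rule-of-thumb budget, which is consistent with the card describing this checkpoint as a smoke model.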

## Bias and Risks
- Diet records reflect predominantly male, formal, politically diverse but Japan-centric viewpoints
- Aozora skews toward Meiji-Showa era literature (modern usage gaps)
- SFT family data is small (~500 samples); intimate-context outputs may be less reliable

## Reproducibility
- **Seed**: 0 (single-seed; 3-seed planned)
- **Manifest**: `artifacts/smoke_100m_v4_10000step/corpus_manifest.json`
- **Config**: `configs/main_run_100m_v3.json`
- **Code commit**: TBA after 0.2.0 tag

## Citation
```bibtex
@misc{hinomoto2026,
  title = {HinoMoto: A Solo-Built Japanese Family-Conversation Language Model},
  author = {Project HinoMoto},
  year = {2026},
  note = {v0.2.0, https://github.com/your-org/hinomoto}
}
```

## Stability Features (opt-in)
- `--ema-decay 0.999`: EMA shadow weights, periodic + final auto-save
- `--spike-detect --spike-window 30 --spike-ratio 4.0 --spike-min-avg-floor 1.5`: Loss spike detection with auto-halt
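The two stability features above can be illustrated with a small sketch. The project's implementation is not shown in this commit, so everything below is an assumption: weights are plain floats instead of tensors, a spike is a loss above `ratio` times the mean of the last `window` losses, and the floor is read as a lower bound on that mean. `ema_update` and `SpikeDetector` are hypothetical names.

```python
from collections import deque

def ema_update(shadow, params, decay=0.999):
    """In-place EMA: shadow <- decay * shadow + (1 - decay) * params."""
    for name, value in params.items():
        shadow[name] = decay * shadow[name] + (1.0 - decay) * value

class SpikeDetector:
    """Flags a loss that exceeds ratio x the recent average loss.

    The floor keeps the threshold from collapsing when the average is small;
    this reading of --spike-min-avg-floor is an assumption.
    """

    def __init__(self, window=30, ratio=4.0, min_avg_floor=1.5):
        self.losses = deque(maxlen=window)
        self.ratio = ratio
        self.min_avg_floor = min_avg_floor

    def update(self, loss):
        """Return True if loss is a spike; only non-spike losses enter the window."""
        if len(self.losses) == self.losses.maxlen:
            avg = max(sum(self.losses) / len(self.losses), self.min_avg_floor)
            if loss > self.ratio * avg:
                return True
        self.losses.append(loss)
        return False

# One EMA step on a single scalar "weight".
shadow = {"w": 1.0}
ema_update(shadow, {"w": 0.0})

# 30 steady losses fill the window, then a mild bump and a large jump.
detector = SpikeDetector()
history = [3.0] * 30 + [3.1, 14.0]
flags = [detector.update(x) for x in history]
print(shadow["w"], flags[-2:])
```

A training loop would halt (or roll back to the last checkpoint) when `update` returns `True`; excluding the spiking loss from the window keeps one bad step from raising the threshold for the next.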

## Acknowledgements
- 国立国会図書館 (Diet records API, public domain)
- 青空文庫 (Aozora Bunko volunteers)
- kunishou/databricks-dolly-15k-ja
- HuggingFace `datatrove`, `transformers`, `peft`
- PyTorch SDPA flash attention
config.json
ADDED
@@ -0,0 +1,13 @@
{
  "vocab_size": 9506,
  "d_model": 512,
  "n_layers": 12,
  "n_heads": 8,
  "max_seq_len": 1024,
  "d_ff": 1408,
  "rope_base": 10000,
  "dropout_p": 0.1,
  "tie_embeddings": true,
  "norm_eps": 1e-06,
  "bias": false
}
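The fields in config.json are enough to estimate the parameter count by hand. The sketch below assumes a standard Llama-style block: four square attention projections (full multi-head attention, no grouped-query sharing), a three-matrix SwiGLU FFN, two RMSNorm weight vectors per layer, and one final RMSNorm. `count_params` is a hypothetical helper, not part of the repository.

```python
config = {
    "vocab_size": 9506, "d_model": 512, "n_layers": 12,
    "n_heads": 8, "d_ff": 1408, "tie_embeddings": True, "bias": False,
}

def count_params(cfg):
    """Estimate parameter counts for a bias-free Llama-style decoder."""
    d, ff = cfg["d_model"], cfg["d_ff"]
    attn = 4 * d * d   # Q, K, V, O projections (assumes no GQA)
    ffn = 3 * d * ff   # SwiGLU: gate, up, and down matrices
    norms = 2 * d      # two RMSNorm weight vectors per block
    blocks = cfg["n_layers"] * (attn + ffn + norms)
    final_norm = d
    embed = cfg["vocab_size"] * d
    # With tied embeddings the output head reuses the input embedding matrix.
    lm_head = 0 if cfg["tie_embeddings"] else embed
    return {
        "non_embedding": blocks + final_norm,
        "embedding": embed,
        "total": blocks + final_norm + embed + lm_head,
    }

counts = count_params(config)
for name, value in counts.items():
    print(f"{name:>14}: {value:,}")
print(f"fp32 bytes: {4 * counts['total']:,}")
```

Under these assumptions the total is about 43.4M parameters including the tied embedding, and 4 bytes per parameter gives about 173.7 MB, close to the 173,701,442-byte `pytorch_model.bin` in this commit. This count does not reproduce the 100M total quoted in the model card, so either the card's figure or one of the assumptions here is off.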
generation_config.json
ADDED
@@ -0,0 +1,6 @@
{
  "max_new_tokens": 80,
  "temperature": 0.7,
  "top_p": 0.9,
  "do_sample": true
}
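The `temperature` and `top_p` settings above control how each next token is drawn. The following plain-Python sketch shows what the two settings mean for a single step; it is an illustration under the usual definitions of temperature scaling and nucleus sampling, not the code any particular library runs. `sample_top_p` is a hypothetical name.

```python
import math
import random

def sample_top_p(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample a token index with temperature scaling and nucleus (top-p) filtering."""
    # Temperature below 1 sharpens the distribution before the softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of highest-probability tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Draw from the kept set, renormalised by its total mass.
    r = rng.random() * mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r < acc:
            return i
    return kept[-1]  # guard against floating-point round-off

rng = random.Random(0)
draws = [sample_top_p([2.0, 1.0, 0.5, -1.0], rng=rng) for _ in range(10)]
print(draws)
```

Generation would repeat this step up to `max_new_tokens` (80 here) times, appending each sampled token to the context.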
manifest.json
ADDED
@@ -0,0 +1,12 @@
{
  "files": [
    "README.md",
    "config.json",
    "generation_config.json",
    "manifest.json",
    "pytorch_model.bin",
    "tokenizer.json"
  ],
  "ckpt_source": "artifacts/smoke_100m_v7_20k_ema/ckpt_step_020000_final.pt",
  "tokenizer_source": "artifacts/tokenizer_v3_32k_clean.json"
}
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:92caf835d996b6ea3d70f5f7a6bc9d80ac8ae6750ba852a0c10bad07c509cb6c
size 173701442
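The three lines above are a Git LFS pointer: the repository stores only the hash and size, and the weights themselves live in LFS storage. A downloaded checkpoint can be checked against the pointer with a short script. This is a minimal sketch; `parse_lfs_pointer` and `verify_file` are hypothetical helpers, and the local file path passed to `verify_file` is up to the reader.

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, sha256 digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {"version": fields["version"], "sha256": digest, "size": int(fields["size"])}

def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path matches the pointer's size and sha256."""
    h = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a large checkpoint never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and h.hexdigest() == pointer["sha256"]

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:92caf835d996b6ea3d70f5f7a6bc9d80ac8ae6750ba852a0c10bad07c509cb6c
size 173701442
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["size"], pointer["sha256"][:12])
```

Calling `verify_file("pytorch_model.bin", pointer)` on a local copy then confirms that the download is complete and unmodified.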
tokenizer.json
ADDED
The diff for this file is too large to render.