---
license: cc0-1.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- llama
datasets:
- gaianet/paris
- krinal/fifa_2022
- gaianet/bible_bot
- gaianet/trumpVSharris
- cornell-movie-review-data/rotten_tomatoes
- stas/openwebtext-10k
- EleutherAI/lambada_openai
---

# ToyLlama 30M

Second version of the ToyLlama model. See more ToyLlamas on my profile.

## Example generations

All were generated with the `test.py` interactive CLI (usage: `python3 test.py`).

```
>>> Enter prompt (temp: 0.7): New York is
------------------------------------------------------------
New York is the only place to live in. The newer cities are still a corporate, non-governmental and governmental people. The nationwide government is the largest industrial community, and the largest private sector. It is the best place to live in. In fact, in the early 1980s the government is not a major industrial economy. In the United States, the largest industrial sector of the world is the largest industrial and political sector. The major political sector has not been built up by the labour-owners of the world, and the economy has lost its long-term growth. It has its own largest
------------------------------------------------------------
>>> Enter prompt (temp: 0.7): The best thing
------------------------------------------------------------
The best thing about the GLOBALS was that there was a good chance to get this information out of the world, but that's a very good idea. So, there is a good thing about this stuff that I have already received, but it's always a good idea. It's just that there is a few of people who want to know what is going on here. There is a reason to be going on here, and that is why I was reading this, and that is why I didn't. This is a very good idea. I've heard that there is a lot of people that have done it, and I don't know the truth about it.
It
------------------------------------------------------------
>>> Enter prompt (temp: 0.7): If you would ever travel to New Your, USA, remember that
------------------------------------------------------------
If you would ever travel to New Your, USA, remember that the street is probably the same, so it is very hard to find a little splitoff of air, so you can use it. You can be a lonely wedding basket, or you can be a lonely wedding, or a lonely wedding. Next week, the first Saturday of the month was in a frenzy. The town was very good for the rest of the week, and the time was already over. The time was just as good as the time. The day was six years old. So it would be nice to be a little better to know
------------------------------------------------------------
>>> Enter prompt (temp: 0.7): Two plus two equals
------------------------------------------------------------
Two plus two equals (7.0%) of the total population of the world. Malatesta, Libya, and Kuwaiti were also involved in the creation of a group of people who were involved in the research and education community. This group was called the "Scots" of the Papua Nagasaki. The group was contacted by the International Institute of Riverside, the group of people who were involved in the Papua Solo, the U.S. Army. They were the main members of the group. On March 5, 1969, the group became involved in the Papua Association of S
------------------------------------------------------------
```

## Training information

Trained for 2 hours on 147M training tokens on a single RX 6600 (8GB VRAM), using [this](https://cows.info.gf/games/Python/E/Expirements%20-%20LLaMA2%20trainer%20(2026)[sapbot].zip) script written by Gemini 3.1 Pro Preview.

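
For a rough sense of training speed, the two figures quoted above (2 hours, 147M tokens) can be turned into an average throughput. This is only a back-of-the-envelope sketch: both inputs are rounded values from this card, and the helper name `tokens_per_second` is made up for illustration.

```python
# Figures quoted in this model card; both are rounded, so the result is approximate.
TRAINING_TOKENS = 147_000_000  # "147M training tokens"
TRAINING_HOURS = 2             # "Trained for 2 hours"


def tokens_per_second(tokens: int, hours: float) -> float:
    """Average throughput over the whole run (hypothetical helper)."""
    return tokens / (hours * 3600)


throughput = tokens_per_second(TRAINING_TOKENS, TRAINING_HOURS)
print(f"{throughput:,.0f} tokens/s")  # prints "20,417 tokens/s"
```

The number is an average over the full run and says nothing about warm-up, data loading stalls, or evaluation pauses.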

| Parameter | Value |
|-----------|-------|
| Loss | 3.1682 |
| Epochs | 1 |
| grad_norm | 0.6302 |
| Learning rate | 5e-4 |
| Batch size | 8 |
| Gradient accumulation steps | 4 |

### Training data

The training data consisted of HF datasets (see "Datasets used to train sapbot/toyllama-30m") plus text from several other sites:

- cows.info.gf (4.6KB)
- en.wikipedia.org (352.6KB)
- github.com (454.7KB)
- gutenberg.org (6MB)
- habr.com (79.4KB)
- textfiles.com (455MB)
- www.reddit.com (3.8KB)

## Advanced module information

Detailed information about the exact model architecture.

| Parameter | Value |
|-----------|-------|
| Hidden size | 512 |
| Intermediate size | 1536 |
| Hidden layers | 8 |
| Attention heads | 8 |
| Key value heads | 4 |
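
The architecture table is enough to estimate how many parameters sit in the transformer blocks. The sketch below assumes the usual Llama layout (bias-free projections, a gated MLP with three weight matrices, two RMSNorm weights per layer plus a final norm, and head size equal to hidden size divided by attention heads). None of this is confirmed by the card itself, and the function name is hypothetical. Embedding and output-head weights are left out because the vocabulary size is not listed here.

```python
def llama_non_embedding_params(hidden: int, intermediate: int, layers: int,
                               heads: int, kv_heads: int) -> int:
    """Count block parameters under the assumed standard Llama layout."""
    head_dim = hidden // heads      # assumed: 512 / 8 = 64
    kv_dim = head_dim * kv_heads    # grouped-query attention: 64 * 4 = 256
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim
    # Gated MLP: gate_proj, up_proj and down_proj.
    mlp = 3 * hidden * intermediate
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden
    # Add the final RMSNorm after the last layer.
    return layers * (attention + mlp + norms) + hidden


# Values from the table above.
block_params = llama_non_embedding_params(
    hidden=512, intermediate=1536, layers=8, heads=8, kv_heads=4
)
print(f"{block_params:,}")  # prints "25,174,528"
```

Under these assumptions roughly 25.2M parameters are in the blocks, and each vocabulary entry adds another 512 parameters per embedding matrix.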