---
base_model:
- migtissera/Tess-3-Llama-3.1-70B
- Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
- WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
- huihui-ai/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated
- hitachi-nlp/Llama-3.1-70B-FLDx2
- huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated
library_name: transformers
tags:
- mergekit
- merge
---
# about

Original name : NexesMess/Llama_3.x_70b_Dolnemlimwhitessachi_v1.0

Release name : Dolmen v1.2

Replacing : https://huggingface.co/NexesMess/Llama_3.x_70b_Dolnemhertulimtess_v1.0 (Dolmen v1.0)

---
# changes

OUT :
- Tulu goes out: too "messy", and it somehow "shackles" the model.
- Hermes goes out, despite its quality, because its tokenizer uses a different EOS token. I had too many problems with that model.

IN :
- Hitachi FLDx2 enters, for intelligence and low perplexity (it's my new champion: wikitext English perplexity at 512 context is 2.84 instead of 2.92), joining Tess 3 in those roles.
- WhiteRabbitNeo enters, as a stabilizer.

---
# feel

There's definitely progress. The prose is more structured, and the EOS is now working.

---
# credits

Blackroot, for his own observations on his Mirai Series, and the hint about the Hitachi model, which I then tested with delight.

---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02](https://huggingface.co/Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02) as a base.

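
For readers unfamiliar with Model Stock, here is a toy sketch of the idea on flat Python lists. It is not mergekit's implementation: the function names `cosine` and `model_stock_merge` are hypothetical, weights are plain lists instead of tensors, and the interpolation ratio follows my reading of the formula in the Model Stock paper. The real method works tensor by tensor on the actual checkpoints.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def model_stock_merge(base, finetuned):
    """Toy Model Stock merge of k fine-tuned weight vectors around a base vector.

    Returns (merged_weights, t), where t is the interpolation ratio between
    the average of the fine-tuned weights (t = 1) and the base weights (t = 0).
    Assumption: the degenerate case where the denominator is zero is not handled.
    """
    k = len(finetuned)
    if k < 2:
        raise ValueError("need at least two fine-tuned models")

    # Offsets of each fine-tuned model from the base ("task vectors").
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Average pairwise cosine between the offsets.
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(a, b) for a, b in pairs) / len(pairs)

    # Interpolation ratio: t = k*cos / (1 + (k - 1)*cos).
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    # Element-wise average of the fine-tuned weights.
    avg = [sum(column) / k for column in zip(*finetuned)]

    merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
    return merged, t
```

In this sketch, fine-tuned models whose offsets from the base agree with each other pull the result toward their average, while models that disagree (offsets close to orthogonal) leave the result close to the base.
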
### Models Merged

The following models were included in the merge:
* [migtissera/Tess-3-Llama-3.1-70B](https://huggingface.co/migtissera/Tess-3-Llama-3.1-70B)
* [WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B](https://huggingface.co/WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B)
* [huihui-ai/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated](https://huggingface.co/huihui-ai/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated)
* [hitachi-nlp/Llama-3.1-70B-FLDx2](https://huggingface.co/hitachi-nlp/Llama-3.1-70B-FLDx2)
* [huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated](https://huggingface.co/huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: model_stock
models:
  - model: Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
    parameters:
      weight: 1.0
  - model: huihui-ai/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated
    parameters:
      weight: 1.0
  - model: huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated
    parameters:
      weight: 1.0
  - model: WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
    parameters:
      weight: 1.0
  - model: migtissera/Tess-3-Llama-3.1-70B
    parameters:
      weight: 1.0
  - model: hitachi-nlp/Llama-3.1-70B-FLDx2
    parameters:
      weight: 1.0
base_model: Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
dtype: bfloat16
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
  filter_wise: false
  smooth: false
  allow_negative_weights: false
chat_template: auto
tokenizer:
  source: union
```