---
base_model:
- FreedomIntelligence/AceGPT-v2-70B
- yentinglin/Llama-3-Taiwan-70B-Instruct
- Delta-Vector/Shimamura-70B
- Undi95/Sushi-v1.4
- flammenai/Llama3.1-Flammades-70B
- Bllossom/llama-3-Korean-Bllossom-70B
- kldzj/Llama-3.3-70B-Instruct-heretic
- rinna/llama-3-youko-70b
- shuoxing/llama3-70b-full-pretrain-junk-tweet-1m-en-no-packing
- Mawdistical/Anthrobomination-70B
- deepcogito/cogito-v2-preview-llama-70B
- watt-ai/watt-tool-70B
- flammenai/Mahou-1.5-llama3.1-70B
- shisa-ai/shisa-v2-llama3.3-70b
- zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B
library_name: transformers
tags:
- mergekit
- mergekitty
- merge
---
# KaraKaraWitch/BlenderCartel-llama33-70B-Pt2

This is a merge of pre-trained language models created using [mergekitty](https://github.com/allura-org/mergekitty).

## Merge Details
### Merge Method

This model was merged with the [SCE](https://arxiv.org/abs/2408.07990) merge method, using [deepcogito/cogito-v2-preview-llama-70B](https://huggingface.co/deepcogito/cogito-v2-preview-llama-70B) as the base.

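To give a rough feel for what SCE (select, calculate, erase) does with a setting like `select_topk: 0.2`, here is a toy pure-Python sketch over flat lists of floats. It is an illustration under simplifying assumptions, not mergekitty's actual implementation: the function name `sce_merge` and the list-based "tensors" are hypothetical, and real merges operate per tensor on full model weights.

```python
from statistics import pvariance


def sce_merge(base, models, select_topk=0.2):
    """Toy SCE merge over flat lists of floats (hypothetical helper)."""
    n = len(base)
    # Task vectors: each model's difference from the base weights.
    deltas = [[m[i] - base[i] for i in range(n)] for m in models]

    # Select: keep the fraction of positions with the highest variance across models.
    variances = [pvariance([d[i] for d in deltas]) for i in range(n)]
    k = int(sum(v > 0 for v in variances) * select_topk)
    keep = set(sorted(range(n), key=lambda i: variances[i], reverse=True)[:k])
    deltas = [[d[i] if i in keep else 0.0 for i in range(n)] for d in deltas]

    # Calculate: weight each model by the mean square of its remaining deltas.
    energy = [sum(x * x for x in d) / n for d in deltas]
    total = sum(energy)
    if total > 1e-6:
        weights = [e / total for e in energy]
    else:
        weights = [1.0 / len(models)] * len(models)

    # Erase: per position, drop deltas whose sign disagrees with the sign of the sum.
    merged = list(base)
    for i in range(n):
        majority = sum(d[i] for d in deltas)
        num = den = 0.0
        for w, d in zip(weights, deltas):
            if d[i] * majority > 0:
                num += w * d[i]
                den += w
        if den > 0:
            merged[i] += num / den
    return merged


base = [1.0, 1.0, 1.0, 1.0, 1.0]
model_a = [2.0, 1.1, 1.2, 1.1, 1.0]
model_b = [-2.0, 1.2, 1.1, 1.3, 1.1]
toy = sce_merge(base, [model_a, model_b], select_topk=0.2)
print([round(x, 6) for x in toy])  # [-2.0, 1.0, 1.0, 1.0, 1.0]
```

In this toy run only the first position survives selection (one of five positions at 20%), and the positive delta from `model_a` is erased because it disagrees with the dominant negative direction.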
### Models Merged

The following models were included in the merge:
* [FreedomIntelligence/AceGPT-v2-70B](https://huggingface.co/FreedomIntelligence/AceGPT-v2-70B)
* [yentinglin/Llama-3-Taiwan-70B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct)
* [Delta-Vector/Shimamura-70B](https://huggingface.co/Delta-Vector/Shimamura-70B)
* [Undi95/Sushi-v1.4](https://huggingface.co/Undi95/Sushi-v1.4)
* [flammenai/Llama3.1-Flammades-70B](https://huggingface.co/flammenai/Llama3.1-Flammades-70B)
* [Bllossom/llama-3-Korean-Bllossom-70B](https://huggingface.co/Bllossom/llama-3-Korean-Bllossom-70B)
* [kldzj/Llama-3.3-70B-Instruct-heretic](https://huggingface.co/kldzj/Llama-3.3-70B-Instruct-heretic)
* [rinna/llama-3-youko-70b](https://huggingface.co/rinna/llama-3-youko-70b)
* [shuoxing/llama3-70b-full-pretrain-junk-tweet-1m-en-no-packing](https://huggingface.co/shuoxing/llama3-70b-full-pretrain-junk-tweet-1m-en-no-packing)
* [Mawdistical/Anthrobomination-70B](https://huggingface.co/Mawdistical/Anthrobomination-70B)
* [watt-ai/watt-tool-70B](https://huggingface.co/watt-ai/watt-tool-70B)
* [flammenai/Mahou-1.5-llama3.1-70B](https://huggingface.co/flammenai/Mahou-1.5-llama3.1-70B)
* [shisa-ai/shisa-v2-llama3.3-70b](https://huggingface.co/shisa-ai/shisa-v2-llama3.3-70b)
* [zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B](https://huggingface.co/zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B
  - model: Delta-Vector/Shimamura-70B
  # Tool Calling
  - model: watt-ai/watt-tool-70B
  # flammenai
  - model: flammenai/Mahou-1.5-llama3.1-70B
  - model: flammenai/Llama3.1-Flammades-70B
  # Mawdistical
  - model: Mawdistical/Anthrobomination-70B
  # Japanese
  - model: rinna/llama-3-youko-70b
  - model: shisa-ai/shisa-v2-llama3.3-70b
  # I initially wanted to include this,
  # but since it contains R1, and going by those who have experience with R1 distills,
  # it's not advisable to merge in R1 models.
  # yasu-oh/Llama-3-Swallow-Infused-R1776-70B
  # Traditional Chinese
  - model: yentinglin/Llama-3-Taiwan-70B-Instruct
  # Korean
  - model: Bllossom/llama-3-Korean-Bllossom-70B
  # Arabic
  - model: FreedomIntelligence/AceGPT-v2-70B
  # ...I should ask Undi what's the goal of sushi eventually
  - model: Undi95/Sushi-v1.4
  # Unaligned base instruct
  - model: kldzj/Llama-3.3-70B-Instruct-heretic
  # Tweet slop for junk fooding
  - model: shuoxing/llama3-70b-full-pretrain-junk-tweet-1m-en-no-packing
merge_method: sce
base_model: deepcogito/cogito-v2-preview-llama-70B
select_topk: 0.2
parameters:
  normalize: true
dtype: bfloat16
```
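
Since the card metadata lists `library_name: transformers` and the merge was written in `bfloat16`, loading the result might look roughly like the sketch below. It assumes the model is published on the Hub under the ID in this card's title, that `torch`, `transformers`, and `accelerate` are installed, and that enough GPU memory is available for a 70B model. The helper name `load_model` is hypothetical.

```python
MODEL_ID = "KaraKaraWitch/BlenderCartel-llama33-70B-Pt2"  # assumed Hub repo ID, taken from the card title


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model in bfloat16 (hypothetical helper)."""
    # Deferred imports: defining this function does not require the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches `dtype: bfloat16` in the merge config
        device_map="auto",  # shards across available devices; requires accelerate
    )
    return tokenizer, model
```

Calling `load_model()` downloads the full set of weights, so it is best run on a machine sized for 70B inference.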