---
license: gemma
pipeline_tag: image-text-to-text
base_model:
  - treadon/gemma4-E2B-it-Abliterated-AND-Disinhibited-USE-THIS
  - ValiantLabs/gemma-4-E2B-it-ShiningValiant3
tags:
- merge
- mergekit
- lazymergekit
- gemma
- gemma4
- gemma4_E2B
- google
- treadon/gemma4-E2B-it-Abliterated-AND-Disinhibited-USE-THIS
- ValiantLabs/gemma-4-E2B-it-ShiningValiant3
---

___

# EN

The previous merge worked, so this one was meant to test merging models with the `Arcee Fusion` algorithm.
It is also a check that the merge can be run on my own device.

However, an error occurred and `Arcee Fusion` was not actually used, so it will be tested in the next merge.

More information on the merge can be found here: https://huggingface.co/zelk12/Mergekit_Gemma-4-E2B

___

# GGUF

Provided I remember to upload it and don't delete it, a GGUF of this model (probably Q6) can be found here:
https://huggingface.co/zelk12/MT1_gemma-4-E2B-Q6_K-GGUF

GGUF and imatrix GGUF quants can also often be found here:
[mradermacher](https://huggingface.co/mradermacher)
I use their quants myself whenever possible.

This user also sometimes posts quants of my models:
[Otakadelic](https://huggingface.co/Otakadelic)
He merges models himself as well.

___

# Information

What does the model name `MT-Gen_gemma-4-E2B` mean?
- `MT` = `merge test`: a test merge, followed by its number within the generation.
- `Gen`: the generation of merges. It is not tied to anything in particular and is just an extra number, but it usually changes when I test completely new approaches. I also try not to include models of the same generation in a merge.

The rest comes from the name of the original model, `Gemma-4-E2B-it`.
Gemma is Google's family of open models.
- `4` is the generation of the model.
- `E2B`: the model has about 2 billion *effective* parameters, meaning that at run time it behaves roughly like a model with 2 billion ordinary parameters.
- `it` = `instruction tuned`: the model has been tuned to follow instructions and to work in a chat format.

In theory, the model can work with a 128k-token context. The sliding attention window is 512 tokens.
Variable aspect ratios are supported, and images can be encoded with 70, 140, 280, 560, or 1120 tokens.
In addition, `Gemma-4-E2B-it` models can usually work with audio.

These token figures may differ for this model, because it is a merge of Gemma-4-E2B derivatives rather than the pure Gemma-4-E2B.
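
To give a rough idea of what a sliding attention window means, here is a minimal sketch in plain Python. It is not taken from the Gemma code. The function name `sliding_window_mask` is hypothetical, and it assumes the window counts the current token, so a token sees itself and the `window - 1` tokens before it; real implementations may define the boundary differently.

```python
def sliding_window_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumption: attention is causal (j <= i) and the window includes the
    current token, so position i sees positions i - window + 1 .. i.
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Small demo: 5 tokens, window of 3 (the card quotes a window of 512).
demo = sliding_window_mask(seq_len=5, window=3)
for row in demo:
    print("".join("x" if allowed else "." for allowed in row))

# With the 512-token window, a token far into the sequence
# sees exactly 512 positions under this convention.
visible = sum(sliding_window_mask(seq_len=600, window=512)[599])
```

Under these assumptions, each row of the demo shows which earlier tokens one position can see: early tokens see everything before them, later tokens see only the last `window` positions.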

___
___


# MT1_gemma-4-E2B

MT1_gemma-4-E2B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [treadon/gemma4-E2B-it-Abliterated-AND-Disinhibited-USE-THIS](https://huggingface.co/treadon/gemma4-E2B-it-Abliterated-AND-Disinhibited-USE-THIS)
* [ValiantLabs/gemma-4-E2B-it-ShiningValiant3](https://huggingface.co/ValiantLabs/gemma-4-E2B-it-ShiningValiant3)

## 🧩 Configuration

```yaml
models:
  - model: treadon/gemma4-E2B-it-Abliterated-AND-Disinhibited-USE-THIS
    parameters:
      density: 0.7
      weight: 0.6

  - model: ValiantLabs/gemma-4-E2B-it-ShiningValiant3
    parameters:
      density: 0.5
      weight: 0.4

merge_method: linear
base_model: treadon/gemma4-E2B-it-Abliterated-AND-Disinhibited-USE-THIS
parameters:
  normalize: true
dtype: bfloat16
tokenizer_source: base
```
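
As a rough illustration of what `merge_method: linear` with `normalize: true` does to each tensor, here is a minimal sketch in plain Python. It is not mergekit's implementation. The function name `linear_merge` and the toy vectors are hypothetical, and it assumes that a linear merge is a weighted average in which the weights are divided by their sum when normalization is on. As far as I know, the `density` values are only used by sparsifying methods and play no role in a linear merge.

```python
def linear_merge(tensors, weights, normalize=True):
    """Weighted element-wise average of equally sized flat tensors (lists of floats)."""
    if len(tensors) != len(weights):
        raise ValueError("need exactly one weight per tensor")
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("weights must not sum to zero when normalizing")
        # Rescale the weights so that they sum to 1.
        weights = [w / total for w in weights]

    merged = [0.0] * len(tensors[0])
    for tensor, weight in zip(tensors, weights):
        if len(tensor) != len(merged):
            raise ValueError("all tensors must have the same size")
        for i, value in enumerate(tensor):
            merged[i] += weight * value
    return merged


# Toy stand-ins for one tensor from each source model (hypothetical values).
tensor_a = [1.0, 2.0, 3.0]  # weight 0.6 in the configuration above
tensor_b = [3.0, 2.0, 1.0]  # weight 0.4 in the configuration above

merged = linear_merge([tensor_a, tensor_b], [0.6, 0.4])
print(merged)
```

With weights 0.6 and 0.4, each merged value sits between the two inputs, slightly closer to the first model.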

## 💻 Usage

```python
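# In a notebook, install the dependencies first:
#   !pip install -qU transformers accelerate

import torch
import transformers
from transformers import AutoTokenizer

model = "zelk12/MT1_gemma-4-E2B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into one prompt string with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline; device_map="auto" places the weights on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,  # same dtype as in the merge configuration
    device_map="auto",
)

# Sample up to 256 new tokens and print the prompt together with the completion.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])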
```