chenjn168 RedSparkie committed
Commit b32625e · 0 Parent(s)

Duplicate from RedSparkie/gemma-4-E2B-it-Uncensored-MAX-litert-lm

Co-authored-by: RedRedRed <RedSparkie@users.noreply.huggingface.co>

Files changed (3)
  1. .gitattributes +35 -0
  2. README.md +71 -0
  3. gemma4_to_litertlm.ipynb +306 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
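As a rough way to see which file names the rules above route through Git LFS, the sketch below matches names against the simple patterns using Python's `fnmatch`. This only approximates gitattributes glob semantics (it is adequate for `*.ext`-style patterns and deliberately skips `saved_model/**/*`), and the helper name `tracked_by_lfs` is hypothetical. One thing it makes visible: no pattern in this list covers the `.litertlm` extension.

```python
from fnmatch import fnmatchcase

# Simple patterns copied from the .gitattributes diff (saved_model/**/* omitted,
# because fnmatch does not model gitattributes directory globs).
LFS_PATTERNS = [
    "*.7z", "*.arrow", "*.bin", "*.bz2", "*.ckpt", "*.ftz", "*.gz", "*.h5",
    "*.joblib", "*.lfs.*", "*.mlmodel", "*.model", "*.msgpack", "*.npy",
    "*.npz", "*.onnx", "*.ot", "*.parquet", "*.pb", "*.pickle", "*.pkl",
    "*.pt", "*.pth", "*.rar", "*.safetensors", "*.tar.*", "*.tar",
    "*.tflite", "*.tgz", "*.wasm", "*.xz", "*.zip", "*.zst", "*tfevents*",
]

def tracked_by_lfs(filename: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if the base file name matches any pattern (approximation)."""
    return any(fnmatchcase(filename, p) for p in patterns)

for name in ["model.safetensors", "model.tflite", "README.md",
             "gemma-4-E2B-it-Uncensored-MAX.litertlm"]:
    print(f"{name}: {tracked_by_lfs(name)}")
```

Case-sensitive matching (`fnmatchcase`) is used so the result does not depend on the operating system.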
README.md ADDED
@@ -0,0 +1,71 @@
+ ---
+ license: apache-2.0
+ base_model:
+ - prithivMLmods/gemma-4-E2B-it-Uncensored-MAX
+ tags:
+ - litert-lm
+ - uncensored
+ - abliterated
+ - edge-gallery
+ - on-device
+ language:
+ - en
+ ---
+
+ # gemma-4-E2B-it-Uncensored-MAX → LiteRT-LM
+
+ Conversion of [prithivMLmods/gemma-4-E2B-it-Uncensored-MAX](https://huggingface.co/prithivMLmods/gemma-4-E2B-it-Uncensored-MAX) to the `.litertlm` format for **Google AI Edge Gallery** on Android.
+
+ ## 🚀 How to convert (Google Colab, free)
+
+ The notebook is **tested and ready** to run in Colab:
+
+ 📓 **[`gemma4_to_litertlm.ipynb`](https://huggingface.co/RedSparkie/gemma-4-E2B-it-Uncensored-MAX-litert-lm/blob/main/gemma4_to_litertlm.ipynb)**
+
+ ### Steps:
+ 1. **Download** the notebook and open it in Google Colab
+ 2. Select the runtime: **GPU (T4) + High-RAM** (`hm`)
+ → Runtime → Change runtime type → T4 + High-RAM
+ 3. **Enter your** Hugging Face **token** (with write permission) in the first cell
+ 4. **Run** the cells in order (~30-45 min); cell 2 restarts the runtime, so continue from cell 3 afterwards
+ 5. The `.litertlm` file is uploaded to this repo automatically
+
+ ### What does it do?
+ 1. **Extracts only the text decoder** from the multimodal model (4.8 GB vs 9.6 GB total)
+ → Keeps the original key naming (`model.language_model.*`)
+ 2. **Creates a modified config** with `vision_config=None`, `audio_config=None`
+ → `Gemma4ForConditionalGeneration` then instantiates only the language model
+ 3. **Converts to TFLite** via `litert-torch` with INT8 quantization
+ 4. **Packages the result as `.litertlm`** with `externalize_embedder=True` (required by Gemma4)
+ 5. **Uploads to Hugging Face**
+
+ ### If the file is larger than 2 GB
+ Change `"dynamic_wi8_afp32"` → `"dynamic_wi4_afp32"` in cell 4 (INT4 instead of INT8, about half the size)
+
+ ## 📱 Usage (once converted)
+
+ ### Edge Gallery (Android)
+ 1. Install [Google AI Edge Gallery](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery)
+ 2. Add the model via its Hugging Face URL
+ 3. Chat!
+
+ ### CLI
+ ```bash
+ pip install litert-lm
+ litert-lm import --from-huggingface-repo RedSparkie/gemma-4-E2B-it-Uncensored-MAX-litert-lm gemma-4-E2B-it-Uncensored-MAX.litertlm uncensored-max
+ litert-lm run uncensored-max
+ ```
+
+ ## Technical details
+
+ | | |
+ |---|---|
+ | **Base model** | [prithivMLmods/gemma-4-E2B-it-Uncensored-MAX](https://huggingface.co/prithivMLmods/gemma-4-E2B-it-Uncensored-MAX) |
+ | **Architecture** | Gemma 4 E2B (text decoder only, ~1.4B params) |
+ | **Format** | LiteRT-LM (`.litertlm`) |
+ | **Quantization** | INT8 dynamic (`dynamic_wi8_afp32`) |
+ | **Context** | 4096 tokens |
+ | **Estimated size** | ~1.5-2.0 GB |
+ | **Converted with** | `litert-torch` v0.9.0 |
+
+ ⚠️ Abliterated/uncensored model. Use it responsibly.
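Steps 1 and 2 of the README's "what does it do" list can be illustrated without downloading anything. The sketch below is a minimal, assumption-laden version: the helper names (`group_lm_keys_by_shard`, `text_only_config`), the toy tensor names, and the toy config values are hypothetical, and only a subset of the multimodal keys is stripped. It shows the two ideas: keep only `model.language_model.*` keys, grouped by the shard that holds them, and null out the vision/audio sections of the config.

```python
from collections import defaultdict

LM_PREFIX = "model.language_model."
# Subset of the multimodal-only keys removed from the config (illustrative).
MULTIMODAL_KEYS = ("image_token_id", "audio_token_id", "video_token_id")

def group_lm_keys_by_shard(weight_map: dict) -> dict:
    """Map shard file name -> list of text-decoder tensor keys stored in it."""
    by_shard = defaultdict(list)
    for key, shard in weight_map.items():
        if key.startswith(LM_PREFIX):
            by_shard[shard].append(key)
    return dict(by_shard)

def text_only_config(config: dict) -> dict:
    """Return a copy of the config with the vision/audio parts disabled."""
    cfg = dict(config)
    cfg["vision_config"] = None
    cfg["audio_config"] = None
    for k in MULTIMODAL_KEYS:
        cfg.pop(k, None)
    return cfg

# Toy index with hypothetical tensor names, shaped like a safetensors weight_map.
toy_map = {
    "model.language_model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
    "model.vision_tower.encoder.weight": "model-00001-of-00002.safetensors",
    "model.language_model.norm.weight": "model-00002-of-00002.safetensors",
    "model.audio_tower.conv.weight": "model-00002-of-00002.safetensors",
}
grouped = group_lm_keys_by_shard(toy_map)
toy_cfg = text_only_config(
    {"model_type": "gemma4", "vision_config": {"hidden_size": 8}, "image_token_id": 1}
)
print(grouped)
print(toy_cfg)
```

Grouping by shard first means each source shard only has to be opened once, which is what keeps peak memory low when the real tensors are copied.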
gemma4_to_litertlm.ipynb ADDED
@@ -0,0 +1,306 @@
+ {
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "provenance": [],
+ "gpuType": "T4",
+ "machine_shape": "hm"
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ },
+ "language_info": {
+ "name": "python"
+ },
+ "accelerator": "GPU"
+ },
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# 🚀 Convert Gemma 4 E2B Uncensored-MAX to LiteRT-LM\n",
+ "\n",
+ "Converts the model to the `.litertlm` format for **Google AI Edge Gallery** on Android.\n",
+ "\n",
+ "**⚠️ IMPORTANT:** Use a runtime with **GPU + High-RAM**: Runtime → Change runtime type → T4 + High-RAM (hm)\n",
+ "\n",
+ "### Instructions:\n",
+ "1. Run cell **1️⃣** → enter your token\n",
+ "2. Run cell **2️⃣** → installs dependencies. **The runtime will restart; this is expected.**\n",
+ "3. After the restart, run **3️⃣**, **4️⃣**, and **5️⃣** in order\n",
+ "\n",
+ "**Time:** ~30-45 min"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#@title 1️⃣ Configuration\n",
+ "HF_TOKEN = \"\" #@param {type:\"string\"}\n",
+ "OUTPUT_REPO = \"RedSparkie/gemma-4-E2B-it-Uncensored-MAX-litert-lm\" #@param {type:\"string\"}\n",
+ "SOURCE_MODEL = \"prithivMLmods/gemma-4-E2B-it-Uncensored-MAX\" #@param {type:\"string\"}\n",
+ "\n",
+ "import json, os\n",
+ "# Fail before writing anything if the token is missing.\n",
+ "assert HF_TOKEN, '❌ Enter your Hugging Face token above!'\n",
+ "# Persist the settings to disk: the runtime restart in cell 2️⃣ clears all variables.\n",
+ "os.makedirs('/content/cfg', exist_ok=True)\n",
+ "with open('/content/cfg/config.json', 'w') as f:\n",
+ "    json.dump({'HF_TOKEN': HF_TOKEN, 'OUTPUT_REPO': OUTPUT_REPO, 'SOURCE_MODEL': SOURCE_MODEL}, f)\n",
+ "print('✅ Config saved')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#@title 2️⃣ Install dependencies (restarts the runtime)\n",
+ "# Colab ships older torch/torchao/transformers builds that are incompatible.\n",
+ "# Install versions that work together (minimums pinned below).\n",
+ "!pip install -q --upgrade \\\n",
+ "    \"transformers>=5.7.0\" \\\n",
+ "    \"torchao>=0.17.0\" \\\n",
+ "    litert-torch \\\n",
+ "    litert-lm \\\n",
+ "    huggingface_hub \\\n",
+ "    sentencepiece \\\n",
+ "    protobuf \\\n",
+ "    safetensors \\\n",
+ "    psutil\n",
+ "\n",
+ "# Verify installed versions\n",
+ "import torch, torchao, transformers\n",
+ "print(f'torch {torch.__version__} | torchao {torchao.__version__} | transformers {transformers.__version__}')\n",
+ "\n",
+ "# Quick test: is torchao.quantization.pt2e importable?\n",
+ "try:\n",
+ "    import torchao.quantization.pt2e.quantize_pt2e\n",
+ "    print('✅ torchao.quantization.pt2e OK')\n",
+ "except ImportError:\n",
+ "    print('⚠️ torchao.quantization.pt2e not available, forcing reinstall...')\n",
+ "    import subprocess, sys\n",
+ "    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', '--force-reinstall', 'torchao>=0.17.0'])\n",
+ "\n",
+ "# Test: is Gemma4 available?\n",
+ "try:\n",
+ "    from transformers import Gemma4Config\n",
+ "    print('✅ Gemma4 available')\n",
+ "except ImportError:\n",
+ "    print('⚠️ Gemma4 not available, forcing reinstall...')\n",
+ "    import subprocess, sys\n",
+ "    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', '--force-reinstall', 'transformers>=5.7.0'])\n",
+ "\n",
+ "# Restart the runtime so the new packages load cleanly\n",
+ "print('\\n🔄 Restarting runtime...')\n",
+ "print('   After the restart, continue from cell 3️⃣')\n",
+ "import IPython\n",
+ "IPython.Application.instance().kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#@title 3️⃣ Prepare model (extract text decoder only)\n",
+ "# Restore the settings saved by cell 1️⃣\n",
+ "import json, os\n",
+ "with open('/content/cfg/config.json') as f:\n",
+ "    cfg = json.load(f)\n",
+ "HF_TOKEN = cfg['HF_TOKEN']\n",
+ "OUTPUT_REPO = cfg['OUTPUT_REPO']\n",
+ "SOURCE_MODEL = cfg['SOURCE_MODEL']\n",
+ "\n",
+ "# Check versions\n",
+ "import torch, torchao, transformers\n",
+ "print(f'torch {torch.__version__} | torchao {torchao.__version__} | transformers {transformers.__version__}')\n",
+ "from transformers import Gemma4Config\n",
+ "import torchao.quantization.pt2e.quantize_pt2e\n",
+ "print('✅ All OK')\n",
+ "\n",
+ "import sys, gc, shutil, time\n",
+ "from huggingface_hub import hf_hub_download\n",
+ "from safetensors import safe_open\n",
+ "from safetensors.torch import save_file\n",
+ "import psutil\n",
+ "\n",
+ "def memlog(label=''):\n",
+ "    # Print available/total system RAM in GiB\n",
+ "    m = psutil.virtual_memory()\n",
+ "    print(f'  [{label}] RAM available: {m.available/(1024**3):.1f}/{m.total/(1024**3):.1f} GB')\n",
+ "\n",
+ "MODEL_DIR = '/content/model'\n",
+ "OUTPUT_DIR = '/content/output'\n",
+ "os.makedirs(MODEL_DIR, exist_ok=True)\n",
+ "os.makedirs(OUTPUT_DIR, exist_ok=True)\n",
+ "start_time = time.time()\n",
+ "memlog('start')\n",
+ "\n",
+ "print('📥 Downloading index...')\n",
+ "idx_path = hf_hub_download(SOURCE_MODEL, 'model.safetensors.index.json', token=HF_TOKEN)\n",
+ "with open(idx_path) as f:\n",
+ "    index = json.load(f)\n",
+ "\n",
+ "# Group the text-decoder tensor keys by the shard that contains them\n",
+ "shard_lm = {}\n",
+ "for key, shard in index['weight_map'].items():\n",
+ "    if key.startswith('model.language_model.'):\n",
+ "        shard_lm.setdefault(shard, []).append(key)\n",
+ "\n",
+ "total_shards = len(shard_lm)\n",
+ "print(f'  {sum(len(v) for v in shard_lm.values())} tensors in {total_shards} shards')\n",
+ "\n",
+ "weight_map = {}\n",
+ "for i, sn in enumerate(sorted(shard_lm)):\n",
+ "    keys = shard_lm[sn]\n",
+ "    out_name = f'model-{i+1:05d}-of-{total_shards:05d}.safetensors'\n",
+ "    out_path = os.path.join(MODEL_DIR, out_name)\n",
+ "\n",
+ "    # Resume support: reuse a shard written by a previous run\n",
+ "    if os.path.exists(out_path) and os.path.getsize(out_path) > 100:\n",
+ "        print(f'  {out_name} already exists, skipping')\n",
+ "        with safe_open(out_path, framework='pt') as f:\n",
+ "            for k in f.keys(): weight_map[k] = out_name\n",
+ "        continue\n",
+ "\n",
+ "    print(f'  📦 {sn} → {out_name} ({len(keys)} tensors)')\n",
+ "    shard_path = hf_hub_download(SOURCE_MODEL, sn, token=HF_TOKEN)\n",
+ "\n",
+ "    tensors = {}\n",
+ "    with safe_open(shard_path, framework='pt') as f:\n",
+ "        for key in keys:\n",
+ "            tensors[key] = f.get_tensor(key)\n",
+ "\n",
+ "    save_file(tensors, out_path)\n",
+ "    for k in tensors: weight_map[k] = out_name\n",
+ "    print(f'    💾 {os.path.getsize(out_path)/(1024**2):.0f} MB')\n",
+ "    del tensors; gc.collect()\n",
+ "    memlog(f'shard {i+1}')\n",
+ "\n",
+ "with open(os.path.join(MODEL_DIR, 'model.safetensors.index.json'), 'w') as f:\n",
+ "    json.dump({'metadata': {}, 'weight_map': weight_map}, f)\n",
+ "\n",
+ "print('\\n📝 Config...')\n",
+ "config = transformers.AutoConfig.from_pretrained(SOURCE_MODEL, token=HF_TOKEN)\n",
+ "cd = config.to_dict()\n",
+ "# Disable the vision/audio towers and drop their token ids\n",
+ "cd['vision_config'] = None\n",
+ "cd['audio_config'] = None\n",
+ "for k in ['vision_soft_tokens_per_image','image_token_id','boi_token_id',\n",
+ "          'eoi_token_id','audio_token_id','boa_token_id','eoa_token_id',\n",
+ "          'eoa_token_index','video_token_id']:\n",
+ "    cd.pop(k, None)\n",
+ "with open(os.path.join(MODEL_DIR, 'config.json'), 'w') as f:\n",
+ "    json.dump(cd, f, indent=2)\n",
+ "\n",
+ "# Copy tokenizer and template files; some may not exist in the source repo\n",
+ "for fn in ['tokenizer.json','tokenizer_config.json','chat_template.jinja','generation_config.json']:\n",
+ "    try:\n",
+ "        shutil.copy(hf_hub_download(SOURCE_MODEL, fn, token=HF_TOKEN), os.path.join(MODEL_DIR, fn))\n",
+ "        print(f'  ✓ {fn}')\n",
+ "    except Exception:\n",
+ "        print(f'  ✗ {fn} not available, skipping')\n",
+ "\n",
+ "# Free RAM and disk: delete the downloaded source shards from the HF cache\n",
+ "del config; gc.collect()\n",
+ "cache_dir = os.path.expanduser('~/.cache/huggingface/hub')\n",
+ "if os.path.exists(cache_dir):\n",
+ "    for d in os.listdir(cache_dir):\n",
+ "        if d.startswith('models--'):\n",
+ "            shutil.rmtree(os.path.join(cache_dir, d), ignore_errors=True)\n",
+ "gc.collect()\n",
+ "print(f'\\n✅ Model prepared')\n",
+ "memlog('done')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#@title 4️⃣ Convert to .litertlm\n",
+ "from litert_torch.generative.export_hf import export as export_lib\n",
+ "\n",
+ "print('🚀 Converting to LiteRT-LM...')\n",
+ "print('   This takes 15-30 min.')\n",
+ "memlog('pre-export')\n",
+ "conversion_start = time.time()\n",
+ "\n",
+ "export_lib.export(\n",
+ "    model=MODEL_DIR,\n",
+ "    output_dir=OUTPUT_DIR,\n",
+ "    task='text_generation',\n",
+ "    bundle_litert_lm=True,\n",
+ "    quantization_recipe='dynamic_wi8_afp32',\n",
+ "    cache_length=4096,\n",
+ "    prefill_lengths=[256],\n",
+ "    use_jinja_template=True,\n",
+ "    keep_temporary_files=True,\n",
+ "    trust_remote_code=False,\n",
+ "    experimental_lightweight_conversion=True,\n",
+ "    externalize_embedder=True,\n",
+ ")\n",
+ "\n",
+ "print(f'\\n✅ Conversion finished in {(time.time()-conversion_start)/60:.1f} min')\n",
+ "memlog('post-export')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#@title 5️⃣ Verify and upload\n",
+ "litertlm = os.path.join(OUTPUT_DIR, 'model.litertlm')\n",
+ "\n",
+ "if not os.path.exists(litertlm):\n",
+ "    # List what the export produced to help diagnose the failure\n",
+ "    print('❌ model.litertlm not found. Files:')\n",
+ "    for r,d,fs in os.walk(OUTPUT_DIR):\n",
+ "        for f in fs:\n",
+ "            fp = os.path.join(r,f)\n",
+ "            print(f'  {os.path.relpath(fp,OUTPUT_DIR)}: {os.path.getsize(fp)/(1024**2):.1f} MB')\n",
+ "else:\n",
+ "    size_gb = os.path.getsize(litertlm) / (1024**3)\n",
+ "    print(f'📊 model.litertlm: {size_gb:.2f} GB')\n",
+ "    if size_gb <= 2.0: print('✅ Fits in 2 GB!')\n",
+ "    else: print(f'⚠️ {size_gb:.2f} GB: switch to dynamic_wi4_afp32 in cell 4')\n",
+ "\n",
+ "    print(f'\\n📤 Uploading to {OUTPUT_REPO}...')\n",
+ "    from huggingface_hub import HfApi\n",
+ "    api = HfApi(token=HF_TOKEN)\n",
+ "    try:\n",
+ "        api.create_repo(OUTPUT_REPO, exist_ok=True)\n",
+ "    except Exception as e:\n",
+ "        print(f'  ⚠️ create_repo failed, continuing: {e}')\n",
+ "\n",
+ "    api.upload_file(path_or_fileobj=litertlm,\n",
+ "                    path_in_repo='gemma-4-E2B-it-Uncensored-MAX.litertlm',\n",
+ "                    repo_id=OUTPUT_REPO, commit_message='Add LiteRT-LM model')\n",
+ "\n",
+ "    readme = f\"\"\"---\\nlicense: apache-2.0\\nbase_model:\\n- {SOURCE_MODEL}\\ntags:\\n - litert-lm\\n - uncensored\\n - edge-gallery\\nlanguage:\\n- en\\n---\\n\\n# gemma-4-E2B-it-Uncensored-MAX (LiteRT-LM)\\n\\nLiteRT-LM conversion for **Google AI Edge Gallery**.\\n\\n| | |\\n|---|---|\\n| **Base** | [{SOURCE_MODEL}](https://huggingface.co/{SOURCE_MODEL}) |\\n| **Format** | `.litertlm` |\\n| **Quant** | INT8 |\\n| **Context** | 4096 |\\n| **Size** | {size_gb:.2f} GB |\\n\\n## Usage\\n1. Install [Edge Gallery](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery)\\n2. Add model via HF URL\\n3. Chat!\\n\\n⚠️ Uncensored. Use responsibly.\\n\"\"\"\n",
+ "    api.upload_file(path_or_fileobj=readme.encode(), path_in_repo='README.md',\n",
+ "                    repo_id=OUTPUT_REPO, commit_message='README')\n",
+ "\n",
+ "    print(f'\\n🎉 DONE!')\n",
+ "    print(f'📱 https://huggingface.co/{OUTPUT_REPO}')\n",
+ "    print(f'📊 {size_gb:.2f} GB')\n",
+ "    print(f'⏱️ {(time.time()-start_time)/60:.0f} min total')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 🔧 Troubleshooting\n",
+ "\n",
+ "| Error | Fix |\n",
+ "|---|---|\n",
+ "| `KeyError: 'gemma4'` | Outdated `transformers`. Re-run cell 2️⃣ and restart the runtime |\n",
+ "| `No module 'torchao.quantization.pt2e'` | Outdated `torchao`. Re-run cell 2️⃣ and restart the runtime |\n",
+ "| OOM / out of memory | Use a **High-RAM** (hm) runtime |\n",
+ "| Model > 2 GB | Change `dynamic_wi8_afp32` → `dynamic_wi4_afp32` in cell 4️⃣ |\n",
+ "| `External embedder required` | Already handled by `externalize_embedder=True` |"
+ ]
+ }
+ ]
+ }
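The "Model > 2 GB" row in the troubleshooting table comes down to simple arithmetic on bits per weight. The sketch below is a back-of-the-envelope estimate under stated assumptions: it counts raw weight storage only (no embedder, metadata, or layers kept at higher precision), uses the ~1.4B parameter figure from the README, and the helper names `estimated_weight_size_gb` and `pick_recipe` are hypothetical. Only the two recipe strings come from the notebook.

```python
def estimated_weight_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Raw weight storage in GiB (same 1024**3 divisor the notebook uses)."""
    return n_params * bits_per_weight / 8 / 1024**3

def pick_recipe(size_gb: float, limit_gb: float = 2.0) -> str:
    """Keep INT8 if the file fits the limit, otherwise fall back to INT4."""
    return "dynamic_wi8_afp32" if size_gb <= limit_gb else "dynamic_wi4_afp32"

N_PARAMS = 1.4e9  # approximate figure quoted in the README
for bits in (8, 4):
    print(f"INT{bits}: ~{estimated_weight_size_gb(N_PARAMS, bits):.2f} GB of raw weights")

# Hypothetical measured file sizes, as cell 5 would report them.
for measured in (1.8, 2.3):
    print(f"{measured:.1f} GB -> {pick_recipe(measured)}")
```

The real file size should always be taken from the conversion output; the estimate is only useful for deciding which recipe to try first.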