DarrenJiaImbue/gemma-4-e4b-it-bouncer-litertlm

	---
	language: en
	library_name: litert-lm
	tags:
	- on-device
	- gemma
	- gemma-4
	- ai-text-detection
	- bouncer
	---

	# Gemma 4 E4B IT — Bouncer on-device classifier

	This repository contains the `.litertlm` bundle and hot-swappable LoRA
	adapter for the
	[imbue-ai/bouncer-private](https://github.com/imbue-ai/bouncer-private)
	iOS app. Both files are intended to be consumed by a forked LiteRT-LM
	runtime (`millanatimbue/LiteRT-LM` @ `expose-aux-tensor-outputs`).

	## Contents

	| File | Size | Purpose |
	|---|---|---|
	| `model.litertlm` | 3.9 GB | Gemma 4 E4B IT base + classifier head + LoRA tensor input slots |
	| `lora_adapter.tflite` | 18 MB | Attention-only LoRA (rank=8), hot-swapped at session creation |
	| `tokenizer.json`, `tokenizer_config.json` | | Reference copies (also embedded in `model.litertlm`) |

	## How it's used

	```swift
	let conv = try await engine.createConversation(with: cfg)
	try conv.setScopedLoraFile(loraAdapterURL) // hot-swap LoRA
	// Prefill plus a single decode step is enough to populate the classifier output.
	try await conv.sendMessage(
	    .text(text),
	    optionalArgs: .init(maxOutputTokens: 1)
	)
	let logits = try conv.getAuxiliaryOutput(name: "classifier_logits")
	// argmax(logits) → bucket 0 (human) ... bucket 3 (AI)
	```
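The `argmax` step in the comment above can be sketched in plain Swift. This is a minimal illustration, not part of the forked runtime: it assumes the logits have already been copied into a `[Float]` with four entries (the actual return type of `getAuxiliaryOutput` is not documented here), and the helper names `argmax` and `softmax` are hypothetical.

```swift
import Foundation

/// Numerically stable softmax over raw logits (hypothetical helper).
func softmax(_ logits: [Float]) -> [Float] {
    guard let maxLogit = logits.max() else { return [] }
    // Subtract the maximum so exp() cannot overflow.
    let exps = logits.map { exp($0 - maxLogit) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

/// Index of the largest value, or nil for empty input (hypothetical helper).
func argmax(_ values: [Float]) -> Int? {
    guard !values.isEmpty else { return nil }
    var best = 0
    for i in 1..<values.count where values[i] > values[best] {
        best = i
    }
    return best
}

// Example logits; in the app these would come from "classifier_logits".
let exampleLogits: [Float] = [0.1, 2.0, -1.0, 0.5]
let bucket = argmax(exampleLogits)      // bucket 0 (human) ... bucket 3 (AI)
let probabilities = softmax(exampleLogits)
```

Applying `softmax` is optional. It is useful only if the app wants a confidence score next to the bucket index, since `argmax` over the raw logits gives the same bucket.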

	For plain generation (no classification), skip the `setScopedLoraFile`
	call: the LoRA inputs then default to zero and the model runs as the
	base instruction-tuned (IT) model.
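Why zeroed LoRA inputs leave the base model unchanged can be seen from the usual LoRA formulation, where a projection computes `W x + scale * B (A x)`. The sketch below is a toy numeric illustration under that assumption. The function names, the `scale` parameter, and the matrix shapes are hypothetical and say nothing about how the bundle wires its tensors internally.

```swift
/// Dense matrix-vector product; `m` holds one inner array per output row.
func matVec(_ m: [[Float]], _ x: [Float]) -> [Float] {
    m.map { row in zip(row, x).reduce(Float(0)) { $0 + $1.0 * $1.1 } }
}

/// y = W x + scale * B (A x): base projection plus a low-rank LoRA update.
func loraProjection(w: [[Float]], a: [[Float]], b: [[Float]],
                    scale: Float, x: [Float]) -> [Float] {
    let base = matVec(w, x)
    let delta = matVec(b, matVec(a, x))
    return zip(base, delta).map { $0 + scale * $1 }
}

let w: [[Float]] = [[1, 2], [3, 4]]
let x: [Float] = [1, 1]

// Rank-1 adapter filled with zeros: the update term vanishes.
let zeroA: [[Float]] = [[0, 0]]
let zeroB: [[Float]] = [[0], [0]]
let baseOnly = loraProjection(w: w, a: zeroA, b: zeroB, scale: 1, x: x)

// A non-zero adapter shifts the output away from the base projection.
let adapted = loraProjection(w: w, a: [[1, 0]], b: [[1], [0]], scale: 2, x: x)
```

With the zero adapter, `baseOnly` equals `matVec(w, x)`, matching the statement that the model then behaves as the base IT model.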

	## Build details

	- Quantization: `gemma4_mixed48` (Google's recommended Gemma 4 mixed
	int4/int8 recipe; same family as upstream
	`litert-community/gemma-4-E4B-it-litert-lm`)
	- Cache length: 1024 tokens (matches iOS `EngineConfig.maxNumTokens`)
	- Source: `google/gemma-4-E4B-it` text decoder, fine-tuned with PEFT
	attention-only LoRA + a 4-class NormedLinear head (LayerNorm +
	Linear) over the last input token's hidden state.
	- Conversion: forked `google-ai-edge/litert-torch` with Gemma 4
	`classifier_head` and LoRA-input wiring (see
	[EditLens](https://github.com/pangramlabs/EditLens) for the
	converter patches).
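The classifier head described above (LayerNorm followed by a Linear layer over the last input token's hidden state) can be sketched as follows. This is a reference-style illustration only: the epsilon value, the presence of affine LayerNorm parameters (`gamma`, `beta`) and of a bias term, and all function names are assumptions, and the toy example uses a hidden size of 4 purely for readability.

```swift
import Foundation

/// LayerNorm over a single vector (assumed eps = 1e-5, affine gamma/beta).
func layerNorm(_ x: [Float], gamma: [Float], beta: [Float],
               eps: Float = 1e-5) -> [Float] {
    let n = Float(x.count)
    let mean = x.reduce(0, +) / n
    let squaredDiffs = x.map { ($0 - mean) * ($0 - mean) }
    let variance = squaredDiffs.reduce(0, +) / n
    let invStd = 1 / (variance + eps).squareRoot()
    return x.indices.map { (x[$0] - mean) * invStd * gamma[$0] + beta[$0] }
}

/// LayerNorm + Linear: maps one hidden-state vector to one logit per class.
/// `weight` holds one row per class; `bias` holds one value per class.
func classifierLogits(hidden: [Float], gamma: [Float], beta: [Float],
                      weight: [[Float]], bias: [Float]) -> [Float] {
    let normed = layerNorm(hidden, gamma: gamma, beta: beta)
    return weight.indices.map { i in
        zip(weight[i], normed).reduce(bias[i]) { $0 + $1.0 * $1.1 }
    }
}

// Toy example: hidden size 4, 4 classes.
let hidden: [Float] = [1, 2, 3, 4]
let ones = [Float](repeating: 1, count: 4)
let zeros = [Float](repeating: 0, count: 4)
let headWeight = [[Float]](repeating: ones, count: 4)
let headBias: [Float] = [0, 1, 2, 3]
let headLogits = classifierLogits(hidden: hidden, gamma: ones, beta: zeros,
                                  weight: headWeight, bias: headBias)
```

In the shipped bundle this computation is baked into `model.litertlm` and surfaced through the `classifier_logits` auxiliary output, so app code should not need to reimplement it. The sketch may still help when checking converted outputs against a reference.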