Instructions for using 5dimension/sentinel-universal-tokenizer with libraries and local apps such as vLLM, SGLang, and Docker Model Runner.

Libraries

How to use 5dimension/sentinel-universal-tokenizer with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="5dimension/sentinel-universal-tokenizer")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("5dimension/sentinel-universal-tokenizer", dtype="auto")

Local Apps

vLLM

How to use 5dimension/sentinel-universal-tokenizer with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "5dimension/sentinel-universal-tokenizer"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "5dimension/sentinel-universal-tokenizer",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
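
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `send` are hypothetical, and the endpoint path and JSON fields are copied from the curl example. It assumes a server is already listening (port 8000 for vLLM as shown here, port 30000 for the SGLang commands).

```python
import json
import urllib.request

MODEL_ID = "5dimension/sentinel-universal-tokenizer"


def build_completion_request(base_url, prompt, model=MODEL_ID,
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint.

    Hypothetical helper: the payload mirrors the fields of the curl example.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a server running, `send(build_completion_request("http://localhost:8000", "Once upon a time,"))` should return the decoded JSON response. Building the request separately from sending it keeps the payload easy to inspect without a live server.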

Use Docker

docker model run hf.co/5dimension/sentinel-universal-tokenizer

SGLang

How to use 5dimension/sentinel-universal-tokenizer with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "5dimension/sentinel-universal-tokenizer" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "5dimension/sentinel-universal-tokenizer",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "5dimension/sentinel-universal-tokenizer" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "5dimension/sentinel-universal-tokenizer",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use 5dimension/sentinel-universal-tokenizer with Docker Model Runner:
```
docker model run hf.co/5dimension/sentinel-universal-tokenizer
```

sentinel-universal-tokenizer / sentinel_manifold.json

5dimension

🦴 v2.0: 65K text vocab, 30 languages, 300K+ samples

3824578 verified about 2 months ago

1.29 kB

{
  "version": "2.0.0",
  "framework": "Sentinel Manifold",
  "theorem": "lim F'(z)/F(z) = 1/e",
  "function": "F(z) = \u03a3 z^n/n^n",
  "text_vocab": 65536,
  "image_codebook": 16384,
  "audio_codebook": 8192,
  "video_codebook": 4096,
  "total_vocab": 94208,
  "training_languages": 30,
  "training_samples": 287600,
  "training_chars": 465942294,
  "constants": {
    "INV_E": 0.36787944117144233,
    "C1": -0.007994021805952546,
    "C2": 0.00020005604296784437
  },
  "benchmark": {
    "Sentinel-v2": {
      "compress": 4.3427,
      "fertility": 10.5022,
      "vocab": 94208,
      "efficiency": 0.046097
    },
    "GPT-2": {
      "compress": 2.4381,
      "fertility": 28.8158,
      "vocab": 50257,
      "efficiency": 0.048513
    },
    "Gemma": {
      "compress": 5.3287,
      "fertility": 8.348,
      "vocab": 256000,
      "efficiency": 0.020815
    },
    "Qwen2": {
      "compress": 4.3289,
      "fertility": 10.4499,
      "vocab": 151936,
      "efficiency": 0.028491
    }
  },
  "modality_architecture": {
    "text": "ByteLevel BPE (65,536), NFKC, 30 languages",
    "image": "VQ codebook (16,384), Cosmos/VQGAN/FSQ compatible",
    "audio": "VQ codebook (8,192), EnCodec/SoundStream compatible",
    "video": "VQ codebook (4,096), Cosmos-DV compatible"
  }
}
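
Several numeric fields in the manifest can be cross-checked against each other. The sketch below is not part of the repository, and all function names are hypothetical. It rests on three assumptions: `total_vocab` is the sum of the four per-modality sizes; `efficiency` equals `compress / vocab * 1000` (a relationship inferred from the numbers, not documented in the file); and the stated limit is taken as z grows large, with the series starting at n = 1. None of these is stated in the manifest itself.

```python
import math

# Values copied from sentinel_manifold.json.
SIZES = {"text_vocab": 65536, "image_codebook": 16384,
         "audio_codebook": 8192, "video_codebook": 4096}
TOTAL_VOCAB = 94208
INV_E = 0.36787944117144233
BENCHMARK = {
    "Sentinel-v2": {"compress": 4.3427, "vocab": 94208, "efficiency": 0.046097},
    "GPT-2": {"compress": 2.4381, "vocab": 50257, "efficiency": 0.048513},
    "Gemma": {"compress": 5.3287, "vocab": 256000, "efficiency": 0.020815},
    "Qwen2": {"compress": 4.3289, "vocab": 151936, "efficiency": 0.028491},
}


def summed_vocab(sizes):
    """Add the text vocabulary and the three codebook sizes."""
    return sum(sizes.values())


def inferred_efficiency(compress, vocab):
    """Assumed formula: compression ratio per 1,000 vocabulary entries."""
    return compress / vocab * 1000.0


def log_derivative_ratio(z):
    """Approximate F'(z)/F(z) for F(z) = sum over n >= 1 of z^n / n^n.

    Terms are handled in log space and rescaled by the largest one so that
    large z does not overflow. Terms beyond n = 2z are negligible and dropped.
    """
    n_terms = int(2 * z) + 50
    log_terms = [n * (math.log(z) - math.log(n)) for n in range(1, n_terms + 1)]
    peak = max(log_terms)
    weights = [math.exp(t - peak) for t in log_terms]
    # F'(z) = sum of n * z^(n-1) / n^n = (1/z) * sum of n * term_n
    f_prime = sum(n * w for n, w in enumerate(weights, start=1)) / z
    return f_prime / sum(weights)
```

Under these assumptions, `summed_vocab(SIZES)` can be compared with `TOTAL_VOCAB`, each benchmark row with `inferred_efficiency`, and `log_derivative_ratio(z)` with `INV_E` for increasing z. Agreement only shows that the file is internally consistent; it says nothing about how the benchmark figures were measured.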