---
tags: [eoq, quantized, entropy-coding]
base_model: Qwen/Qwen3.5-4B
license: apache-2.0
---

# Qwen3.5-4B EOQ Q5

Quantized with **EOQ (Entropy-Optimal Quantization)**: absmax Q5 + rANS entropy coding.

## Benchmark (RTX PRO 6000 Blackwell 96GB VRAM)

| Format | Size | PPL (WikiText-2) | tok/s |
|--------|------|------------------|-------|
| FP16 | 8412 MB | 7.58 | 54.0 |
| GGUF Q4_K_M | 2709 MB | ~ref | ~ref |
| **EOQ Q5** | **2398 MB** | **7.77** | **54.7** |

EOQ Q5 is **11.5% smaller** than GGUF Q4_K_M. PPL degradation vs FP16: +0.19 points.

## Cross-Model Validation

| Model | FP16 PPL | EOQ Q5 Size | EOQ Q5 PPL | Delta |
|-------|----------|-------------|------------|-------|
| Qwen2.5-0.5B | 10.87 | 279 MB | 11.69 | +0.83 |
| Qwen2.5-3B | 6.54 | 1,724 MB | 6.77 | +0.23 |
| Qwen3.5-4B | 7.58 | 2,398 MB | 7.77 | +0.18 |
| Qwen3.5-27B | 5.65 | 15,353 MB | 5.94 | +0.31 |
| Qwen3.5-35B-A3B | 5.19 | 19,680 MB | 5.39 | +0.21 |

## Inference Speed

EOQ models are stored as dequantized FP16 safetensors. Inference speed is **identical to FP16** (no quantized kernels). The EOQ advantage is **smaller file size** at comparable quality, not speed.

Measured: 54.7 tok/s, effectively the same as FP16 (54.0 tok/s).

## Usage

## What is EOQ?

EOQ combines block-wise absmax quantization with rANS entropy coding. It is a simple quantization scheme that matches complex GGUF K-quants in quality per byte.

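
To make the two ingredients concrete, here is a minimal sketch of block-wise absmax quantization to 5-bit codes, followed by a Shannon entropy estimate of the resulting code stream. The entropy is the bits-per-weight figure that an entropy coder such as rANS can approach, which is why the coded file can come in below a flat 5 bits per weight. This is not the repository's implementation: the block size of 32, the symmetric code range of -15 to 15, the synthetic Gaussian weights, and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter

BITS = 5                      # Q5: 5-bit codes
QMAX = 2 ** (BITS - 1) - 1    # 15; assumed symmetric range [-15, 15]
BLOCK = 32                    # hypothetical block size, not stated in this README


def quantize_block(block):
    """Absmax-quantize one block of floats; returns (scale, integer codes)."""
    absmax = max(abs(x) for x in block)
    scale = absmax / QMAX if absmax > 0 else 1.0
    # Round to the nearest code and clamp to guard against float round-off.
    codes = [max(-QMAX, min(QMAX, round(x / scale))) for x in block]
    return scale, codes


def dequantize_block(scale, codes):
    """Map integer codes back to approximate float weights."""
    return [scale * c for c in codes]


def entropy_bits_per_symbol(codes):
    """Shannon entropy of the code histogram, in bits per symbol."""
    counts = Counter(codes)
    total = len(codes)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def demo(num_weights=4096, seed=0):
    """Quantize synthetic Gaussian weights; return (entropy, max error in code units)."""
    rng = random.Random(seed)
    weights = [rng.gauss(0.0, 0.02) for _ in range(num_weights)]
    all_codes = []
    max_err = 0.0
    for start in range(0, num_weights, BLOCK):
        block = weights[start:start + BLOCK]
        scale, codes = quantize_block(block)
        restored = dequantize_block(scale, codes)
        # Error measured in units of the block scale (at most half a step).
        err = max(abs(a - b) / scale for a, b in zip(block, restored))
        max_err = max(max_err, err)
        all_codes.extend(codes)
    return entropy_bits_per_symbol(all_codes), max_err


if __name__ == "__main__":
    bits, err = demo()
    print(f"entropy: {bits:.3f} bits/weight (flat coding would use {BITS})")
    print(f"max rounding error: {err:.3f} quantization steps")
```

Because absmax scaling pins only the largest weight in each block to the extreme codes, most weights fall on codes near zero. That skewed histogram is what entropy coding exploits. A real encoder would also need to store the per-block scales and the symbol frequency table, which this sketch ignores.
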
## GitHub

https://github.com/caiovicentino/eoq-quantization