Georgios Mastrapas committed
Commit ddfa80b · 1 Parent(s): bd51ae6

chore: update arch graph

Files changed (2)
  1. README.md +2 -2
  2. assets/jvlm_architecture.png +2 -2
README.md CHANGED
@@ -54,7 +54,7 @@ inference: false
 
 [Blog](https://jina.ai/news/jina-vlm-small-multilingual-vision-language-model/) | API | AWS | Azure | GCP | [Arxiv](https://arxiv.org/abs/2512.04032)
 
-`jina-vlm` is a 2.4B parameter vision-language model that achieves state-of-the-art multilingual visual question answering among open 2B-scale VLMs. The model couples a SigLIP2 vision encoder with a Qwen3 language backbone through an attention-pooling connector that enables token-efficient processing of arbitrary-resolution images. Training data comprises approximately 5M multimodal samples and 12B text tokens across 29 languages, with roughly half in English and the remainder spanning high- and moderate-resource languages.
+`jina-vlm` is a token-efficient 2.4B parameter vision-language model that achieves state-of-the-art multilingual VQA performance among open 2B-scale VLMs. The model couples a SigLIP2 vision encoder with a Qwen3 language decoder and makes use of image tiling and attention-pooling for token-efficient processing of arbitrary-resolution images.
 
 ![jina-vlm architecture](./assets/jvlm_architecture.png)
 
@@ -993,4 +993,4 @@ If you find `jina-vlm` useful in your research, please cite our [technical repor
 
 ## License
 
-`jina-vlm` is licensed under CC BY-NC 4.0. For commercial usage inquiries, feel free to [contact us](https://jina.ai/contact-sales/).
+`jina-vlm` is licensed under CC BY-NC 4.0. For commercial usage inquiries, feel free to [contact us](https://jina.ai/contact-sales/).
assets/jvlm_architecture.png CHANGED

Git LFS Details

  • SHA256: 8941f6788e95e12904ac301bff2f37089a1b2421e2c44c4cffa1743a62a3915e
  • Pointer size: 131 Bytes
  • Size of remote file: 248 kB

Git LFS Details

  • SHA256: 6a336b7e3eda7a97ffbd2d1b79e6d0eeafc2320157027a7cec21a132adfbd707
  • Pointer size: 132 Bytes
  • Size of remote file: 2.84 MB
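
The `+2 -2` on the PNG and the two pointer sizes listed above are consistent with the Git LFS pointer format, in which the tracked file is a three-line text stub and only the `oid` and `size` lines change when the image is replaced. The sketch below renders such a pointer and measures its length. The helper name `lfs_pointer` is hypothetical, and the byte counts `248_000` and `2_840_000` are placeholder assumptions chosen to match the rounded sizes shown above, since the exact sizes are not given on this page.

```python
def lfs_pointer(sha256_hex: str, size_bytes: int) -> str:
    """Render a Git LFS pointer file (spec v1) for one object."""
    if len(sha256_hex) != 64:
        raise ValueError("expected a 64-character hex SHA-256 digest")
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size_bytes}\n"
    )


# SHA-256 values are copied from the listing above.
# The sizes are ASSUMED placeholders: the page only shows "248 kB" and "2.84 MB".
EXAMPLES = [
    ("8941f6788e95e12904ac301bff2f37089a1b2421e2c44c4cffa1743a62a3915e", 248_000),
    ("6a336b7e3eda7a97ffbd2d1b79e6d0eeafc2320157027a7cec21a132adfbd707", 2_840_000),
]

for digest, size in EXAMPLES:
    pointer = lfs_pointer(digest, size)
    # Fixed text accounts for 125 bytes; the rest is the decimal digits of the size.
    print(len(pointer.encode("utf-8")))
# prints 131, then 132
```

Under these assumptions the pointer grows by one byte only because the file size gains a digit (six digits versus seven), which would explain the 131-byte and 132-byte pointers reported above.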