CCCCyx committed on
Commit 3411f0a · verified · 1 Parent(s): 9b57d0f

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -40,8 +40,8 @@ Specifically, the pretraining pipeline is structured into the following four pro
 
  ### ✨ Highlights
 
- - 📐 **Native Dynamic Resolution** MOSS-VL-Base-0408 natively processes images and video frames at their original aspect ratios and resolutions without forced resizing, padding, or cropping. By preserving the raw spatial layout, it faithfully captures fine visual details across diverse formats—from high-resolution photographs and dense document scans to ultra-wide screenshots.
- - 🎞️ **Native Interleaved Image & Video Inputs** The model accepts arbitrary combinations of images and videos within a single sequence. Through a unified end-to-end pipeline, it seamlessly handles complex mixed-modality prompts, multi-image comparisons, and interleaved visual narratives without requiring modality-specific pre-processing or separate routing logic.
+ - 📐 **Native Dynamic Resolution** MOSS-VL-Base-0408 natively processes images and video frames at their original aspect ratios and resolutions. By preserving the raw spatial layout, it faithfully captures fine visual details across diverse formats—from high-resolution photographs and dense document scans to ultra-wide screenshots.
+ - 🎞️ **Native Interleaved Image & Video Inputs** The model accepts arbitrary combinations of images and videos within a single sequence. Through a unified end-to-end pipeline, it seamlessly handles complex mixed-modality prompts, multi-image comparisons, and interleaved visual narratives without requiring modality-specific pre-processing or separate routing logic.
 
 
  ## 🏗 Model Architecture