Video-Text-to-Text
Safetensors
English
llama
multimodal
fine-grained
instance-understanding
Eval Results (legacy)