Video-Text-to-Text
Transformers
Safetensors
English
soccer_qa_4b
soccer
video-qa
question-answering
vision-language
multimodal
sports-analysis
How to use sportsvision/soccer-qa-4b with Transformers:
    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("sportsvision/soccer-qa-4b", dtype="auto")
File size: 458 Bytes
Commit 0e37bb2
{
  "model_type": "soccer_qa_4b",
  "architectures": [
    "SoccerQA4BModel"
  ],
  "vision_dim": 1408,
  "projection_dim": 2048,
  "text_dim": 3072,
  "img_size": 256,
  "num_frames": 16,
  "max_length": 256,
  "temperature": 0.7,
  "imagenet_mean": [
    0.485,
    0.456,
    0.406
  ],
  "imagenet_std": [
    0.229,
    0.224,
    0.225
  ],
  "hidden_size": 3072,
  "vocab_size": 128257,
  "model_description": "Soccer video question answering model"
}
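The configuration does not say how its fields are consumed, so the following is only a minimal sketch of one plausible reading: `num_frames` frames are sampled evenly from a clip, each frame is resized to `img_size` x `img_size`, and pixels are standardized with `imagenet_mean` and `imagenet_std`. The helpers `sample_frame_indices` and `normalize_pixel`, the even-spacing strategy, the 480-frame clip, and the channel-first input shape are all assumptions made for illustration and are not taken from the model's code. The JSON is a subset of the file above, embedded so the sketch runs offline.

```python
import json

# Subset of the configuration shown above, embedded so the sketch runs offline.
CONFIG_JSON = """
{
  "vision_dim": 1408,
  "projection_dim": 2048,
  "text_dim": 3072,
  "img_size": 256,
  "num_frames": 16,
  "imagenet_mean": [0.485, 0.456, 0.406],
  "imagenet_std": [0.229, 0.224, 0.225],
  "hidden_size": 3072
}
"""


def sample_frame_indices(total_frames, num_frames):
    """Pick num_frames indices spread evenly over a clip (hypothetical helper)."""
    if total_frames <= 0:
        raise ValueError("clip has no frames")
    # Take the centre of each of num_frames equal-width segments.
    return [
        min(total_frames - 1, int((i + 0.5) * total_frames / num_frames))
        for i in range(num_frames)
    ]


def normalize_pixel(rgb, mean, std):
    """Scale 8-bit RGB to [0, 1], then standardize each channel (hypothetical helper)."""
    return [(c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std)]


cfg = json.loads(CONFIG_JSON)

# Assumed 480-frame clip, reduced to the configured number of frames.
indices = sample_frame_indices(total_frames=480, num_frames=cfg["num_frames"])

# A mid-grey pixel lands close to zero after ImageNet standardization.
pixel = normalize_pixel([124, 116, 104], cfg["imagenet_mean"], cfg["imagenet_std"])

# Assumed channel-first layout: (frames, channels, height, width).
input_shape = (cfg["num_frames"], 3, cfg["img_size"], cfg["img_size"])
```

In the file, `text_dim` and `hidden_size` are both 3072. The field names suggest that vision features of width `vision_dim` are projected through `projection_dim` to that text width, but the source does not confirm this.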