Video-Text-to-Text
Transformers
Safetensors
English
soccer_qa_4b
soccer
video-qa
question-answering
vision-language
multimodal
sports-analysis
How to use sportsvision/soccer-qa-4b with Transformers:
    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("sportsvision/soccer-qa-4b", dtype="auto")
{
  "model_type": "soccer_qa_4b",
  "architectures": [
    "SoccerQA4BModel"
  ],
  "vision_dim": 1408,
  "projection_dim": 2048,
  "text_dim": 3072,
  "img_size": 256,
  "num_frames": 16,
  "max_length": 256,
  "temperature": 0.7,
  "imagenet_mean": [
    0.485,
    0.456,
    0.406
  ],
  "imagenet_std": [
    0.229,
    0.224,
    0.225
  ],
  "hidden_size": 3072,
  "vocab_size": 128257,
  "model_description": "Soccer video question answering model"
}
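The preprocessing fields in this config (`num_frames`, `img_size`, `imagenet_mean`, `imagenet_std`) suggest that each clip is reduced to 16 frames of 256 pixels per side and standardised per channel. The sketch below illustrates how such values are commonly applied. It is not taken from the model's code: uniform frame sampling is an assumption, and the helper names `sample_frame_indices` and `normalize_pixel` are hypothetical.

```python
# Values copied from the config above; everything else is illustrative.
CONFIG = {
    "img_size": 256,
    "num_frames": 16,
    "imagenet_mean": [0.485, 0.456, 0.406],
    "imagenet_std": [0.229, 0.224, 0.225],
}


def sample_frame_indices(total_frames, num_frames):
    """Pick num_frames indices spread evenly over a clip (assumed strategy).

    The clip is split into num_frames equal segments and the frame at the
    centre of each segment is chosen. Clips shorter than num_frames simply
    repeat frames.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    return [
        min(total_frames - 1, int((i + 0.5) * total_frames / num_frames))
        for i in range(num_frames)
    ]


def normalize_pixel(rgb, mean, std):
    """Scale an 8-bit RGB pixel to [0, 1], then standardise each channel."""
    return [(c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std)]


if __name__ == "__main__":
    # A 160-frame clip reduced to the configured number of frames.
    print(sample_frame_indices(160, CONFIG["num_frames"]))
    # A pure white pixel after ImageNet-style standardisation.
    print(normalize_pixel([255, 255, 255],
                          CONFIG["imagenet_mean"], CONFIG["imagenet_std"]))
```

In a real pipeline the same normalisation would be applied to whole frames after resizing them to `img_size` x `img_size`. The per-pixel version here only keeps the arithmetic visible.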