---
library_name: transformers
tags: []
---

## Model Details

- Model: ReasonCLIP-L14-336-S0-Rea
- Base model: [openai/clip-vit-large-patch14-336](https://huggingface.co/openai/clip-vit-large-patch14-336)
- Architecture: CLIP ViT-L/14
- Image resolution: 336
- Training stage: Stage 0 - Reasoning
- Training data: Only reasoning caption-image pairs from [ReasonLite-42M](https://huggingface.co/datasets/RISys-Lab/ReasonCLIPLite-42M) and [ReasonPro-16M](https://huggingface.co/datasets/RISys-Lab/ReasonCLIPPro-16M)

## Usage

```python
from transformers import CLIPModel, CLIPProcessor

model_id = "fesvhtr/ReasonCLIP-L14-336-S0-Rea"
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)
```

For the full checkpoint list, see the [ReasonCLIP model card](https://github.com/RISys-Lab/ReasonCLIP/blob/main/doc/model_card.md).
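
Once `model` and `processor` are loaded as in the Usage snippet, the checkpoint can be used like any other `CLIPModel` to score captions against an image. The following is a minimal sketch, not an official evaluation script: the helper name `rank_captions`, the image path, and the example captions are hypothetical and introduced only for illustration. It assumes PyTorch is installed.

```python
import torch


def rank_captions(model, processor, image, captions):
    """Return (caption, probability) pairs for one image, best match first.

    `rank_captions` is a hypothetical helper, not part of ReasonCLIP or transformers.
    """
    # Tokenize the captions and preprocess the image in a single call.
    # truncation=True clips captions that exceed CLIP's text context length.
    inputs = processor(
        text=captions,
        images=image,
        return_tensors="pt",
        padding=True,
        truncation=True,
    )

    # Inference only, so gradient tracking is disabled.
    with torch.no_grad():
        outputs = model(**inputs)

    # logits_per_image has shape (num_images, num_captions);
    # softmax over the caption axis gives one probability per caption.
    probs = outputs.logits_per_image.softmax(dim=-1)[0].tolist()

    return sorted(zip(captions, probs), key=lambda pair: pair[1], reverse=True)


# Example call (assumes Pillow is installed; path and captions are placeholders):
#
#   from PIL import Image
#   image = Image.open("example.jpg")
#   captions = ["a dog chasing a ball", "a cat sleeping on a sofa"]
#   for caption, prob in rank_captions(model, processor, image, captions):
#       print(f"{prob:.3f}  {caption}")
```

The probabilities are relative to the captions supplied in the call, so they indicate which candidate fits best rather than an absolute match score.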