2025 02 27 | Makarand Tapaswi

One paper accepted to CVPR 2025 on benchmarking video-language models for their ability to understand compositionality! We propose a strict form of video-language entailment that is amenable to modern VLMs. Try it out: arXiv, HuggingFace