Makarand Tapaswi

Hi! I am a Principal Machine Learning Scientist at Wadhwani AI, a non-profit focused on using AI for Social Good, and an Assistant Professor in the Computer Vision group at IIIT Hyderabad, India.
At Wadhwani AI, we are developing AI solutions that create social impact. In particular, I work on several projects in education and MNCH (maternal, newborn, and child health).
At IIIT, I continue to work on projects at the intersection of video and language understanding, especially related to analyzing stories.
News [archives]
Feb 2025 | One paper accepted to CVPR 2025 on benchmarking video-language models for their ability to understand compositionality! We propose a strict form of video-language entailment that is amenable to modern VLMs. Try it out! arXiv, HuggingFace |
Jan 2025 | Are LLMs good at resolving coreference between people in complicated stories? Our benchmark paper, IdentifyMe, accepted to NAACL 2025, indicates not! arXiv |
Jan 2025 | Our paper on building generalizable models to detect pathologies in Chest X-rays is accepted to ISBI 2025! The extended paper is on arXiv. |
Jan 2025 | Promoted to Principal ML Scientist at Wadhwani AI. Happy to have the opportunity to contribute to various projects on social impact 🙏 |
Dec 2024 | Our paper on studying the properties of a container simply by listening to sounds of pouring water is accepted to ICASSP 2025! Check out the project page or the extended paper on arXiv. |
Dec 2024 | Congratulations to Darshan for successfully defending his thesis and completing the MS by Research program! Through difficult times, Darshan has persevered and produced some of the best work of our group. |
Dec 2024 | Our paper on fine-grained image captioning is accepted to TMLR! Important work with many interesting findings that reveal a lot about image captioning systems. Check out the project page, arXiv, or Manu’s Twitter thread. |
Oct 2024 | Our paper on predicting a video’s memorability, which also explores where humans and models look while making that prediction, is accepted to WACV 2025! arXiv. This is our first collaboration with Dr. Vishnu Sreekumar’s group, which does interesting work on memory! |
Sep 2024 | Our paper on Major Entity Identification is accepted to EMNLP 2024! arXiv. |
Sep 2024 | Happy to be serving as Area Chair for CVPR 2025! |