Makarand Tapaswi

Wadhwani AI, IIIT Hyderabad

Hi! I am a Senior Machine Learning Scientist at Wadhwani AI and an Assistant Professor in the Computer Vision group at IIIT Hyderabad, India.

At Wadhwani AI, we are developing AI solutions that create social impact. My primary project is estimating the weight of newborns from video, with the goal of empowering primary healthcare workers and facilities to improve the lives of at-risk, low-birth-weight babies.

At IIIT, I continue to work on projects at the intersection of video and language understanding, especially related to analyzing stories.

news [archives]

Jul 2022 One paper accepted to ECCV 2022! In another work on vision-and-language navigation, we show that unlabeled 3D environments can be repurposed to generate meaningful training data with pseudo 3D object labels and GPT-2 based captions. ArXiv coming soon
Jun 2022 One paper accepted to IROS 2022! We show that modeling physics as a differentiable ODE allows us to dramatically improve the performance of 3D approximate trajectory reconstruction in Real2Sim. This also removes the need for expensive RL as trajectories can be re-targeted directly to the robot. ArXiv coming soon
Mar 2022 One paper accepted to CVPR 2022! Vision-and-language navigation can benefit strongly from graph transformers. ArXiv
Jan 2022 Promoted to Senior ML Scientist at Wadhwani AI.
Nov 2021 Gave the keynote talk at a really interesting workshop on media understanding focusing on context and environment. Hosted by Google and USC’s Center for Computational Media Intelligence (CCMI).
Oct 2021 Happy to have given a talk at Adobe Research Bengaluru a few days ago! Some exciting work on document processing there.
Sep 2021 One paper on long-tail image classification accepted to ICVGIP 2021. Rather than re-sampling from the “tail class”, we adapt a recent few-shot learning work to analyze the impact of feature generation.
Jul 2021 One paper accepted to ICCV 2021! We propose in-domain, self-supervised pretraining using Airbnb listings to improve Vision-and-Language Navigation models. ArXiv Github
Jul 2021 Launched new website based on the al-folio theme. Time to say goodbye to my old self-made Jinja+Python website and embrace Liquid+Jekyll!
Jul 2021 Excited to join IIIT Hyderabad as an Assistant Professor!