Kaustav Kundu
Senior Applied Scientist, Amazon Web Services AI Labs
kaustavk-at-amazon-dot-com

I am a Senior Applied Scientist at Amazon Web Services (AWS) AI Labs, where I work on Computer Vision and Machine Learning. My research interest is building multi-modal models with limited supervision that can be used across diverse in-domain and out-of-domain scenarios and can reason about their environment.

I did my PhD (ABD) in the Department of Computer Science at the University of Toronto, where I worked with Professor Raquel Urtasun and Professor Sanja Fidler. My thesis topic was Efficient Search Strategies in 3D for Visual Scene Understanding. I completed my MS at the Toyota Technological Institute at Chicago (Sept ’12 - Dec ’13) under the supervision of Professor Raquel Urtasun.

Academia

University of Toronto
2014 - 2017
Ph.D. (ABD) Computer Science
Advisors - Professor Raquel Urtasun and Professor Sanja Fidler.
Toyota Technological Institute at Chicago
2012 - 2013
MS Computer Science
Advisor - Professor Raquel Urtasun
IIIT Hyderabad, India
2008 - 2012
B.Tech (Hons.) Computer Science and Engineering
Honours project at Centre for Visual Information Technology (CVIT) lab, advised by Professor P.J. Narayanan.

News

  • [12/23] Our work on image generation was featured in multiple news articles - Bloomberg, TechCrunch, The Verge, etc.
  • [06/21] Outstanding reviewer at CVPR.
  • [05/21] Our work on backward compatibility was featured on the TWIML podcast, on the BetterML blog, and in the news.
  • [11/20] Outstanding reviewer at CVPR.
  • [06/17] Received a Best Paper Honorable Mention award for our Polygon-RNN paper at CVPR.

Publications

Hierarchical Self-supervised Representation Learning for Movie Understanding, 2022, CVPR
Fanyi Xiao, Kaustav Kundu, Joseph Tighe, Davide Modolo
Id-Free Person Similarity Learning, 2022, CVPR
Bing Shuai, Xinyu Li, Kaustav Kundu, Joseph Tighe
TubeR: Tubelet Transformer for Video Action Detection, 2022, CVPR (oral)
Jiaojiao Zhao, Yanyi Zhang, Xinyu Li, Hao Chen, Bing Shuai, Mingze Xu, Chunhui Liu, Kaustav Kundu, Yuanjun Xiong, Davide Modolo, Ivan Marsic, Cees GM Snoek, Joseph Tighe
What to Look at and Where: Semantic and Spatial Refined Transformer for Detecting Human-Object Interactions, 2022, CVPR (oral)
ASM Iftekhar, Hao Chen, Kaustav Kundu, Xinyu Li, Joseph Tighe, Davide Modolo
Positive-congruent training: Towards regression-free model updates, 2021, CVPR (oral)
Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto
Exploiting weakly supervised visual patterns to learn from partial annotations, 2020, NeurIPS
Kaustav Kundu, Erhan Bas, Michael Lam, Hao Chen, Davide Modolo, Joseph Tighe
Pose Estimation for Objects with Rotational Symmetry, 2018, IROS
Enric Corona, Kaustav Kundu, Sanja Fidler
SurfConv: Bridging 3D and 2D Convolution for RGBD Images, 2018, CVPR
Hang Chu, Wei-Chiu Ma, Kaustav Kundu, Raquel Urtasun, Sanja Fidler
3D Object Proposals using Stereo Imagery for Accurate Object Class Detection, 2017, TPAMI
Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Humin Ma, Sanja Fidler, Raquel Urtasun
Annotating Object Instances with a Polygon-RNN, 2017, CVPR (oral) [Best Paper Honorable Mention award]
Lluís Castrejón, Kaustav Kundu, Raquel Urtasun, Sanja Fidler
Exploiting Semantic Information and Deep Matching for Optical Flow, 2016, ECCV
Min Bai, Wenjie Luo, Kaustav Kundu, Raquel Urtasun
Monocular 3D Object Detection for Autonomous Driving, 2016, CVPR
Xiaozhi Chen, Kaustav Kundu, Ziyu Zhang, Humin Ma, Sanja Fidler, Raquel Urtasun
3D Object Proposals for Accurate Object Class Detection, 2015, Journal of Machine Learning
Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Andrew Berneshawi, Humin Ma, Sanja Fidler, Raquel Urtasun
Rent3D: Floor-Plan Priors for Monocular Layout Estimation, 2015, CVPR (oral)
Chenxi Liu, Alexander Schwing, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

Patents

System and method for vision-based event detection, 2024, US Patent 11869065
Jayan Eledath, Nikhil Chacko, Alessandro Bergamo, Kaustav Kundu, Marian George, Jingjing Liu, Nishit Desai, Pahal Dalal, Keshav Tripathi
Detecting interactions with non-discretized items and associating interactions with actors using digital images, 2023, US Patent 11580785
Kaustav Kundu, Pahal Dalal, Nishit Desai, Jayan Eledath, Geoffrey Franz, Gerard Medioni, Hoi Cheung Pang, Rakesh Ramakrishnan
Content moderation using object detection and image classification, 2022, US Patent 11423265
Hao Chen, Hao Wu, Hao Li, Michael Lam, Xinyu Li, Kaustav Kundu, Meng Wang, Joseph Tighe, Rahul Bhotika