Himangi Mittal

I am a Ph.D. student in the Robotics Institute (RI) at Carnegie Mellon University (CMU), working with Prof. Shubham Tulsiani. I received my Master of Science in Robotics (MSR) from the RI at CMU, where I worked with Prof. Abhinav Gupta and collaborated with Prof. Pedro Morgado at UW-Madison. Previously, I was a Research Assistant at CMU with Prof. David Held at the R-Pad Lab, in collaboration with the Pittsburgh-based autonomous driving company Argo AI.

During my Master's at CMU, I worked on self-supervised representation learning methods for multimodal audio-visual videos. As a Research Assistant at CMU, I worked on self-supervised algorithms for 3D LiDAR point clouds.

I served on the organizing committee of WiCV@CVPR 2024, the DEI Social Event, and the Challenges/Opportunities for ECRs in Fast Paced AI Social Event!

Email  /  CV  /  Google Scholar  /  Twitter  /  Github  /  Linkedin

News
  • June 2024: Member of the organizing committee of WiCV@CVPR 2024, the DEI Social Event, and the Challenges/Opportunities for ECRs in Fast Paced AI Social Event!
  • February 2024: Paper accepted at CVPR 2024.
  • August 2023: Started my Ph.D. in the Robotics Institute (RI) at Carnegie Mellon University (CMU).
  • May 2023: Started research internship at Honda Research Institute (HRI), San Jose, California.
  • Jan 2023: Teaching Assistant for 16-825: Learning for 3D Vision.
  • Sep 2022: Paper accepted at NeurIPS 2022.
  • Oct 2021: Paper accepted at BMVC 2021 (Oral).
  • Apr 2021 - Dec 2021: I will be serving as a reviewer for ICCV 2021, AAAI 2022, WACV 2022, and CVPR 2022.
  • Aug 2021: Journal paper accepted in PAA (in collaboration with Robert Bosch, India).
  • Feb 2021: Accepted as a Master of Science in Robotics (MSR) student at Carnegie Mellon University for Fall 2021.
  • July 2020: Presented a short paper at RSS Workshop on Self-Supervised Robot Learning 2020.
  • Feb 2020: Paper accepted at CVPR 2020 (Oral).
Research

I am interested in self-supervised learning, multi-modal machine learning, video representation learning, generative models, point clouds and autonomous driving.

Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo, Kwonjoon Lee
[CVPR 2024]
Paper / Arxiv

We leverage a large video-language model to anticipate action sequences that are plausible in the real world. To teach the model what makes an action sequence plausible, we introduce two objective functions: a counterfactual-based plausible action sequence learning loss and a long-horizon action repetition loss.

Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta
[NeurIPS 2022]
ECCV 2022 Workshop on Visual Object-oriented Learning meets Interaction (VOLI): Discovery, Representations, and Applications
Sight and Sound Workshop (CVPR 2023)
Arxiv / Code / Video

We propose a self-supervised algorithm to learn representations from untrimmed, egocentric videos containing audible interactions. Our method uses the audio signals in two unique ways: (1) to identify moments in time that are conducive to better self-supervised learning and (2) to learn representations that focus on the visual state changes caused by audible interactions.

Self-Supervised Point Cloud Completion via Inpainting
Himangi Mittal, Brian Okorn, Arpit Jangid, David Held
[BMVC 2021 - Oral (Selection rate 3.3%)]
Paper / Arxiv / Code / Conference Presentation / Webpage

A self-supervised method for completing partial point clouds in real-world settings such as LiDAR, where ground-truth complete point cloud annotations are unavailable. We achieve this via inpainting: a region of the point cloud is removed and the network is trained to complete the removed region.

Harnessing emotions for depression detection
Sahana Prabhu Muraleedhara, Himangi Mittal, Rajesh Varagani, Sweccha Jha, Shivendra Singh
[Pattern Analysis and Applications Journal]
Paper

A method for multi-modal depression detection that applies LSTMs to audio, video, and textual modalities. This work leverages emotions to detect early indications of depression.

Just Go with the Flow: Self-Supervised Scene Flow Estimation
Himangi Mittal, Brian Okorn, David Held
[CVPR 2020 - Oral (Selection rate 5.7%)]
RSS 2020 Workshop on Self-Supervised Robot Learning
Paper / Arxiv / Code / Project Page / Video / Short Paper

A method for training scene flow estimation using two self-supervised losses, based on nearest neighbors and cycle consistency. These self-supervised losses allow us to train our method on large unlabeled autonomous driving datasets.

Interpreting Context of Images using Scene Graphs
Himangi Mittal, Ajith Abraham, Anuja Arora
[International Conference on Big Data Analytics (BDA), 2019]
Paper / ArXiv / Code

Predicts action and spatial relationships between objects detected in images by YOLO, combining VGG-Net based visual features with Word2Vec based semantic features.

Anomaly Detection using Graph Neural Networks
Anshika Chaudhary, Himangi Mittal, Anuja Arora
[International Conference on Machine Learning, Big Data, Cloud and Parallel Computing, 2019]
Paper / Code

A method to capture anomalous behavior in a social network based on the degree, betweenness, and closeness of graph nodes, using Graph Neural Networks (GNN) in Keras.

STWalk: Learning Trajectory Representations in Temporal Graphs
Supriya Pandhre, Himangi Mittal, Manish Gupta, Vineeth N. Balasubramanian
[ACM India Joint International Conference on Data Science and Management of Data (CoDS-COMAD), 2018]
Paper / ArXiv / Code

Presents trajectory analysis of spatio-temporal graph nodes using the DeepWalk algorithm with NetworkX (Python), applied to classification and to detecting changing points of interest using SVMs.

Academic Service/Volunteer Work
  • Workshop Service: Member of the organizing committee of WiCV@CVPR 2024, the DEI Social Event, and the Challenges/Opportunities for ECRs in Fast Paced AI Social Event!
  • Meta Reviewer Service: WiCV@CVPR 2024.
  • Teaching Assistant: 16-824: Visual Learning and Recognition (Spring 2024), 16-825: Learning for 3D Vision (Spring 2023).
  • Reviewer Service: ICCV 2021, AAAI 2022, WACV 2022, CVPR 2022, CVPR 2023 (+ Emergency reviewer), ICCV 2023, NeurIPS 2023, Pattern Recognition Journal, WACV 2024 (+ Emergency reviewer), CVPR 2024, ICLR 2024, ICML 2024.
  • Volunteer at NeurIPS 2022 High School Outreach Program.
  • Mentor at CMU AI Undergraduate Mentoring Program (Fall 2022, Spring 2023, Fall 2023).
  • Mentor at Spring 2023 CMU Research Mixer for undergraduate students organized by DPAC Undergraduate Research Working Group.


Teaching
Teaching Assistant for 16-824: Visual Learning and Recognition (Spring 2024)
Teaching Assistant for 16-825: Learning for 3D Vision (Spring 2023)
