News
- June 2024: Member of the organizing committee at WiCV@CVPR 2024, DEI Social Event, and Challenges/Opportunities for ECRs in Fast Paced AI Social Event!
- February 2024: Paper accepted at CVPR 2024.
- August 2023: Started my Ph.D. in the Robotics Institute (RI) at Carnegie Mellon University (CMU).
- May 2023: Started research internship at Honda Research Institute (HRI), San Jose, California.
- Jan 2023: Teaching Assistant for 16-825: Learning for 3D Vision.
- Sep 2022: Paper accepted at NeurIPS 2022.
- Oct 2021: Paper accepted at BMVC 2021 (Oral).
- Apr 2021 - Dec 2021: Served as a reviewer for ICCV 2021, AAAI 2022, WACV 2022, and CVPR 2022.
- Aug 2021: Journal paper accepted in PAA (in collaboration with Robert Bosch, India).
- Feb 2021: Accepted as a Master of Science in Robotics (MSR) student at Carnegie Mellon University for Fall 2021.
- July 2020: Presented a short paper at RSS Workshop on Self-Supervised Robot Learning 2020.
- Feb 2020: Paper accepted at CVPR 2020 (Oral).
|
Research
I am interested in self-supervised learning, multi-modal machine learning, video representation learning, generative models, point clouds and autonomous driving.
|
|
Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Himangi Mittal,
Nakul Agarwal,
Shao-Yuan Lo,
Kwonjoon Lee
[CVPR 2024]
Paper /
Arxiv
We leverage a large video-language model to anticipate action sequences that are plausible in the real world. We teach the model what makes an action sequence plausible by introducing two objective functions: a counterfactual-based plausible action sequence learning loss and a long-horizon action repetition loss.
|
|
Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal,
Pedro Morgado,
Unnat Jain,
Abhinav Gupta
[NeurIPS 2022]
ECCV 2022 Workshop on Visual Object-oriented Learning meets Interaction (VOLI): Discovery, Representations, and Applications
Sight and Sound Workshop (CVPR 2023)
Arxiv /
Code /
Video
We propose a self-supervised algorithm to learn representations from untrimmed, egocentric videos containing audible interactions.
Our method uses the audio signals in two unique ways: (1) to identify moments in time that are conducive to better self-supervised learning
and (2) to learn representations that focus on the visual state changes caused by audible interactions.
|
|
Self-Supervised Point Cloud Completion via Inpainting
Himangi Mittal,
Brian Okorn,
Arpit Jangid,
David Held
[BMVC 2021 - Oral (Selection rate 3.3%)]
Paper /
Arxiv /
Code /
Conference Presentation /
Webpage
A self-supervised method for completing partial point clouds in real-world settings such as LiDAR, where ground-truth complete point cloud
annotations are unavailable. We achieve this via inpainting: a region of the point cloud is removed and the network is trained to complete the removed region.
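As a rough illustration of the inpainting setup (not the released code), the sketch below removes a region from a point cloud so that the removed points can serve as the training target. The spherical region, the `remove_region` helper, and the random toy cloud are assumptions made for this example only.

```python
import numpy as np

def remove_region(points, center, radius):
    """Split a point cloud into the points outside and inside a ball around `center`."""
    inside = np.linalg.norm(points - center, axis=1) < radius
    return points[~inside], points[inside]

# Toy stand-in for a partial scan: 1024 random points in a cube (assumption).
rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1024, 3))

# Remove a ball around one of the points; a network would see `visible`
# as input and be trained to reconstruct `removed`.
center = cloud[0]
visible, removed = remove_region(cloud, center, radius=0.3)
```

Because the target comes from the observed points themselves, no ground-truth complete point cloud is needed.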
|
|
Harnessing emotions for depression detection
Sahana Prabhu Muraleedhara,
Himangi Mittal,
Rajesh Varagani,
Sweccha Jha,
Shivendra Singh
[Pattern Analysis and Applications Journal]
Paper
A method for multi-modal depression detection from audio, video, and text using LSTMs. This work leverages emotions to detect early indications of
depression.
|
|
Just Go with the Flow: Self-Supervised Scene Flow Estimation
Himangi Mittal,
Brian Okorn,
David Held
[CVPR 2020 - Oral (Selection rate 5.7%)]
RSS 2020 Workshop on Self-Supervised Robot Learning
Paper /
Arxiv /
Code /
Project Page /
Video /
Short Paper
A method for training scene flow estimation using two self-supervised losses, based on nearest neighbors and cycle consistency.
These self-supervised losses allow us to train our method on large unlabeled autonomous driving datasets.
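The two losses can be sketched in simplified form as follows. This is an illustrative NumPy version under stated assumptions, not the paper's implementation: the function names, the brute-force nearest-neighbor search, and the toy data are all hypothetical, and the paper's exact formulation may differ.

```python
import numpy as np

def nearest_neighbor_loss(warped, target):
    """Mean squared distance from each warped point to its closest target point."""
    # Pairwise squared distances, shape (N, M).
    d2 = ((warped[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).mean()

def cycle_consistency_loss(source, forward_flow, backward_flow):
    """Mean squared distance between source points and their round-trip positions."""
    round_trip = source + forward_flow + backward_flow
    return ((round_trip - source) ** 2).sum(axis=-1).mean()

# Toy example: the second frame is the first frame shifted by a constant offset.
rng = np.random.default_rng(0)
frame_t = rng.normal(size=(64, 3))
offset = np.array([0.5, 0.0, 0.0])
frame_t1 = frame_t + offset

flow_fwd = np.tile(offset, (64, 1))  # a perfect forward flow prediction
flow_bwd = -flow_fwd                 # a perfect backward flow prediction

nn_loss = nearest_neighbor_loss(frame_t + flow_fwd, frame_t1)
cyc_loss = cycle_consistency_loss(frame_t, flow_fwd, flow_bwd)
```

Neither loss needs ground-truth flow: both are computed from the two point clouds and the predicted flows alone, which is what makes training on unlabeled data possible.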
|
|
Interpreting Context of Images using Scene Graphs
Himangi Mittal,
Ajith Abraham,
Anuja Arora
[International Conference on Big Data Analytics (BDA), 2019]
Paper /
ArXiv /
Code
Predicts action and spatial relationships between objects detected by YOLO in images by combining VGG-Net based visual features and
Word2Vec based semantic features.
|
|
Anomaly Detection using Graph Neural Networks
Anshika Chaudhary,
Himangi Mittal,
Anuja Arora
[International Conference on Machine Learning, Big Data, Cloud and Parallel Computing, 2019]
Paper /
Code
A method to capture the anomalous behavior in a social network based on degree, betweenness, and closeness of graph nodes using
Graph Neural Networks (GNN) in Keras.
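For illustration, the three node measures mentioned above can be computed with NetworkX as below. This sketch covers only the per-node features, not the GNN or the Keras model, and it assumes the centrality versions of the measures; the example graph (Zachary's karate club) is a stand-in and not the social network used in the paper.

```python
import networkx as nx

# Stand-in social network (assumption; not the dataset from the paper).
graph = nx.karate_club_graph()

# Per-node structural measures.
degree = nx.degree_centrality(graph)
betweenness = nx.betweenness_centrality(graph)
closeness = nx.closeness_centrality(graph)

# One feature triple per node, which a downstream model could consume.
features = {n: (degree[n], betweenness[n], closeness[n]) for n in graph.nodes}
```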
|
|
STWalk: Learning Trajectory Representations in Temporal Graphs
Supriya Pandhre,
Himangi Mittal,
Manish Gupta,
Vineeth N. Balasubramanian
[ACM India Joint International Conference on Data Science and Management of Data (CoDS-COMAD), 2018]
Paper /
ArXiv /
Code
Presents trajectory analysis of spatio-temporal graph nodes using the DeepWalk algorithm in NetworkX (Python), for classification and for detecting
changing points of interest using SVMs.
|
Academic Service/Volunteer Work
- Workshop Service: Member of the organizing committee at WiCV@CVPR 2024, DEI Social Event, and Challenges/Opportunities for ECRs in Fast Paced AI Social Event!
- Meta Reviewer Service: WiCV@CVPR 2024.
- Teaching Assistant: 16-824: Visual Learning and Recognition (Spring 2024), 16-825: Learning for 3D Vision (Spring 2023).
- Reviewer Service: ICCV 2021, AAAI 2022, WACV 2022, CVPR 2022, CVPR 2023 (+ Emergency reviewer), ICCV 2023, NeurIPS 2023, Pattern Recognition Journal, WACV 2024 (+ Emergency reviewer), CVPR 2024, ICLR 2024, ICML 2024.
- Volunteer at NeurIPS 2022 High School Outreach Program.
- Mentor at CMU AI Undergraduate Mentoring Program (Fall 2022, Spring 2023, Fall 2023).
- Mentor at Spring 2023 CMU Research Mixer for undergraduate students organized by DPAC Undergraduate Research Working Group.