Dean's PhD Fellow @ NYU | NVIDIA Fellow (2024-2025)

Haleakala, Maui, 2021

I am a 4th-year PhD student at the NYU AI4CE Lab led by Prof. Chen Feng, and a visiting scholar at the Tsinghua IIIS MARS Lab led by Prof. Hang Zhao. I am also a research scientist intern at NVIDIA Research, working with Dr. Jose M. Alvarez in the Autonomous Vehicle Perception Research Group, Dr. Zhiding Yu in the Learning and Perception Research Group, and Prof. Yue Wang in the Autonomous Vehicle Research Group. Before that, I had the opportunity to work as a research intern at NVIDIA AI Research advised by Prof. Anima Anandkumar in 2022, and as a visiting scholar at Shanghai Jiao Tong University advised by Prof. Siheng Chen in 2021.

:speaker: I am looking for UG/MS students to work on cutting-edge research projects with me and my collaborators at NYU/NVIDIA/USC/Stanford/Tsinghua. Please send me an email if you are interested!

  • Neural Representations for Dynamic Scenes (NeRF/3DGS)
  • Vision-Language Models for Spatial Robotics
  • Embodied and Cognitive AI for Robotics
  • Generative Models for Robotic Perception and Planning
  • Dataset Curation and Autolabeling for Spatial Robotics

My research vision is to enable collaborative autonomous intelligence by empowering robots with human-level spatial, social, and self-awareness, allowing them to actively perceive and plan in unstructured environments, interact effectively with humans or other robots, and leverage as well as augment the associated memory. To this end, I draw from vision, learning, robotics, graphics, language, sensing, data science, and cognitive science. My research includes developing robust, efficient, and scalable computational models for 3D scene parsing and decision-making from high-dimensional sensory input, as well as curating large-scale datasets to effectively train and verify these models for practical applications, including but not limited to connected and autonomous vehicles, assistive and service robotics, and construction automation. I divide my research into several interconnected directions, each with its own focus.

Computational Model

  • Spatial Intelligence: VoxFormer
  • Social Intelligence: Among Us
  • Verbal Intelligence
  • Self Intelligence

Dataset and Benchmark: SSCBench


Dec 10, 2023 I have received the NVIDIA Graduate Fellowship (2024-2025) (<2.0% acceptance rate). Thank you, NV :green_heart:!
Aug 25, 2023 :fire: NVIDIA featured VoxFormer together with FB-OCC! Here is the YouTube video: Taking Autonomous Vehicle Occupancy Prediction into the Third Dimension - NVIDIA DRIVE Labs Ep. 30.
Jul 14, 2023 :tada: Among Us and PVT++ are accepted by ICCV 2023. See you in Paris!
Jun 19, 2023 :books: I am hosting the Vision-Centric Autonomous Driving (VCAD) CVPR 2023 Workshop in Vancouver, together with Vitor Guizilini, Yue Wang, and Hang Zhao!
Jun 18, 2023 :speaker: I gave an invited talk about VoxFormer at C3DV: 1st Workshop On Compositional 3D Vision@CVPR2023.
Jun 2, 2023 :books: Our NYU team is organizing the Collaborative Perception and Learning (CoPerception) ICRA 2023 Workshop in London, together with UCLA Mobility Lab and SJTU MediaBrain Group.
Apr 23, 2023 :tada: DeepExplorer is accepted at RSS 2023. See you in Daegu!
Mar 21, 2023 :tada: VoxFormer was selected as a highlight at CVPR 2023. Specifically, CVPR 2023 has received 9155 submissions, accepted 2360 papers, and selected 235 highlights (10% of accepted papers, 2.5% of submissions).
Jun 20, 2022 :speaker: I gave an invited talk about egocentric 3D target prediction at EPIC Workshop@CVPR2022.
Jun 5, 2022 :speaker: I gave an invited talk about collaborative and adversarial 3D perception at 3D-DLAD Workshop@IV2022.
Jul 23, 2021 :tada: FLAT is accepted at ICCV 2021 as an oral presentation. ICCV 2021 received a record number of 6236 submissions and accepted 1617 papers. ACs recommended the selection of 210 oral papers. These are 3% of all submissions and 13% of all papers.

selected publications


  1. Preprint
    SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving
    Yiming Li, Sihang Li, Xinhao Liu, and 8 more authors
    arXiv preprint arXiv:2306.09001, 2023
  2. ICCV
    Among Us: Adversarially Robust Collaborative Perception by Consensus
    Yiming Li, Qi Fang, Jiamu Bai, and 3 more authors
    In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
  3. RSS
    Metric-Free Exploration for Topological Mapping by Task and Motion Imitation in Feature Space
    Yuhang He, Irving Fang, Yiming Li, and 2 more authors
    In Proceedings of Robotics: Science and Systems, 2023
  4. CVPR
    VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion
    Yiming Li, Zhiding Yu, Christopher Choy, and 5 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (highlight, top 2.5%), 2023
  5. CVPR
    DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization
    Chao Chen, Xinhao Liu, Yiming Li, and 2 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023


  1. CVPR
    Egocentric Prediction of Action Target in 3D
    Yiming Li, Ziang Cao, Andrew Liang, and 4 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun 2022


  1. NeurIPS
    Learning Distilled Collaboration Graph for Multi-Agent Perception
    Yiming Li, Shunli Ren, Pengxiang Wu, and 3 more authors
    In Advances in Neural Information Processing Systems, Jun 2021
  2. ICCV
    Fooling LiDAR Perception via Adversarial Trajectory Perturbation
    Yiming Li, Congcong Wen, Felix Juefei-Xu, and 1 more author
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (oral, top 3.0%), Jun 2021