🛠️ Projects

Featured

Foundation Models for Computer Vision

Building powerful and efficient backbones for visual recognition

We build visual representation models that are both powerful and efficient. Our research explores novel architectures, including high-resolution networks (HRNet), Vision Transformers (EVA), and State Space Models (Vision Mamba). These foundation models serve as backbones for a wide range of downstream vision tasks.

3D Scene Understanding and Generation

Innovating techniques to perceive, reconstruct, and generate the 3D world

Our work in 3D vision ranges from real-time dynamic scene rendering with 4D Gaussian Splatting to fast text-to-3D asset creation with GaussianDreamer. We aim to create immersive, interactive 3D experiences by bridging the gap between 2D images and 3D understanding.

3d-vision generative-models scene-reconstruction

Perception for Autonomous Driving

Developing robust and reliable perception systems for self-driving

We are developing the full stack for autonomous driving perception. Our research covers online HD map construction (MapTR), 3D object detection, and end-to-end vectorized driving systems (VAD). Our goal is to create AI that can safely and efficiently navigate complex real-world traffic scenarios.

autonomous-driving 3d-perception end-to-end-systems