
Hello! I am a Research Scientist at NVIDIA Research, working with Prof. Song Han. I received my Ph.D. from CUHK, supervised by Prof. Jiaya Jia. Before that, I received my M.Phil. from the Institute of Automation, CAS, supervised by Prof. Gaofeng Meng. My research interests lie mainly in Efficient Deep Learning, Large Language Models, and Computer Vision. I have also worked closely with Prof. Xiaojuan Qi, Xiangyu Zhang, and Tao Kong. You can learn more about me on Google Scholar and GitHub. Any discussion is welcome via e-mail.


News

[2024/09] Two papers have been accepted by NeurIPS 2024, including one Oral, RL-GPT.

[2024/02] Four papers have been accepted by CVPR 2024, including one Oral, LISA.

[2024/01] LongLoRA has been accepted by ICLR 2024 as an Oral presentation.

[2023/08] We release LongLoRA, an efficient fine-tuning approach for long-context large language models. We also release LongAlpaca, which comprises the long instruction-following dataset, LongAlpaca-12k, and the corresponding models, LongAlpaca-70B.

[2023/08] We release LISA, Reasoning Segmentation via Large Language Model.

[2023/07] We release IST-Net, a prior-free pose estimator with SOTA performance and efficiency.

[2023/07] Three papers have been accepted by ICCV 2023: IST-Net, FocalFormer3D for 3D object detection and tracking, and Accelerated DETR for 3D instance segmentation.

[2023/04] We release 3D-Box-Segment-Anything, which extends Segment Anything to the 3D world by combining it with VoxelNeXt. Given a prompt (e.g., a point or a box), the result is not only a 2D segmentation mask but also 3D boxes.

[2023/03] VoxelNeXt, a fully sparse VoxelNet, is accepted by CVPR 2023. It achieves SOTA on Argoverse2 3D object detection and nuScenes LiDAR multi-object tracking (Code). VoxelNeXt has been merged into the official OpenPCDet codebase.

[2023/03] SphereFormer, a spherical-window 3D transformer backbone, is accepted by CVPR 2023. It achieves SOTA on SemanticKITTI and nuScenes LiDAR semantic segmentation (Code).

[2023/03] We release SparseTransformer, a fast and memory-efficient library of sparse transformers for 3D point clouds. This project is led by Xin Lai.

[2023/03] Three papers accepted by CVPR 2023!

[2022/12] We release spconv-plus, a library based on spconv that integrates several new sparse convolution types and operators that might be useful.

[2022/09] Spatial Pruned Sparse Conv is accepted by NeurIPS 2022, with 50% FLOPs saving for efficient 3D object detection (Code).

[2022/06] LargeKernel3D, the first large-kernel 3D sparse CNN backbone, ranks 1st in NDS on the nuScenes Leaderboard (Code).

[2022/03] Focal Sparse Conv, a dynamic sparse convolution for high-performance 3D object detection, is accepted by CVPR 2022 as an Oral (Code). Focal Sparse Conv has been merged into the official OpenPCDet codebase.

[2022/03] Scale-aware AutoAug is accepted by T-PAMI; the journal extension adds a dynamic training strategy (Code).

[2021/03] Scale-aware AutoAug, scale-aware augmentations with automatic augmentation, is accepted by CVPR 2021 (Code).

[2019/12] DetNAS, the first work that introduces NAS into object detection, is accepted by NeurIPS 2019 (Code).

[2019/10] Winner of Microsoft COCO 2019 Instance Segmentation Track.

[2019/12] RENAS, the first work that combines reinforcement learning and evolutionary search in NAS, is accepted by CVPR 2019 (Code).


Education

The Chinese University of Hong Kong
Ph.D., Computer Science and Engineering, Aug 2020 - Jul 2024.
Supervisor: Prof. Jiaya Jia.

Institute of Automation, Chinese Academy of Sciences
M.Phil., Pattern Recognition and Intelligent Systems, Sep 2017 - Jul 2020.
Supervisors: Prof. Gaofeng Meng and Prof. Shiming Xiang.

Beihang University
B.Eng., Major in Guidance, Navigation and Control, Sep 2013 - Jul 2017.


Activities

Reviewer
CVPR, ICCV, ECCV, AAAI, NeurIPS, T-PAMI, Pattern Recognition.

Teaching Assistant
CSCI 3260 Principles of Computer Graphics
ENGG 5104 Image Processing and Computer Vision