Hello! I am a final-year Ph.D. student at CUHK, supervised by Jiaya Jia. Before that, I received my M.Phil. from the Institute of Automation, CAS, supervised by Gaofeng Meng.
My research interests mainly lie in Efficient Deep Learning, Large Language Models, and Computer Vision. You can find out more about me on Google Scholar and GitHub. Any discussion is welcome via e-mail.


[2024/01] LongLoRA has been accepted by ICLR 2024 as an Oral presentation.

[2023/08] We release LongLoRA, an efficient fine-tuning approach for long-context large language models. We also release LongAlpaca, which includes the long instruction-following dataset, LongAlpaca-12k, and the corresponding models, LongAlpaca-70B.

[2023/08] We release LISA: Reasoning Segmentation via Large Language Model.

[2023/07] We release IST-Net, a prior-free pose estimator with SOTA performance and efficiency.

[2023/07] Three papers accepted by ICCV 2023: IST-Net, FocalFormer3D for 3D object detection and tracking, and Accelerated DETR for 3D instance segmentation.

[2023/04] We release 3D-Box-Segment-Anything. We extend Segment Anything to the 3D world by combining it with VoxelNeXt. Given a prompt (e.g., a point or a box), the result is not only a 2D segmentation mask but also 3D boxes.

[2023/03] VoxelNeXt, a fully sparse VoxelNet, is accepted by CVPR 2023. It achieves SOTA on Argoverse2 3D object detection and nuScenes LiDAR multi-object tracking Code. VoxelNeXt has been merged into the official OpenPCDet codebase.

[2023/03] SphereFormer, a spherical-window 3D transformer backbone, is accepted by CVPR 2023. It achieves SOTA on SemanticKITTI and nuScenes LiDAR semantic segmentation Code.

[2023/03] We release SparseTransformer, a fast and memory-efficient library for sparse transformers on 3D point clouds. This project is led by Xin Lai.

[2023/03] 3 Papers accepted by CVPR 2023!

[2022/12] We release spconv-plus, a library based on spconv that integrates several new sparse convolution types and operators that might be useful.

[2022/09] Spatial Pruned Sparse Conv is accepted by NeurIPS 2022 , 50% FLOPs saving for efficient 3D object detection Code.

[2022/06] LargeKernel3D ranks 1st in NDS on the nuScenes Leaderboard, the first large-kernel 3D sparse CNN backbone Code.

[2022/03] Focal Sparse Conv is accepted by CVPR 2022 Oral , a dynamic sparse convolution for high performance 3D object detection Code. Focal Sparse Conv has been merged into the official OpenPCDet codebase.

[2022/03] Scale-aware AutoAug is accepted by T-PAMI, a dynamic training strategy for journal extension Code.

[2021/03] Scale-aware AutoAug is accepted by CVPR 2021, scale-aware augmentations with automatic augmentation Code.

[2019/12] DetNAS is accepted by NeurIPS 2019 , the first work that introduces NAS into object detection Code.

[2019/10] Winner of Microsoft COCO 2019 Instance Segmentation Track.

[2019/12] RENAS is accepted by CVPR 2019, the first work that combines reinforcement learning and evolutionary search in NAS Code.


The Chinese University of Hong Kong
Ph.D., Computer Science and Engineering. Aug 2020 - now.
Supervisor: Prof. Jiaya Jia.

Institute of Automation, Chinese Academy of Sciences
M.Phil., Pattern Recognition and Intelligent System. Sep 2017 - Jul 2020.
Supervisors: Prof. Gaofeng Meng and Prof. Shiming Xiang.

Beihang University
B.Eng., Major in Guidance, Navigation and Control. Sep 2013 - Jul 2017.


Technical University of Munich
Visiting Study, Jan 2017 - Feb 2017.


CVPR, ICCV, ECCV, AAAI, NeurIPS, T-PAMI, Pattern Recognition.

Teaching Assistant
CSCI 3260 Principles of Computer Graphics
ENGG 5104 Image Processing and Computer Vision