Research Focus
Deep Learning
We investigate deep learning methods and develop new ones that are more efficient, robust, accurate, scalable, transferable, and explainable. The areas we are working on include domain generalization, knowledge distillation, long-tailed recognition, and self-supervised learning.
Super-Resolution
Our team was the first to introduce the use of deep neural networks to directly predict super-resolved images. Our journal paper on image super-resolution was selected as the 'Most Popular Article' by IEEE Transactions on Pattern Analysis and Machine Intelligence in 2016. It remains one of the top 10 articles to date. Popular image and video super-resolution methods developed by our team include SRCNN, ESRGAN, EDVR, GLEAN and BasicVSR.
Natural Language Processing
Visual recognition and language understanding are two challenging tasks in artificial intelligence. We investigate and develop new deep learning models that can reason from language and visual cues. Applications based on these models include image retrieval using complex text queries, learning from weakly supervised text, aligning images and text in large data collections, and generating images from textual descriptions.
Content Editing and Generation
We research new methods for generating high-resolution, realistic, and novel content in images and videos. We are also interested in investigating fundamental concepts in generative models. Our work includes scene deocclusion, video inpainting, image generation, and image manipulation.
Image and Video Understanding
We explore effective and efficient methods to detect, segment, and recognize objects in complex scenes. We were the champions of the COCO 2019 Object Detection Challenge and the Open Images Challenge 2019.
3D Scene Understanding
We explore various tasks related to 3D reconstruction and perception, e.g., 3D shape generation and 3D human recovery. Our recent work includes Variational Relational Point Completion Network, Unsupervised 3D Shape Completion through GAN Inversion, and LiDAR-based Panoptic Segmentation.
Distributed Learning
We research and develop new, efficient GPU cluster schedulers and a cloud resource orchestration system for deep learning inference workloads. We also investigate new federated learning methods that enable users to collaboratively learn a model while keeping all personal data in its original location.
Media Forensics
The spread of Deepfakes on the internet has set off alarm bells among the general public and authorities because of their potentially dangerous implications. We have proposed two large-scale datasets for face forgery detection. See DeeperForensics and ForgeryNet.