NTIRE (https://data.vision.ee.ethz.ch/cvl/ntire21/) is the largest and most important competition on the restoration and enhancement of images and videos. The competition this year was held in conjunction with CVPR 2021.
Nanyang Associate Professor Loy Chen Change led a team from Nanyang Technological University's S-Lab to develop BasicVSR++, a new deep learning method for video super-resolution. With this method, the team won three champion titles in the video restoration and enhancement challenges, which covered the tasks of video super-resolution and the enhancement of compressed video.
It is noteworthy that this was the second time the team had won the competition. The team has a long-standing interest and solid technical know-how in image restoration and enhancement, and over the past couple of years it has expanded its work to the more challenging tasks of video restoration and enhancement.
BasicVSR++ introduces two specially designed modifications that improve the propagation and alignment of BasicVSR, the team's earlier work. Given an input video, residual blocks first extract features from each frame. The features are then propagated under a novel second-order grid propagation scheme, which enables repeated refinement through propagation. Whereas existing works propagate features only once, grid propagation repeatedly extracts information from the entire sequence, improving feature expressiveness. Motivated by the strong relation between deformable alignment and flow-based alignment, the team further employs optical flow to guide the deformable alignment. After propagation, the aggregated features are used to generate the output image through convolution and pixel-shuffling. Thanks to these powerful modules, BasicVSR++ surpassed its competitors and emerged as the champion.
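Two of the building blocks above can be sketched in plain NumPy. This is an illustrative sketch only, not the team's implementation: the `fuse` callback and the zero initialization of the propagation state are assumptions standing in for BasicVSR++'s convolutional modules and flow-guided deformable alignment, while `pixel_shuffle` mirrors the standard channel-to-space rearrangement used in the upsampling step.

```python
import numpy as np

def second_order_propagate(frame_feats, fuse):
    """Propagate per-frame features through time, where each step aggregates
    the propagation states of the TWO previous time steps (the second-order
    connection). `fuse` is a placeholder for the learned aggregation module."""
    states = []
    for t, feat in enumerate(frame_feats):
        # Missing neighbors at the sequence boundary are assumed to be zeros.
        prev1 = states[t - 1] if t >= 1 else np.zeros_like(feat)
        prev2 = states[t - 2] if t >= 2 else np.zeros_like(feat)
        states.append(fuse(feat, prev1, prev2))
    return states

def pixel_shuffle(feat, scale):
    """Rearrange a (C*scale^2, H, W) feature map into (C, H*scale, W*scale),
    i.e. trade channel depth for spatial resolution."""
    c2, h, w = feat.shape
    c = c2 // (scale * scale)
    x = feat.reshape(c, scale, scale, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (c, h, scale, w, scale)
    return x.reshape(c, h * scale, w * scale)
```

In the full grid propagation scheme, such a pass would be run repeatedly in both temporal directions, with the forward and backward branches exchanging intermediate features.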
Besides its achievements in video super-resolution, the team has recently proposed a novel method that significantly improves the restoration quality of large-factor image super-resolution. The method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveraging the rich and diverse priors encapsulated in a pre-trained Generative Adversarial Network. The technique can be applied to tasks such as the restoration of vintage photos and films (see the figure below). The work has been accepted as an oral paper at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), a premier annual computer vision conference.
Malay girl, Singapore: an old photograph, originally as small as 60 × 85 pixels (5,100 pixels in total), can be enlarged and enhanced to 720 × 1,024 (737,280 pixels in total). Images from The New York Public Library.
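The pixel counts in the caption follow from simple arithmetic; the snippet below just verifies them and works out the enlargement factor (12× in width, roughly 145× in total pixels), which is what makes this a "large-factor" super-resolution example.

```python
# Dimensions taken from the figure caption above.
lr_w, lr_h = 60, 85      # low-resolution input
hr_w, hr_h = 720, 1024   # enhanced output

assert lr_w * lr_h == 5_100     # total input pixels
assert hr_w * hr_h == 737_280   # total output pixels

width_factor = hr_w / lr_w                    # 12.0x per side (width)
pixel_factor = (hr_w * hr_h) / (lr_w * lr_h)  # ~144.6x in total pixels
```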
The team has released the papers on BasicVSR, BasicVSR++, and GLEAN:
“BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond” is accepted by IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Link: https://arxiv.org/abs/2012.02181
“BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment” is released on arXiv. Link: https://arxiv.org/abs/2104.13371
“GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution” is accepted by IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Link: https://arxiv.org/abs/2012.00739
The source code of BasicVSR is available on GitHub: https://github.com/open-mmlab/mmediting
Team members: Mr Kelvin Chan Cheuk Kit (PhD, Year 3), Mr Zhou Shangchen (PhD, Year 2), Dr Xu Xiangyu (Research Fellow), Assoc Prof Loy Chen Change
The team would like to thank SenseTime, NTU College of Engineering, School of Computer Science and Engineering, and the Singapore Government for their generous support.