Text Image Super-Resolution

MARCONet: Revealing the Unseen through Text Image Super-Resolution

Synopsis

MARCONet, an artificial intelligence (AI)-driven image enhancement technology, leverages advanced neural networks to improve document digitisation, archival preservation, data extraction, medical imaging and publishing quality.


Opportunity

In today's digital landscape, there is constant demand for clear, high-resolution text within images. Our artificial intelligence (AI)-driven technology, MARCONet, addresses this need by improving the quality of text in images, even when the source image is blurry or its degradation is unknown. This capability has implications across industries, from enhancing text legibility in archival documents to enabling businesses and individuals to extract more value from their visual data.

 

Technology

Our blind text image super-resolution technology, MARCONet, uses advanced neural networks to enhance text within images. It introduces a novel prior that focuses on character structure. MARCONet learns a generative structure prior by reformulating a StyleGAN, with the discrete features of each character stored in a codebook, and uses a transformer-based encoder to jointly predict the font style, the character bounding boxes and the indexes of the characters in the codebook.
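The relationship between the encoder outputs and the codebook can be illustrated with a minimal Python sketch. It is not the MARCONet implementation: the names EncoderPrediction, CharacterPrior and gather_character_priors, the box convention and the toy vectors are all hypothetical, and the StyleGAN-based generator that would consume these inputs is omitted. The sketch only shows the lookup step, assuming that each predicted index selects one stored structure code and that a single font style applies to the whole text line.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# (x0, y0, x1, y1) in pixels; a hypothetical convention chosen for this sketch.
Box = Tuple[int, int, int, int]


@dataclass
class EncoderPrediction:
    """Hypothetical container for the three quantities the encoder predicts."""
    font_style: List[float]   # one style vector shared by the whole text image
    boxes: List[Box]          # one bounding box per character
    code_indexes: List[int]   # one codebook index per character


@dataclass
class CharacterPrior:
    """Inputs a generator would need to render one high-resolution character."""
    box: Box
    structure_code: List[float]
    font_style: List[float]


def gather_character_priors(codebook: Sequence[List[float]],
                            pred: EncoderPrediction) -> List[CharacterPrior]:
    """Look up the stored structure code for each predicted character index."""
    if len(pred.boxes) != len(pred.code_indexes):
        raise ValueError("expected one bounding box per codebook index")
    priors = []
    for box, index in zip(pred.boxes, pred.code_indexes):
        if not 0 <= index < len(codebook):
            raise IndexError(f"codebook index {index} is out of range")
        # Structure comes from the codebook entry; style is shared by all characters.
        priors.append(CharacterPrior(box, codebook[index], pred.font_style))
    return priors


# Toy usage: a three-entry codebook and a two-character text line.
toy_codebook = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
toy_pred = EncoderPrediction(
    font_style=[0.9, -0.1],
    boxes=[(0, 0, 32, 32), (32, 0, 64, 32)],
    code_indexes=[2, 0],
)
toy_priors = gather_character_priors(toy_codebook, toy_pred)
```

Keeping the structure code and the font style as separate fields reflects the idea that the codebook index determines which character is drawn, while the style vector determines how it looks. In a real system both would be learned tensors rather than hand-written lists.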

 

Figure 1: Blind Chinese Text Super-Resolution.


 

Applications & Advantages

  1. Document Digitisation: Enhances the quality of scanned documents, improving the accuracy of digitised text and data.
  2. Archival Preservation: Preserves historical manuscripts and documents by enhancing faded or deteriorated text.
  3. Data Extraction: Improves the performance of OCR systems for data extraction from images, streamlining data entry processes.
  4. Medical Imaging: Enhances text readability in medical images, such as X-rays and MRIs, aiding diagnosis and analysis.
  5. Publishing and Printing: Elevates the quality of printed materials by enhancing text in low-resolution source images.

 

References

  1. https://www.mmlab-ntu.com/project/marconet/index.html
  2. Xiaoming Li, Wangmeng Zuo and Chen Change Loy, Learning Generative Structure Prior for Blind Text Image Super-Resolution, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023. https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Learning_Generative_Structure_Prior_for_Blind_Text_Image_Super-Resolution_CVPR_2023_paper.pdf
  3. Tero Karras, Samuli Laine and Timo Aila, A Style-Based Generator Architecture for Generative Adversarial Networks, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019. https://openaccess.thecvf.com/content_CVPR_2019/papers/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.pdf

Inventor

Prof LOY Chen Change