In-Country Events (China)

CCF-TF International Symposium on Intelligent Multimedia Computing

Jointly organized by CCF Multimedia Technical Committee and Nanyang Technological University, Singapore.



Changsheng Xu, Institute of Automation, Chinese Academy of Sciences
Weisi Lin, School of Computer Science and Engineering, Nanyang Technological University
Shuqiang Jiang, Institute of Computing Technology, Chinese Academy of Sciences
Chunyan Miao, School of Computer Science and Engineering, Nanyang Technological University
Jitao Sang, Beijing Jiaotong University
Jie Zhang, School of Computer Science and Engineering, Nanyang Technological University
Weiqing Min, Institute of Computing Technology, Chinese Academy of Sciences
Bo An, School of Computer Science and Engineering, Nanyang Technological University
Baoquan Zhao, School of Artificial Intelligence, Sun Yat-sen University



CCF-TF International Symposium on Intelligent Multimedia Computing is co-hosted by the China Computer Federation (CCF) and the Temasek Foundation (TF), and is jointly organized by the CCF Multimedia Technical Committee and Nanyang Technological University, Singapore. The symposium comprises a series of monthly research seminars held online. In each seminar, well-known researchers from China, Singapore, and the USA introduce frontier advances in various aspects of artificial intelligence, including but not limited to future intelligent media, robotics, multimedia analysis and retrieval, media coding and transmission, intelligent media and health, artificial intelligence in healthcare, and FinTech.


Event 5: Intelligent Media Analysis & Retrieval

Program and time table:

December 7 (Wednesday), 2022, China Standard Time (CST), UTC +8

3:00 pm - 4:00 pm | Prof. Changsheng Xu, Chinese Academy of Sciences, China | Connecting Isolated Social Multimedia Big Data
4:00 pm - 5:00 pm | Prof. Jialie Shen, City, University of London, UK | Multimodal Learning and Multimedia Computing
5:00 pm - 6:00 pm | Prof. Guosheng Lin, Nanyang Technological University, Singapore | Weakly Supervised and Self-Supervised Learning for 3D Data
Host: Prof. Jitao Sang, Beijing Jiaotong University, China

Connecting Isolated Social Multimedia Big Data

December 7, 2022 | 3:00 pm - 4:00 pm (China Standard Time)
Prof Changsheng Xu
Chinese Academy of Sciences, China
Abstract: The explosion of social media has led to various Online Social Networking (OSN) services. Today's typical netizens use a multitude of OSN services. Exploring user-contributed cross-OSN heterogeneous data is critical to connecting the separated data islands and facilitating value mining from big social multimedia. From the perspective of data fusion, understanding the association among cross-OSN data is fundamental to advanced social media analysis and applications. From the perspective of user modeling, exploiting the available user data on different OSNs contributes to an integrated online user profile and thus improved customized social media services. This talk will introduce a user-centric research paradigm for cross-OSN mining and applications, together with some pilot works along two basic tasks: (1) from users: cross-OSN association mining; and (2) for users: cross-OSN user modeling.


Changsheng Xu is a professor at the Institute of Automation, Chinese Academy of Sciences. His research interests include multimedia content analysis/indexing/retrieval, pattern recognition, and computer vision. He holds 50+ granted/pending patents and has published over 400 refereed research papers, including 100+ IEEE/ACM Trans. papers, in these areas.

Prof. Xu serves as Editor-in-Chief of Multimedia Systems Journal and Associate Editor of ACM Trans. on Multimedia Computing, Communications and Applications. He received the Best Paper Awards of ACM Multimedia 2016, 2016 ACM Trans. on Multimedia Computing, Communications and Applications and 2017 IEEE Multimedia. He served as Associate Editor of IEEE Transactions on Multimedia and Program Chair of ACM Multimedia 2009. He has served as associate editor, guest editor, general chair, program chair, area/track chair and TPC member for over 20 IEEE and ACM prestigious multimedia journals, conferences and workshops. He is an ACM Distinguished Scientist, IEEE Fellow, and IAPR Fellow.

Multimodal Learning and Multimedia Computing

December 7, 2022 | 4:00 pm - 5:00 pm (China Standard Time)
Prof. Jialie Shen
City, University of London, UK
Abstract: Driven by the rapid growth of multimedia big data, multimodal learning (especially multimodal deep learning) has gained significant importance and achieved great success in various multimedia computing applications. However, the complexity and scale of modern multimedia systems often demand far more sophisticated statistical models, learning architectures, and data processing algorithms for content understanding and analysis than ever before. This talk discusses several major research challenges for future multimedia systems with advanced multimodal learning. In particular, I shall:
  • Introduce why multimodal learning is important for Web scale multimedia search, understanding and analytics.
  • Discuss and review various limitations of the current generation of learning model and architecture.
  • Review key challenges and technical issues in developing and evaluating modern multimedia computing systems with multimodal learning under different contexts.
  • Make predictions about the road that lies ahead for the scholarly exploration and industrial practice in machine learning, multimedia computing and other related communities.
I hope that this talk provides an impetus for further research in this important direction.

Bio: Jialie Shen is currently a professor in computer vision and machine learning (Chair) with the Department of Computer Science, City, University of London, UK. His research interests spread across subareas of artificial intelligence (AI) and data science, including computer vision, deep learning, machine learning, image/video analytics, and information retrieval. His research results have been reported in more than 150 publications in prestigious journals and conferences, such as IEEE T-IP, T-CYB, T-MM, T-CSVT, T-CDS, ACM TOIS, ACM TOMM, IJCAI, AAAI, CVPR, ACM SIGIR, ACM SIGMOD, ACM Multimedia, ICDE, and ICDM, and have received several awards: the Lee Foundation Fellowship for Research Excellence Singapore, the Microsoft Mobile Plus Cloud Computing Theme Research Program Award, the Best Paper Runner-Up for IEEE Transactions on Multimedia, the Best Reviewer Award for Information Processing and Management (IP&M) 2019 and ACM Multimedia 2020, and the Test of Time Reviewer Award for Information Processing and Management (IP&M) 2022. He has served over 100 major conferences, including CVPR, ICCV, ECCV, IJCAI, AAAI, NIPS, ICDM, SIGKDD, WWW, MMM, ICMR, ICME, ACM SIGIR, and ACM Multimedia, as area chair and senior PC/PC member. He also serves as an associate editor and/or editorial board member of leading journals: Information Processing and Management (IP&M), Pattern Recognition (PR), IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), IEEE Transactions on Multimedia (IEEE T-MM), IEEE Transactions on Knowledge and Data Engineering (IEEE T-KDE), and ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM).

Weakly Supervised and Self-Supervised Learning for 3D Data

December 7, 2022 | 5:00 pm - 6:00 pm (China Standard Time)
Prof. Guosheng Lin
Nanyang Technological University, Singapore
Abstract: Weakly supervised point cloud segmentation, with only a few labelled points in the whole 3D scene, is highly desirable given the heavy burden of collecting abundant dense annotations for model training. However, it remains challenging for existing methods to accurately segment 3D point clouds, since limited annotated data may provide insufficient guidance for label propagation to unlabelled data. In this talk, I will present our recent method for weakly supervised point cloud segmentation based on consistency learning. Our method performs dual adaptive transformations via an adversarial strategy at both the point level and the region level, aiming to enforce local and structural smoothness constraints on 3D point clouds. I will also talk about our weakly supervised methods for 3D sequence data and instance-level segmentation. Self-supervised learning is another important scheme for making use of unlabelled data to improve performance on various visual understanding tasks. I will cover our recent self-supervised learning methods for scene flow estimation and feature learning on 3D point clouds.

Bio: Guosheng Lin is an Assistant Professor at the School of Computer Science and Engineering, Nanyang Technological University. Prior to that, he was a research fellow at the Australian Centre for Robotic Vision from 2014 to 2017. He received his PhD from the University of Adelaide in 2014. His research interests generally lie in deep learning and 2D/3D visual understanding. He has published over 100 research articles in top-tier research venues, and his research work has received over 10K citations on Google Scholar. He is named in the world's top 2% of scientists list.



• Click here or scan the QR code below to register.
  (Meeting link will be sent upon receiving the registration.)



Event 4: Intelligent Media and Health

Program and time table:

October 30 (Sunday), 2022, China Standard Time (CST), UTC +8

9:00 am - 10:00 am | Prof. Minlie Huang, Tsinghua University, China | Emotional Intelligence: from sentiment understanding to sentiment generation
10:00 am - 11:00 am | Prof. Daoqiang Zhang, Nanjing University of Aeronautics and Astronautics, China | Intelligent Analysis of Brain Imaging for Early Diagnosis of Brain Diseases
11:00 am - 12:00 pm | Prof. Shuqiang Jiang, Institute of Computing Technology, Chinese Academy of Sciences, China | Progress and Prospect of Food Computing
Host: Prof. Changsheng Xu, Institute of Automation, Chinese Academy of Sciences, China


Emotional Intelligence: from sentiment understanding to sentiment generation

October 30, 2022 | 9:00 am - 10:00 am (China Standard Time)
Prof. Minlie Huang
Tsinghua University, China
Abstract: Emotional intelligence is defined by Salovey & Mayer as “the ability to monitor one's own and other people's emotions, to discriminate between different emotions and label them appropriately, and to use emotional information to guide thinking and behavior”. Emotional intelligence is one of the key intelligent behaviors in human beings, and sentiment understanding and generation are complex AI tasks, crucial for human-level intelligence. The talk will start from sentiment analysis on text, and then cover sentiment-aware representation, empathy, and emotional support with dialog systems. He will also share his story with Emohaa, the first Chinese empathetic chatbot for counselling.

Bio: Dr. Minlie Huang is a tenured associate professor at Tsinghua University and a recipient of the Distinguished Young Scholars award of the National Science Foundation of China. His main research interests include natural language generation, dialogue systems, and machine reading comprehension. He has won the first prize of the Wu Wenjun Artificial Intelligence Science and Technology Progress Award of the Chinese Association for Artificial Intelligence, the Hanwang Youth Innovation Award of the Chinese Information Processing Society, and the Alibaba Innovation Cooperation Research Award. He has published more than 100 papers in top international conferences and journals, with more than 12,000 citations, has won best paper awards or nominations at premier conferences (IJCAI, ACL, SIGDIAL, etc.) five times, and authored the first Chinese-language book on natural language generation, “Modern Natural Language Generation”. He has served as an editorial board member of the top journals TNNLS, TBD, TACL, and CL, and as an area chair of ACL/EMNLP multiple times.


Intelligent Analysis of Brain Imaging for Early Diagnosis of Brain Diseases

October 30, 2022 | 10:00 am - 11:00 am (China Standard Time)
Prof. Daoqiang Zhang
Nanjing University of Aeronautics and Astronautics, China

Abstract: In recent years, brain research projects have received considerable public and governmental attention worldwide. Brain imaging is an important tool for brain science research. However, due to the high-dimensional, multi-modality, heterogeneous, and time-variant characteristics of brain images, it is very challenging to develop methods for brain image analysis that are both efficient and effective. In this talk, I will introduce our recent work on intelligent brain imaging methods based on machine learning techniques. Specifically, this talk will cover topics including image reconstruction and segmentation, image genomic association analysis, functional alignment and brain network analysis, as well as their applications in the early diagnosis of brain diseases and in brain decoding.

Bio: Daoqiang Zhang received the B.S. and Ph.D. degrees in computer science from Nanjing University of Aeronautics and Astronautics (NUAA), China, in 1999 and 2004, respectively. He joined the Department of Computer Science and Engineering of NUAA as a lecturer in 2004 and is currently a professor. His research interests include machine learning, pattern recognition, data mining, and medical image analysis. In these areas, he has published over 200 scientific articles in refereed international journals such as IEEE Trans. Pattern Analysis and Machine Intelligence, IEEE Trans. Medical Imaging, IEEE Trans. Image Processing, NeuroImage, Human Brain Mapping, Medical Image Analysis, and Nature Communications, and in conference proceedings such as IJCAI, AAAI, NIPS, CVPR, MICCAI, and KDD, with 15,000+ citations on Google Scholar. He was nominated for the National Excellent Doctoral Dissertation Award of China in 2006 and won best paper or best student paper awards at several international conferences, such as PRICAI'06, STMI'12, BICS'16, MICCAI'19, and MICCAI'22. He has served as an associate editor for several international journals, such as IEEE Trans. Medical Imaging, Pattern Recognition, and Machine Intelligence Research. He is a Fellow of the International Association for Pattern Recognition (IAPR).


Progress and Prospect of Food Computing

October 30, 2022 | 11:00 am - 12:00 pm (China Standard Time)
Prof. Shuqiang Jiang
Chinese Academy of Sciences, China

Abstract: Artificial Intelligence (AI) technology is developing rapidly in all walks of life. However, it has not been widely used in food-related fields, and the deep integration between AI and these fields is in its infancy. The intelligent analysis and digitalized utilization of multimedia food information has broad application prospects and great social value in many traditional fields such as agriculture, the food industry, and the food service industry, as well as in food safety, life, and health. New opportunities in large-scale food data and new advances in AI technology are promoting the development of food computing. Focusing on food computing, this talk introduces the relevant research from several aspects, including food recognition, food knowledge graphs, multimodal learning, and food recommendation, and looks forward to the future development of food computing.

Bio: Shuqiang Jiang is a professor with the Institute of Computing Technology, Chinese Academy of Sciences (CAS) and a professor at the University of CAS. He is also with the Key Laboratory of Intelligent Information Processing, CAS. His research interests include multimedia processing and intelligent understanding, multimodal intelligence, and food computing. He has authored or co-authored more than 200 papers on related research topics. He was supported by the National Science Fund for Distinguished Young Scholars in 2021 and the NSFC Excellent Young Scientists Fund in 2013, and was named a young top-notch talent of the Ten Thousand Talent Program in 2014. He won the CAS International Cooperation Award for Young Scientists, the CCF Award of Science and Technology, the Wu Wenjun Natural Science Award for Artificial Intelligence, the CSIG Natural Science Award, and the Beijing Science and Technology Progress Award. He is a senior member of IEEE and CCF, a member of ACM, and an Associate Editor of ACM TOMM. He is the Vice Chair of the IEEE CASS Beijing Chapter and Vice Chair of the ACM SIGMM China Chapter. He has served as an organizing member of more than 20 academic conferences, including as general chair of ICIMCS 2015 and program chair of ICIMCS 2010, PCM 2017, and ACM Multimedia Asia 2019. He has also served as a TPC member for many conferences, including ACM Multimedia, CVPR, ICCV, IJCAI, AAAI, etc.


Event 3: Video Coding and Transmission

Program and time table:

July 30 (Saturday), 2022, China Standard Time (CST), UTC +8

2:00 pm - 3:00 pm | Prof. Yonggang Wen, Nanyang Technological University, Singapore | Learning to Appreciate: Transforming Multimedia Communications via Deep Video Analytics
3:00 pm - 4:00 pm | Prof. Siwei Ma, Peking University, China | AVS3: The Third Generation AVS Video Coding Standard for 8K UHDTV Broadcasting
4:00 pm - 5:00 pm | Prof. Mai Xu, Beihang University, China | Embracing Intelligence in Video Compression
Host: Prof. Shuqiang Jiang, Institute of Computing Technology, CAS, China


Learning to Appreciate: Transforming Multimedia Communications via Deep Video Analytics

July 30, 2022 | 2:00 pm - 3:00 pm (China Standard Time)
Prof. Yonggang Wen
Nanyang Technological University

Abstract: Media-rich applications will continue to dominate mobile data traffic with exponential growth, as predicted by the Cisco Video Index. Improved quality of experience (QoE) for video consumers plays an important role in shaping this growth. However, most existing approaches to improving video QoE are system-centric and model-based, in that they tend to derive insights from system parameters (e.g., bandwidth, buffer time, etc.) and propose various mathematical models to predict QoE scores (e.g., mean opinion score, etc.). In this talk, we will share our latest work in developing a unified and scalable framework to transform multimedia communications via deep video analytics. Specifically, our framework consists of two main components. One is a deep-learning-based QoE prediction algorithm that combines multi-modal data inputs to provide a more accurate assessment of QoE in real time. The other is a model-free QoE optimization paradigm built upon a deep reinforcement learning algorithm. Our preliminary results verify the effectiveness of the proposed framework. We believe that this hybrid approach to multimedia communications and computing will fundamentally transform how we optimize multimedia communications system design and operations.

Bio: Prof. Yonggang Wen is a Full Professor and President's Chair in the School of Computer Science and Engineering at Nanyang Technological University (NTU), Singapore. He is a Fellow of the IEEE and the Singapore Academy of Engineering, and an ACM Distinguished Member. He has served as the Associate Dean (Research) at the College of Engineering since 2018, and served as the acting Director of the Nanyang Technopreneurship Centre (2017-2019) at NTU. He received his PhD degree in Electrical Engineering and Computer Science (minor in Western Literature) from the Massachusetts Institute of Technology (MIT), Cambridge, USA, in 2008. He has worked extensively on applying machine-learning techniques to system prototyping and performance optimization for large-scale networked computer systems. His work on Yubigo, a multi-screen cloud social TV system, has been featured by global media (more than 1600 news articles from over 29 countries) and received the 2013 ASEAN ICT Awards (Gold Medal). His latest work on DCWiz, a cloud AI platform for data centre transformation, has won the 2020 IEEE Industrial Technical Excellence Award, the 2016 ASEAN ICT Awards (Gold Medal), and the 2015 Datacentre Dynamics Awards APAC (the 'Oscar' of the data centre industry). He is the winner of the 2019 Nanyang Research Award and the 2016 Nanyang Award in Innovation and Entrepreneurship at NTU Singapore. He was named a "Top Asia Pacific Technology Leader" for cloud and data centres by W.Media in 2021 and a Singapore Computer Society Tech Leader (Digital Achiever) in 2022. He is a co-recipient of multiple journal best paper awards, including IEEE Transactions on Circuits and Systems for Video Technology (2019) and IEEE Multimedia (2015), and several best paper awards from international conferences, including 2020 IEEE VCIP, 2016 IEEE Globecom, the 2016 IEEE Infocom MuSIC Workshop, 2015 EAI/ICST Chinacom, 2014 IEEE WCSP, 2013 IEEE Globecom, and 2012 IEEE EUC.
He received the 2016 IEEE ComSoc MMTC Distinguished Leadership Award. He serves or has served on the editorial boards of multiple transactions and journals, including IEEE Transactions on Circuits and Systems for Video Technology, IEEE Wireless Communications Magazine, IEEE Communications Surveys & Tutorials, IEEE Transactions on Multimedia, IEEE Transactions on Signal and Information Processing over Networks, IEEE Access, and Elsevier Ad Hoc Networks, and was elected Chair of the IEEE ComSoc Multimedia Communications Technical Committee (2014-2016). His research interests include cloud computing, green data centres, distributed machine learning, blockchain, big data analytics, multimedia networks, and mobile computing.

AVS3: The Third Generation AVS Video Coding Standard for 8K UHDTV Broadcasting

July 30, 2022 | 3:00 pm - 4:00 pm (China Standard Time)
Prof. Siwei Ma
Peking University, China
Abstract: The Audio and Video Coding Standard (AVS) working group, founded in 2002, has been continuously developing efficient video coding standards for the past two decades. A series of video coding standards and extensions have been published and standardized, renowned for promising coding performance, hardware-friendly design, and a transparent intellectual property rights (IPR) policy. AVS3 is the third-generation video coding standard developed by the AVS workgroup, aimed at emerging 8K ultra-high-definition (UHD) video applications. This talk will give a brief overview of the AVS3 standard, including its development process, key coding tools, performance testing, and deployment in China's 8K UHDTV broadcasting. Compared to the previous AVS2 standard, AVS3 achieves about 40% bit-rate reduction, which is very promising for high-throughput 8K video applications. In January 2021, China's CCTV successfully launched an 8K UHDTV broadcasting channel, and in February 2021 the Spring Festival Gala 8K live show was delivered to over 10 cities across China, enabled by the ultra-fast transmission speed of 5G networks. In February 2022, the 8K livestream of the Beijing Olympic Winter Games opening ceremony and programs was broadcast with the AVS3 coding standard. AVS3 is opening a new era for 8K UHD video applications in China. After AVS3, the AVS workgroup will continue to seek new techniques to further improve coding efficiency. Several directions are currently under investigation, including neural-network-based video coding and machine-vision-oriented video coding. The development direction of future AVS video coding standards will also be discussed.

Bio: Siwei Ma is currently a Boya Distinguished Professor at Peking University. He received the B.Sc. degree from Shandong Normal University in 1999 and the Ph.D. degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences, in 2005. He worked as a postdoc at the University of Southern California from 2005 to 2007, and then joined the Institute of Digital Media, Peking University, where he has been ever since. His research interests are video coding and video processing. He has authored over 300 technical articles in refereed journals and proceedings. Since 2002, he has been actively participating in the definition of AVS national standards and their applications in HDTV and UHDTV broadcasting. As the AVS video group chair, he successfully led the development of the AVS3 video coding standard, which supported the launch of the CCTV-8K channel in China. He was awarded the National Science Fund for Distinguished Young Scholars and the first prize of the National Technology Invention Award in 2020.

Embracing Intelligence in Video Compression

July 30, 2022 | 4:00 pm - 5:00 pm (China Standard Time)
Prof. Mai Xu
Beihang University, China
Abstract: Recently, along with the explosion of multimedia content, visual communications have become increasingly prominent in communication networks, affecting the daily lives of billions of citizens and millions of businesses around the world. The amount of data over networks is expected to grow almost 40-fold in the next five years. Given the limited spectrum, video applications have hit a bandwidth-hungry bottleneck. Pioneering research on delivering only the content humans actually perceive is relieving the bandwidth-hungry issue from the perspective of perceptual compression and coding, in which artificial intelligence (AI) techniques, such as computer vision and machine learning, have been actively studied. In this talk, we mainly focus on perception-inspired video compression, which learns from human intelligence to significantly remove the perceptual redundancy of video data. Specifically, the talk first presents our work on data-driven saliency detection, which can be used to explore the perceptual redundancy of video. Based on saliency detection, we then discuss our approaches to perception-inspired video compression for dramatically removing redundancy, such that both bit-rate and complexity can be significantly reduced without any degradation in quality of experience (QoE). Finally, we briefly introduce our latest work on panoramic video (also called 360-degree video) compression, which improves rate-distortion performance by predicting viewports of panoramic video.


Bio: Mai Xu is a distinguished professor (Yangtze River Scholar) at the School of Electronic Information Engineering, Beihang University. His research interests include video compression and image processing. In the past five years, he has published more than 100 papers in prestigious journals such as IJCV, IEEE TPAMI, TIP, JSAC, and TMM, and renowned conferences such as IEEE CVPR, ICCV, ECCV, ACM MM, AAAI, and DCC. Many of these papers were selected as ESI highly cited papers or highlight papers. As PI, he is supported by many projects, e.g., the Excellent Young Scholar Funding of the National Natural Science Foundation of China and the Distinguished Young Scholar Funding of the Natural Science Foundation of Beijing.

Event 2: Robotics

Program and time table:

June 30 (Thursday), 2022, China Standard Time (CST), UTC +8

09:00 am - 10:00 am | Prof. Louis Phee, Nanyang Technological University, Singapore | Advancing Technologies to Improve Flexible Endoscopy
10:00 am - 11:00 am | Dr. Anthony Vetro, Mitsubishi Electric Research Labs (MERL), USA | Learning Robotic Manipulation for Assembly Tasks
11:00 am - 12:00 pm | Prof. Zhi Liu, Shandong University, China | Aided Detection and Diagnosis of Medical Ultrasound Robot Based on Artificial Intelligence
Host: Prof. Weisi Lin, Nanyang Technological University, Singapore


Advancing Technologies to Improve Flexible Endoscopy

June 30, 2022 | 09:00 a.m. - 10:00 a.m. (China Standard Time)
Prof. Louis Phee
Nanyang Technological University

Abstract: The first flexible endoscope was developed in the 1960s, shortly after the invention of the optical fiber, which allows light to be transmitted through a flexible fiber. The latest endoscopes use miniature CCD or CMOS cameras to capture clearer images. Beyond clearer images, however, the structure and utility of the flexible endoscope have remained largely unchanged over the past 50 years. High accessibility with minimal damage to healthy tissue is the greatest benefit of the flexible endoscope: it allows inspection and simple soft-tissue manipulation of tubular organs without the need for incisions. The speaker has spent the past 20 years developing new technologies to augment and advance the capabilities of the flexible endoscope. With robotics, intricate surgical procedures can now be performed endoscopically, further blurring the line between conventional surgery and endoscopy. Artificial intelligence and deep learning techniques could improve navigation, tissue manipulation, and the diagnosis of diseases. Novel sensors and actuators would enable intelligent scopes that better understand the environment within the organ of interest and actively conform to different medical scenarios. With the incorporation of these state-of-the-art technologies, the future of flexible endoscopy will see more accurate diagnosis and advanced therapeutic means, all delivered from a minimally invasive medical platform that enters the human body via a natural orifice.

Bio: Dr Louis Phee is the Vice President (Innovation & Entrepreneurship) and Dean of the College of Engineering at Nanyang Technological University, Singapore. He is also the Tan Chin Tuan Centennial Professor in Mechanical Engineering and a Fellow of the Singapore Academy of Engineering. He graduated from NTU with the B.Eng and M.Eng degrees, and obtained his PhD from Scuola Superiore Sant'Anna, Pisa, Italy, in 2002 on a European Union scholarship. His research interests include medical robotics and mechatronics in medicine. He was a recipient of the prestigious National Research Foundation (NRF) Investigator Award. Professor Phee is the co-founder of two NTU start-ups and is an advisor and mentor to entrepreneurial faculty and students. He has been awarded the Young Scientist Award, the Outstanding Young Persons of Singapore Award, the Nanyang Outstanding Young Alumni Award, the President's Technology Award, the Nanyang Innovation and Entrepreneurship Award, and the Nanyang Alumni Achievement Award.


Learning Robotic Manipulation for Assembly Tasks

June 30, 2022 | 10:00 a.m. - 11:00 a.m. (China Standard Time)
Dr. Anthony Vetro
Mitsubishi Electric Research Labs (MERL), USA

Abstract: Human-level manipulation is well beyond the capabilities of today's robotic systems. Not only do current industrial robots require significant time to program for a specific task, but they also lack the flexibility to generalize to other tasks and to be robust to changes in the environment. While collaborative robots help to reduce programming effort and improve the user interface, they still fall short on generalization and robustness. This talk will highlight recent advances in a number of key areas that improve the manipulation capabilities of autonomous robots, including methods to accurately model the dynamics of the robot and contact forces, sensors and signal processing algorithms that provide improved perception, optimization-based decision-making and control techniques, as well as new methods of interactivity to accelerate and enhance robot learning.

Bio: Anthony Vetro is a Vice President and Director at Mitsubishi Electric Research Labs, in Cambridge, Massachusetts. He is currently responsible for a wide range of research in the areas of computer vision, speech/audio processing, robotics, control and dynamical systems. In his 25+ years with the company, he has contributed to the transfer and development of several technologies to Mitsubishi products, including digital television receivers and displays, surveillance and camera monitoring systems, automotive equipment, as well as satellite imaging systems. He has published more than 200 papers and has been a member of the MPEG and ITU-T video coding standardization committees for a number of years, serving in numerous leadership roles. He has also been active in various IEEE conferences, technical committees and boards, most recently serving on the Board of Governors of the IEEE Signal Processing Society and as a Senior Area Editor for the Open Journal on Signal Processing. Dr. Vetro received the B.S., M.S. and Ph.D. degrees in Electrical Engineering from New York University. He has received several awards for his work on transcoding and is a Fellow of the IEEE.


    Aided Detection and Diagnosis with Medical Ultrasound Robots Based on Artificial Intelligence

    June 30, 2022 | 11:00 a.m. - 12:00 p.m. (China Standard Time)
    Prof. Zhi Liu
    Shandong University, China


    Abstract: Ultrasound robots combine robotic technology with artificial intelligence (AI) to assist disease detection and diagnosis in clinical practice, and have become an indispensable part of medical technology. Ultrasound is an important basis for clinical diagnosis: it is non-invasive, non-radiative and low-cost, and is widely used in the diagnosis, detection and risk warning of various diseases. However, real-time dynamic ultrasound imaging produces huge and complex data, and traditional manual analysis is easily affected by the subjective factors of doctors. Ultrasound robots can quickly and accurately complete massive data analysis and give objective conclusions, which gives them extremely high application value. This research focuses on ultrasonic robot-assisted diagnosis and detection, covering intravascular ultrasound, carotid artery ultrasound, ultrasound image segmentation, cardiac disease monitoring, remote ultrasound-assisted diagnosis, and mechanical arm-assisted operation, providing novel ideas for ultrasonic robots and promoting the progress of intelligent disease diagnosis and treatment.

    Bio: Zhi Liu (IEEE M'12-SM'20) received the Ph.D. degree from the Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, in 2008. He is currently a Professor with the School of Information Science and Engineering, Shandong University (SDU), where he is also the Dean of the Center of Intelligent Information Processing. His current research interests include applications of computational intelligence to linked multi-component big data systems, medical images in the neurosciences, multimodal human-computer interaction, affective computing, content-based image retrieval, semantic modeling, data processing, classification, and data mining. He has published over 80 papers in international journals, including IEEE Transactions on Instrumentation and Measurement, IEEE Transactions on Medical Imaging, and Medical Image Analysis.


    Event 1: Future Media Computing

    Program and time table:

    May 27 (Friday), 2022, China Standard Time (CST), UTC +8

    TimeDistinguished SpeakerTopicsHost
    9:00-9:10Prof. Wenwu Zhu, Prof. Weisi LinOpening and IntroductionProf. Shuqiang Jiang
    9:10-10:10Prof. Chua Tat-SengChallenges in Multimodal Conversational Search and Recommendation
    10:10-11:10Prof. Wenwu ZhuAutomated Machine Learning on Graphs
    11:10-12:10Prof. Chia-Wen LinMaking the Invisible Visible: Toward High-Quality Deep THz Computational Imaging
    12:10-12:15 Closing


    Challenges in Multimodal Conversational Search and Recommendation

    27 May 2022 | 9:10 a.m. - 10:10 a.m. (China Standard Time)
    Prof. Chua Tat-Seng
    National University of Singapore

    Abstract: Information search has been evolving from mostly unidirectional and text-based to interactive and multimodal. Recently, there has also been growing interest in all matters conversational. Multimodal conversation offers users a natural way to query the system by combining text/speech, images/videos and possibly gesture. It also helps to tackle the basic asymmetry problems by injecting conversation to help resolve ambiguities in search and recommendation. However, the evolution from traditional IR to multimodal conversational search and recommendation (MCSR) faces many challenges. The first set of challenges touches on the basic MCSR models, including how to integrate task-oriented and open-domain models, how to model multimodal context and history, and how to integrate domain knowledge and user models. The second set of challenges involves basic interactivity issues, including how to converse naturally using text and visual modalities, how to incorporate intervention strategies into search and browsing, and how to perform interactive IR, QA and recommendation. The third set of challenges looks further into the future: how to build a dialogue simulator, and how to make MCSR systems extendable and active by allowing the system and users to co-evolve and become more intelligent together. This talk presents current research with pointers towards future research.

    Bio: Dr Chua is the KITHCT Chair Professor at the School of Computing, National University of Singapore (NUS). He is also the Distinguished Visiting Professor of Tsinghua University and the Visiting Pao Yue-Kong Chair Professor of Zhejiang University. Dr Chua was the Founding Dean of the School of Computing from 1998 to 2000. His main research interests include unstructured data analytics, video analytics, conversational search and recommendation, and robust and trustable AI. Dr Chua is the co-Director of NExT, a joint research center between NUS and Tsinghua, and Sea-NExT, a joint lab between Sea Group and NExT. Dr Chua is the recipient of the 2015 ACM SIGMM Achievements Award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications. He is the Chair of the steering committee of the Multimedia Modeling (MMM) conference series, and chaired that of the ACM International Conference on Multimedia Retrieval (ICMR) from 2015 to 2018. He was the General Co-Chair of ACM Multimedia 2005, ACM CIVR (now ACM ICMR) 2005, ACM SIGIR 2008, ACM Web Science 2015, ACM MM-Asia 2020, and the upcoming ACM WSDM 2023 and TheWebConf 2024. He serves on the editorial boards of three international journals. Dr. Chua is the co-Founder of two technology startup companies in Singapore. He holds a PhD from the University of Leeds, UK.


    Automated Machine Learning on Graphs

    27 May 2022 | 10:10 a.m. - 11:10 a.m. (China Standard Time)
    Prof. Wenwu Zhu
    Tsinghua University

    Abstract: Automated machine learning (AutoML) on graphs, which combines the strengths of graph machine learning and AutoML, is gaining attention from the research community. This talk will first overview graph machine learning and AutoML on graphs. Then, recent advances will be discussed, including efficient neural architecture search for self-attention representation, hyper-parameter optimization on large-scale graphs, and increasing explainability in AutoML on graphs. We will also introduce AutoGL, the first dedicated framework and open-source library for AutoML on graphs, which is expected to facilitate research and application in the community. Last but not least, we discuss multimedia applications of automated graph machine learning and share our insights on future research directions with the audience.
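    The hyper-parameter optimization step mentioned in the abstract can be illustrated with a minimal random-search sketch. Everything here is a conceptual stand-in: the search space, the `evaluate` scoring function, and the function names are hypothetical and do not reflect AutoGL's actual API; a real system would train a GNN on a graph dataset inside `evaluate`.

    ```python
    import random

    # Hypothetical search space for a GNN (illustration only).
    SEARCH_SPACE = {
        "hidden_dim": [16, 32, 64, 128],
        "num_layers": [1, 2, 3],
        "learning_rate": [1e-3, 5e-3, 1e-2],
        "dropout": [0.0, 0.3, 0.5],
    }

    def sample_config(rng):
        """Draw one configuration uniformly at random from the search space."""
        return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}

    def evaluate(config):
        """Stand-in for training a model and returning validation accuracy.
        The toy score simply rewards a moderate-capacity configuration."""
        score = 0.6
        score += 0.10 * (config["hidden_dim"] == 64)
        score += 0.10 * (config["num_layers"] == 2)
        score += 0.05 * (config["dropout"] == 0.3)
        return score

    def random_search(trials=20, seed=0):
        """Keep the best configuration seen within a fixed trial budget."""
        rng = random.Random(seed)
        best_config, best_score = None, float("-inf")
        for _ in range(trials):
            config = sample_config(rng)
            score = evaluate(config)
            if score > best_score:
                best_config, best_score = config, score
        return best_config, best_score

    best_config, best_score = random_search()
    print(best_config, best_score)
    ```

    Real AutoML-on-graphs systems replace this uniform sampling with smarter strategies (e.g. Bayesian optimization or evolutionary search) and evaluate candidates on large-scale graph data, which is where the efficiency techniques discussed in the talk come in.
    
    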

    Bio: Wenwu Zhu is currently a Professor in the Computer Science Department of Tsinghua University and Vice Dean of the National Research Center on Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008, and worked at Bell Labs New Jersey as a Member of Technical Staff during 1996-1999. He has been serving as the chair of the steering committee for IEEE Transactions on Multimedia since January 1, 2020. He served as the Editor-in-Chief of IEEE Transactions on Multimedia from 2017 to 2019, and as Vice Editor-in-Chief of IEEE Transactions on Circuits and Systems for Video Technology from 2020 to 2021. He served as co-Chair for ACM MM 2018 and co-Chair for ACM CIKM 2019. His current research interests are in the areas of multimodal big data and intelligence, and multimedia networking. He has received 10 Best Paper Awards. He is a member of Academia Europaea, an IEEE Fellow, AAAS Fellow, and SPIE Fellow.


    Making the Invisible Visible: Toward High-Quality Deep THz Computational Imaging

    27 May 2022 | 11:10 a.m. - 12:10 p.m. (China Standard Time)
    Prof. Chia-Wen Lin
    National Tsing Hua University, Taiwan


    Abstract: Terahertz (THz) computational imaging has recently attracted significant attention thanks to its non-invasive, non-destructive, non-ionizing, material-classifying, and ultra-fast nature for 3D object exploration and inspection. However, its strong water absorption and low noise tolerance lead to undesired blurs and distortions in reconstructed THz images, and the performance of existing methods is highly constrained by the diffraction-limited THz signals. In this talk, we will introduce the characteristics of THz imaging and its applications. We will also show how to break the limitations of THz imaging with the aid of complementary information between the THz amplitude and phase images sampled at prominent frequencies (i.e., the water absorption profile of the THz signal) for THz image restoration. To this end, we propose a novel physics-guided deep neural network design, namely the Subspace-Attention-guided Restoration Network (SARNet), that fuses such multi-spectral features of THz images for effective restoration. Furthermore, we experimentally construct an ultra-fast THz time-domain spectroscopy system covering a broad frequency range from 0.1 THz to 4 THz for building up a temporal/spectral/spatial/phase/material THz database of hidden 3D objects.

    Bio: Prof. Chia-Wen Lin is currently a Professor with the Department of Electrical Engineering, National Tsing Hua University (NTHU), Taiwan. He also serves as Deputy Director of the AI Research Center of NTHU. His research interests include image/video processing, computer vision, and video networking.

    Dr. Lin is an IEEE Fellow, and has been serving on the IEEE Circuits and Systems Society (CASS) Fellow Evaluating Committee since 2021. He serves as an IEEE CASS Board of Governors Member-at-Large for 2022-2024. He was Steering Committee Chair of IEEE ICME (2020-2021), an IEEE CASS Distinguished Lecturer (2018-2019), and President of the Chinese Image Processing and Pattern Recognition (IPPR) Association, Taiwan (2019-2020). He has served as Associate Editor of IEEE Transactions on Image Processing, IEEE Transactions on Multimedia, IEEE Transactions on Circuits and Systems for Video Technology, and IEEE MultiMedia, and as a Steering Committee member of IEEE Transactions on Multimedia. He was Chair of the Multimedia Systems and Applications Technical Committee of the IEEE CASS. He served as TPC Chair of IEEE ICME in 2010 and IEEE ICIP in 2019, and as Conference Chair of IEEE VCIP in 2018. His papers won the Best Paper Award of IEEE VCIP 2015 and the Young Investigator Award of VCIP 2005.

