Application Porting and Resource Scheduling for Super-intelligent Computing Systems by Prof Tang Shanjiang

13 Nov 2025 11.00 AM - 12.00 PM LT4 Current Students, Industry/Academic Partners

Abstract

With the advancement of China's "East Data, West Computing" project and the construction of exascale (E-class) supercomputers, a key challenge is how to use this world-class computing power efficiently and solve the "last mile" problem between "system construction" and "application usability". This talk focuses on two core aspects of large-scale super-intelligent computing systems: application porting and optimization, and resource scheduling and management. It systematically introduces our innovations and practice in building a self-reliant and controllable full-stack technology system.

In terms of application porting and optimization, we have made key breakthroughs in fields of major national importance, such as astronomy, meteorology, and fluid simulation, targeting domestically produced heterogeneous multi-core architectures. By developing specialized operators, efficient compression, and mixed-precision computation techniques, we have reduced the astronomical cross-validation computation time to the order of seconds, enabled time-efficient assimilation and simulation for meteorological forecasting, and supported efficient visualization of petabyte-scale flow field data. Together, these results significantly reduce porting costs and improve scalability to millions of cores.

In terms of resource scheduling optimization, we propose innovative scheduling strategies to achieve efficient and fair utilization of computing resources. For cloud computing environments, we propose a game-theory-based soft fair resource allocation method. For HPC systems, we have developed AI-based automatic I/O tuning technology, which has improved the performance of some tasks by more than 20 times. In addition, we are building a cross-domain collaborative scheduling mechanism for heterogeneous computing power to meet the complex requirements of future intelligent computing systems.

This work has been successfully applied to major projects such as the National Astronomical Data Center and the Numerical Wind Tunnel, forming a full-stack technology system that spans from low-level applications to system scheduling and has effectively promoted the practical adoption of domestic supercomputing.

Biography

Tang Shanjiang is currently an associate professor and doctoral supervisor at the School of Computer Science and Technology, Tianjin University. He is a visiting scholar at Pengcheng National Laboratory and a "Pillar" of the Ministry of Education-Huawei "Intelligent Base" program. He was awarded the ACM SIGHPC China New Star Award in 2022. He also serves as a high-performance computing expert at the National Supercomputing Center in Tianjin, an expert in the Smart Hospital Co-construction Project of Tianjin First Central Hospital, and a member of the Expert Committee of the Key Laboratory of Medical Journal Knowledge Mining and Service of the National Press and Publication Administration. He obtained a PhD in Computer Engineering from Nanyang Technological University, Singapore, in 2015, and a Bachelor's degree in Software Engineering and a Master's degree in Computer Software and Theory from Tianjin University in 2008 and 2010, respectively. His long-term research interests include high-performance computing, intelligent computing, and big data. He has published over 60 academic papers in international conferences and journals such as SC, TKDE, TPDS, TSC, and TCC, and has led two National Natural Science Foundation projects, two key sub-projects of the Ministry of Science and Technology, and one key project of the Tianjin Natural Science Foundation.