From left: Mr Hu Qinghao (PhD, Year 3 at S-Lab), Ms Zhang Meng (PhD, Year 1 at S-Lab), Prof Zhang Tianwei (Assistant Professor, SCSE), Prof Wen Yonggang (Associate Dean, College of Engineering).
A paper from NTU has been accepted at the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2023. This is the first time NTU has appeared among ASPLOS authors' affiliations, and it represents an important milestone in advancing computer architecture and systems research in Singapore.
The paper, “Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs”, was co-authored by Prof Wen Yonggang, Prof Zhang Tianwei and PhD students Hu Qinghao and Zhang Meng. The work presents a novel deep learning workload scheduler for large-scale GPU datacenters. It addresses substantial shortcomings of existing schedulers, including inflexible, intrusive designs, high integration and maintenance costs, limited scalability, and opaque decision processes. It achieves the following desirable properties in practical deployment: (1) efficient non-intrusive scheduling; (2) low deployment cost; (3) model performance preservation; (4) scalability to large-scale clusters; (5) transparent system tuning.
ASPLOS is the premier forum for interdisciplinary systems research, intersecting computer architecture, hardware and emerging technologies, programming languages and compilers, operating systems, and networking. The conference will be held at the end of March 2023 in Vancouver, Canada.
Congratulations to all the authors!