Electricity use by data centres has skyrocketed in recent years, fuelled by the demand for mission-critical information and communications technology (ICT) infrastructure. Sustaining such rapid growth while lowering the overall carbon footprint is a challenge. At the same time, the increasing complexity of data centre management has led to more unplanned data centre outages, resulting in considerable economic losses.
Optimising data centre operations
A digital twin is a virtual representation that serves as the real-time digital counterpart of a physical object, in this case, a data centre. It provides an accurate and intuitive 3D simulation platform that allows experts to better grasp information about the conditions—for instance, temperature and air flow rate—of the data centre hall and quickly pinpoint anomalies. In addition, a high-fidelity digital twin is able to generate massive amounts of synthesised data to augment datasets for AI algorithms.
Building on the data from the digital twin, our AIoT offers three tiers of intelligence. First, descriptive AI can accurately model the internal behaviour of the system based on historical and online data. Second, on the prescriptive level, moves to improve system management and efficiency can be proposed and then safely verified and validated on the cyber system before implementation. Finally, through predictive AI, we can forecast system behaviours with hypothetical inputs to anticipate data centre anomalies and failures.
Using readings from the data centre infrastructure management system, DCWiz offers high-precision, high-safety and efficient “what-if” analyses with an easy-to-understand user interface, on top of an automated cyber-physical control loop.
DCWiz in the wild
Our team has conducted successful proof-of-concept trials of DCWiz in both China and Singapore. In China, DCWiz was deployed by Alibaba Group in 2018 during its “Double Eleven” cybersales day, an event where Alibaba handled more than 13,000 transactions per second and hit a sales revenue of US$43 billion.
The fully automated digital twin calibration process was able to achieve accuracy to within ±0.5°C. With no prior maintenance required, the digital twin shortened the testing duration from one month to a mere week, saving the company tremendous operating costs in the process. Alibaba hailed the DCWiz solution as a “from zero to one” breakthrough in digitalising, optimising and automating data centre operations and management.
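To make the calibration tolerance concrete, the short Python sketch below checks whether a set of simulated temperatures stays within 0.5°C of sensor readings. The article does not say how the accuracy was measured, so the use of maximum absolute error, the function names and the sample readings are all assumptions made for illustration, not the DCWiz method.

```python
from typing import Sequence

# Tolerance quoted in the article; how it was measured is an assumption here.
TOLERANCE_C = 0.5


def max_abs_error(predicted_c: Sequence[float], measured_c: Sequence[float]) -> float:
    """Largest absolute gap between twin predictions and sensor readings."""
    if len(predicted_c) != len(measured_c) or not predicted_c:
        raise ValueError("need two non-empty sequences of equal length")
    return max(abs(p - m) for p, m in zip(predicted_c, measured_c))


def within_tolerance(
    predicted_c: Sequence[float],
    measured_c: Sequence[float],
    tolerance_c: float = TOLERANCE_C,
) -> bool:
    """True if every prediction is within tolerance_c of its measurement."""
    return max_abs_error(predicted_c, measured_c) <= tolerance_c


if __name__ == "__main__":
    # Hypothetical readings at three sensor locations, in degrees Celsius.
    twin_c = [22.1, 24.8, 27.3]
    sensor_c = [22.4, 24.5, 27.6]
    print("max error (C):", round(max_abs_error(twin_c, sensor_c), 2))
    print("calibrated:", within_tolerance(twin_c, sensor_c))
```

A stricter or looser criterion, such as root-mean-square error, could be swapped in without changing the structure of the check.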
In Singapore, a trial was conducted at the enterprise-scale data centres of the National Supercomputing Centre. Here, DCWiz improved the power usage effectiveness from 1.35 to 1.3 for 40 server racks, with accompanying energy cost savings of S$6,000 (US$4,500) per month. With the help of DCWiz, the supercomputing centre achieved energy savings of 15% for an air-cooled system and 30% for a water-cooled system.
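Power usage effectiveness (PUE) is the ratio of total facility energy to the energy consumed by the IT equipment alone, so a value of 1.0 would mean no overhead at all. The Python sketch below works through what a drop from 1.35 to 1.3 implies, assuming the IT load stays constant: total facility power falls by roughly 3.7%, while the non-IT overhead (cooling, power distribution and so on) falls by roughly 14%. The function names and the 100 kW IT load are hypothetical, and these derived percentages are not the measured savings reported in the trial.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power usage effectiveness: total facility power over IT power."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw


def facility_saving_fraction(pue_before: float, pue_after: float) -> float:
    """Fraction of total facility power saved, assuming constant IT load."""
    return (pue_before - pue_after) / pue_before


def overhead_saving_fraction(pue_before: float, pue_after: float) -> float:
    """Fraction of non-IT overhead power saved, assuming constant IT load."""
    return (pue_before - pue_after) / (pue_before - 1.0)


if __name__ == "__main__":
    it_load_kw = 100.0  # hypothetical IT load, for illustration only
    before_kw = 1.35 * it_load_kw
    after_kw = 1.30 * it_load_kw
    print(f"PUE before: {pue(before_kw, it_load_kw):.2f}")
    print(f"PUE after:  {pue(after_kw, it_load_kw):.2f}")
    print(f"facility saving: {facility_saving_fraction(1.35, 1.30):.1%}")
    print(f"overhead saving: {overhead_saving_fraction(1.35, 1.30):.1%}")
```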
Widely recognised in industry and academia, DCWiz has won a series of prestigious awards—such as the 2020 IEEE TCCPS Industrial Technical Excellence Award, 2016 ASEAN ICT Award (Gold Medal), and 2015 DCD APAC Award—in addition to the Nanyang Research Award, NTU’s top research award, in 2020.
We are currently developing a minimum viable product, with all the essential components of DCWiz, that will be integrated into a cloud-based platform. Our plans include a series of proof-of-value trials with local partners, followed by commercialisation of DCWiz through a spin-off company.
By Wen Yonggang, Anna Chua and Yang Fan
Wen Yonggang is Associate Dean (Research) at NTU’s College of Engineering.
Dr Anna Chua is Assistant Director in Business Development at NTU’s College of Engineering, and Yang Fan is a research associate in NTU’s School of Computer Science and Engineering (SCSE).