Published on 24 May 2024

Boosting the discovery of new medicines

Assoc Prof Mu Yuguang makes virtual drug screening a reality with machine learning.

Pharmacies house a staggering array of medicines to treat anything from headaches to infections. But these consumer-ready products only represent a fraction of the millions or even billions of chemicals that are screened and tested during drug discovery – the long and costly process of finding new medicines.

With the advent of computational chemistry and the availability of high throughput biology tools, drug discovery has become significantly more efficient and feasible. A standard tool used in the early stages of drug discovery is a set of algorithms that predict how a chemical (known as a ligand) will interact with its biological target.

Known as structure-based ligand identification, this process virtually screens massive databases of potential ligands and matches them to target candidates that are most likely to produce a therapeutic effect.

A critical factor determining the success of virtual screening is how strongly a ligand binds to the biological target, a quantity estimated by what is known as a score function.

Methods using machine learning-based artificial intelligence (AI) can be harnessed to estimate the score function. However, such methods have lacked the accuracy required to discriminate the nuances of ligand-target binding, such as short- versus long-range interactions.

A ligand called 1,3-benzodioxole-5-carboxylic acid (yellow) is a potential drug for treating tuberculosis. Using our OnionNet tool, we predicted its binding strength to its biological target, the enzyme pantothenate synthetase (white), which plays a role in the disease. Our model produced comparable results to a previous experimental method that was based on the structure of the molecules. Credit: NTU.


Our laboratory developed a method called OnionNet – an AI model that considers both short- and long-range ligand-target interactions and the properties of the elements involved to more accurately estimate the strength of binding.
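As a rough illustration only, and not the published OnionNet implementation, the idea of capturing both short- and long-range interactions can be sketched by counting pairs of protein and ligand atoms, grouped by element, within successive distance shells around the ligand. The function name, shell boundaries and toy coordinates below are all assumptions made for this example.

```python
import math
from collections import Counter


def shell_contact_counts(protein_atoms, ligand_atoms, shell_edges):
    """Count protein-ligand element pairs falling in each distance shell.

    Atoms are (element, (x, y, z)) tuples. shell_edges are ascending
    distances in angstroms, so shell i covers [edges[i], edges[i + 1]).
    """
    counts = Counter()
    for p_elem, p_xyz in protein_atoms:
        for l_elem, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)
            for i in range(len(shell_edges) - 1):
                if shell_edges[i] <= d < shell_edges[i + 1]:
                    counts[(i, p_elem, l_elem)] += 1
                    break
    return counts


# Toy coordinates, not taken from any real structure.
protein = [("N", (0.0, 0.0, 0.0)), ("O", (6.0, 0.0, 0.0))]
ligand = [("C", (2.0, 0.0, 0.0)), ("O", (0.0, 3.5, 0.0))]
edges = [0.0, 3.0, 6.0, 9.0]  # illustrative shell boundaries (assumption)

features = shell_contact_counts(protein, ligand, edges)
for key in sorted(features):
    print(key, features[key])
```

Each key records a shell index and an element pair, so inner shells describe close contacts while outer shells describe longer-range surroundings. A model could then be trained on such counts to estimate binding strength.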

Stacked up against comparable methods, OnionNet demonstrated clear improvements when predicting binding affinities based on experimentally determined data.

Taking it a step further, we sought to turn the concepts behind OnionNet into a practical tool. In our follow-up work, named OnionNet-SFCT, we applied this tool to AutoDock Vina, a traditional method of predicting score functions. Doing so helped correct mistakes and considerably improved AutoDock Vina’s ability to predict the binding strength of ligands to their targets when evaluated against benchmarks.
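The general idea of correcting a traditional score with a learned term can be sketched as follows. This is a minimal illustration under stated assumptions: the additive form, the weight of 0.5, the field names and the numbers are all hypothetical, and the published method may combine the two scores differently.

```python
def rescore(poses, weight=0.5):
    """Return poses sorted by combined score, where lower is better.

    base_score stands in for a traditional docking score and correction
    for a hypothetical learned penalty; both are illustrative values.
    """
    def combined(pose):
        return pose["base_score"] + weight * pose["correction"]

    return sorted(poses, key=combined)


# Made-up candidate binding poses for a single ligand.
poses = [
    {"name": "pose_a", "base_score": -8.0, "correction": 3.0},
    {"name": "pose_b", "base_score": -7.5, "correction": 0.5},
]

ranked = rescore(poses)
print([pose["name"] for pose in ranked])
```

With the correction switched off (`weight=0.0`), `pose_a` ranks first on the base score alone. Once the penalty is included, `pose_b` moves ahead, which is the kind of re-ranking a correction term is meant to achieve.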

This same procedure was successfully used in reverse virtual screening, a process in which the binding of different biological targets to a known ligand is tested virtually. We were able to identify multiple known targets for a stress hormone in plants, suggesting that the approach has the potential to help find new targets for existing drugs.

Using a tool called OnionNet-SFCT, we predicted the binding strength of a potential anti-cancer metabolite from plants called ursolic acid (the white, pink and red ligand) to a protein called SPG21 or maspardin (the green biological target), which may have a role in activating cancer fighting immune cells. Credit: NTU.


The binding strength of a ligand to its biological target is only one of many factors determining its success as a drug. Other important properties are its toxicity and how well the body absorbs the molecule.

Previous models only predicted a single molecular property because of the complexity involved. This has necessitated numerous models to predict various properties of a molecule in parallel.

Our lab is looking at using a type of machine learning called graph convolutional networks to develop a single model that can represent a molecule’s complexity to make predictions for several properties simultaneously. By cross-analysing these variables, we can predict how the ligand would perform in a complex physiological context.
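To give a feel for how one model can serve several properties at once, here is a heavily simplified sketch, not the lab's model. A molecule is treated as a graph of atoms, each atom's features are averaged with those of its bonded neighbours, and the pooled result feeds one output "head" per property. Real graph convolutional networks also apply learned weight matrices and non-linearities, which are omitted here. All names, features and weights are assumptions for illustration, and the weights are untrained.

```python
def graph_conv(features, neighbours):
    """One simplified step: average each atom's features with its neighbours'."""
    updated = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in neighbours[i]]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


def predict_properties(features, neighbours, heads):
    """Share one molecule embedding across several property heads."""
    hidden = graph_conv(features, neighbours)
    # Sum-pool the atom vectors into a single molecule-level embedding.
    embedding = [sum(col) for col in zip(*hidden)]
    return {
        name: sum(w * x for w, x in zip(weights, embedding))
        for name, weights in heads.items()
    }


# Toy three-atom chain (0-1-2) with two made-up features per atom.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
neighbours = {0: [1], 1: [0, 2], 2: [1]}
# Hypothetical, untrained weights for two property heads.
heads = {"toxicity": [1.0, 0.0], "absorption": [0.0, 1.0]}

predictions = predict_properties(features, neighbours, heads)
print(predictions)
```

Because both heads read the same embedding, whatever the shared layers capture about the molecule is available to every property prediction, which is the appeal of a single multi-property model over many separate ones.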

Going forward, we will also explore new AI techniques to further accelerate drug discovery.

By Mu Yuguang and Hilbert Lam

Assoc Prof Mu Yuguang from NTU’s School of Biological Sciences (SBS) studies how peptides, proteins, DNA and RNA fold into specific shapes and structures. He also develops methods for simulating and analysing molecular dynamics and uses machine learning to discover new drugs.

Hilbert Lam is an SBS research assistant and an NTU biological sciences undergraduate.

Details of the research cited can be found in Journal of Integrative Plant Biology (2023), DOI: 10.1111/jipb.13469; Briefings in Bioinformatics (2022), DOI: 10.1093/bib/bbac051; and ACS Omega (2019), DOI: 10.1021/acsomega.9b01997.

The article appeared first in NTU's research and innovation magazine Pushing Frontiers (issue #22, August 2023).