Alán Aspuru-Guzik and Christine Allen use AI to fast-track drug formulation development

In a bid to reduce the time and cost associated with developing promising new medicines, University of Toronto scientists have successfully tested the use of artificial intelligence to guide the design of long-acting injectable drug formulations.

The study, published this week in Nature Communications, was led by Professor Christine Allen in the Leslie Dan Faculty of Pharmacy and Alán Aspuru-Guzik in the departments of chemistry and computer science in the Faculty of Arts & Science.

Their multidisciplinary research shows that machine-learning algorithms can be used to predict experimental drug release from long-acting injectables (LAIs) and can also help guide the design of new LAIs.

“This study takes a critical step towards data-driven drug formulation development with an emphasis on long-acting injectables,” said Allen, who is a member of U of T’s Acceleration Consortium, a global initiative that uses artificial intelligence and automation to accelerate the discovery of materials and molecules needed for a sustainable future.

“We’ve seen how machine learning has enabled incredible leap-step advances in the discovery of new molecules that have the potential to become medicines. We are now working to apply the same techniques to help us design better drug formulations and, ultimately, better medicines.”

Considered one of the most promising therapeutic strategies for the treatment of chronic diseases, long-acting injectables are a class of advanced drug delivery systems that are designed to release their cargo over extended periods of time to achieve a prolonged therapeutic effect. This approach can help patients better adhere to their medication regimen, reduce side effects and increase efficacy when injected close to the site of action in the body.

However, achieving the optimal amount of drug release over the desired period of time requires the development of a wide array of formulation candidates through extensive and time-consuming experiments. This trial-and-error approach has created a significant bottleneck in LAI development compared to more conventional types of drug formulation.

“AI is transforming the way we do science. It helps accelerate discovery and optimization. This is a perfect example of a ‘before AI’ and an ‘after AI’ moment and shows how drug delivery can be impacted by this multidisciplinary research,” said Aspuru-Guzik, who is director of the Acceleration Consortium and holds the CIFAR Artificial Intelligence Research Chair at the Vector Institute in Toronto and the Canada 150 Research Chair in Theoretical and Quantum Chemistry.

Reducing ‘trial and error’ for new drug development

To investigate whether machine-learning tools could accurately predict the rate of drug release, the research team trained and evaluated a series of 11 different models, including multiple linear regression (MLR), random forest (RF), light gradient boosting machine (lightGBM) and neural networks (NN). The data set used to train the selected panel of machine learning models was constructed from previously published studies by the authors and other research groups.

“Once we had the data set, we split it into two subsets: one used for training the models and one for testing,” said Pauric Bannigan, research associate with the Allen research group at the Leslie Dan Faculty of Pharmacy. “We then asked the models to predict the results of the test set and directly compared with previous experimental data. We found that the tree-based models, and specifically lightGBM, delivered the most accurate predictions.”
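The split-and-compare workflow Bannigan describes can be sketched roughly as follows. This is a minimal illustration, not the study's code: the data are synthetic, the single input feature and the linear relationship are assumptions made for the example, and the two stand-in models (a mean baseline and a one-variable least-squares fit) replace the 11 models the team actually evaluated. The helper names are hypothetical.

```python
import random
from statistics import mean


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_mean(train):
    """Baseline model: always predict the average training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_linear(train):
    """One-variable least-squares fit (a stand-in for the study's models)."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in train) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_absolute_error(model, test):
    """Average absolute gap between predictions and held-out values."""
    return mean(abs(model(x) - y) for x, y in test)


# Synthetic data (an assumption for illustration only): one input feature
# and a noisy "release" value that depends on it.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    x = data_rng.uniform(0, 20)
    rows.append((x, 10.0 + 3.0 * x + data_rng.gauss(0, 1.0)))

train, test = train_test_split(rows, test_fraction=0.2, seed=0)

# Train each candidate on the training subset only, then score it on the
# held-out test subset.
models = {"mean": fit_mean(train), "linear": fit_linear(train)}
scores = {name: mean_absolute_error(m, test) for name, m in models.items()}
best = min(scores, key=scores.get)

print(scores)
print("most accurate on the test set:", best)
```

Scoring every model on data it never saw during training is what makes the comparison fair: a model that merely memorizes its training examples gains no advantage on the held-out subset.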

As a next step, the team illustrated how machine-learning models might inform the design of new LAIs, using advanced analytical techniques to extract design criteria from the lightGBM model. These criteria guided the design of a new LAI formulation for a drug currently used to treat ovarian cancer.

Expectations around the speed with which new drug formulations are developed have heightened drastically since the onset of the COVID-19 pandemic.

“We’ve seen in the pandemic that there was a need to design a new formulation in weeks, to catch up with evolving variants. Allowing for new formulations to be developed in a short period of time, relative to what has been done in the past using conventional methods, is crucially important so that patients can benefit from new therapies,” Allen said, explaining that the research team is also investigating using machine learning to support the development of novel mRNA and lipid nanoparticle formulations.

More robust databases needed for future advances

The results of the current study signal the potential for machine learning to reduce reliance on trial-and-error testing. However, Allen and the research team note that the lack of available open-source data sets in pharmaceutical sciences represents a significant challenge to future progress.

“When we began this project, we were surprised by the lack of data reported across numerous studies using polymeric microparticles,” Allen said. “This meant the studies and the work that went into them couldn’t be leveraged to develop the machine learning models we need to propel advances in this space. There is a real need to create robust databases in pharmaceutical sciences that are open access and available for all so that we can work together to advance the field.”

To that end, Allen and the research team have published their datasets and code on the open-source platform Zenodo.

“For this study our goal was to lower the barrier of entry to applying machine learning in pharmaceutical sciences,” Bannigan said. “We’ve made our data sets fully available so others can hopefully build on this work. We want this to be the start of something and not the end of the story for machine learning in drug formulation.”

The study was supported by the Natural Sciences and Engineering Research Council of Canada, the Defense Advanced Research Projects Agency and the Vector Institute.