Artificial Intelligence
AI Finds Drug Candidate for Liver Cancer in 30 Days
New AI shortens drug discovery to mere days.
Posted January 25, 2023 Reviewed by Ekua Hagan
Key points
- The average drug discovery and development process takes over a decade.
- Scientists have broken new ground with the AI discovery of a novel drug candidate for liver cancer in just 30 days.
- Worldwide, liver cancer was one of the leading causes of cancer mortality in 2020, with over 830,000 deaths.
A major artificial intelligence (AI) breakthrough may herald a new era in drug discovery, one that could reshape the biotechnology, pharmaceutical, healthcare, and life sciences industries. Scientists have broken new ground with the AI discovery of a novel drug candidate for liver cancer in just 30 days. The study was published this month in Chemical Science.
“This work is the first demonstration of applying AlphaFold to the hit identification process in drug discovery,” wrote senior authors Alex Zhavoronkov, Ph.D., founder and CEO of Insilico Medicine; Alán Aspuru-Guzik, Ph.D., professor of computer science and chemistry at the University of Toronto and director of the University of Toronto’s Acceleration Consortium; and Michael Levitt, Ph.D., Nobel Prize laureate in chemistry in 2013 and professor of structural biology at the Stanford University School of Medicine, along with their research colleagues.
The most common type of primary liver cancer is hepatocellular carcinoma (HCC). Worldwide, liver cancer was one of the leading causes of cancer mortality in 2020, with over 830,000 deaths, alongside lung cancer (1.8 million deaths), colon and rectum cancers (916,000 deaths), stomach cancer (769,000 deaths), and breast cancer (685,000 deaths), according to the World Health Organization (WHO).
The average drug discovery and development process takes over a decade, and only a tiny fraction of candidates make it beyond the initial phase. This breakthrough study demonstrates how AI can shorten early-stage drug discovery from years to days, a potentially vast improvement that may disrupt the pharmaceutical industry.
Drug research and development require significant monetary resources. The estimated cost of discovering and developing a new FDA-approved pharmaceutical drug or biologic is USD 2.87 billion, according to a study published in the Journal of Health Economics by researchers affiliated with the Tufts Center for the Study of Drug Development at Tufts University, Duke University, and the Simon Business School at the University of Rochester.
Bringing a drug to market in the U.S. begins with drug discovery and development, followed by preclinical research studies to determine whether or not a drug should be tested in humans. Clinical research is then conducted through Phase I, Phase II, and Phase III trials in human participants. If Phase III is completed successfully, then the drug developer may file for regulatory review and approval with the Food and Drug Administration (FDA). The overall likelihood of approval from Phase I was only 7.9 percent, and it took an average of 10.5 years for a Phase I program to reach regulatory approval during 2011–2020, according to the Clinical Development Success Rates and Contributing Factors 2011–2020 report by BIO, Pharma Intelligence, and QLS Advisors LLC. The 10.5-year average breaks down by phase into 2.3 years for Phase I, 3.6 years for Phase II, 3.3 years for Phase III, and 1.3 years for the regulatory filing.
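The phase durations quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the figures are the ones cited from the report, while the variable names and the percentage breakdown are added here for illustration.

```python
# Average duration of each stage in years, as quoted from the 2011-2020 report.
phase_years = {
    "Phase I": 2.3,
    "Phase II": 3.6,
    "Phase III": 3.3,
    "Regulatory filing": 1.3,
}

# Round to one decimal place to avoid floating-point noise in the sum.
total_years = round(sum(phase_years.values()), 1)

# Share of the overall timeline spent in each stage, as a percentage.
phase_share = {
    stage: round(100 * years / total_years, 1)
    for stage, years in phase_years.items()
}

for stage, years in phase_years.items():
    print(f"{stage:<18} {years:>4.1f} years  ({phase_share[stage]:>4.1f}%)")
print(f"{'Total':<18} {total_years:>4.1f} years")
```

The four stages add up to the 10.5-year average, with Phase II taking the largest share of the timeline.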
“In 2022, Insilico nominated nine preclinical candidates out of its AI engine, eight for internal and one for a partner,” said Alex Zhavoronkov, founder and CEO at Insilico Medicine. “If you compare this total with any big pharmaceutical company's performance, it is very impressive since it is a comparable number which comes at a tiny fraction of the cost. Many of these are novel targets and some are challenging targets demonstrating that the generative AI can now perform well in both biology and chemistry. We also got the top line data from the first AI-discovered and AI-designed antifibrotic.”
The researchers used PandaOmics to identify the top 20 targets using text and omics data from 10 hepatocellular carcinoma datasets. After filtering the targets for safety, tissue specificity, accessibility by biologics, small molecule accessibility, and novelty, Cyclin-dependent kinase 20 (CDK20) was selected as an initial target.
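The filtering step can be pictured as a simple pass/fail screen followed by a ranking. The sketch below is purely illustrative and is not PandaOmics code: the target names, scores, and flag values are hypothetical placeholders, and only the five filter criteria are taken from the description above.

```python
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    disease_score: float        # hypothetical 0-1 disease association score
    safe: bool
    tissue_specific: bool
    biologic_accessible: bool
    small_molecule_accessible: bool
    novel: bool


# The five filter criteria named in the article, as attribute names.
FILTERS = (
    "safe",
    "tissue_specific",
    "biologic_accessible",
    "small_molecule_accessible",
    "novel",
)

# Hypothetical candidates; none of these values come from the study.
candidates = [
    Target("TARGET_A", 0.91, True, True, True, True, False),
    Target("TARGET_B", 0.84, True, True, True, True, True),
    Target("TARGET_C", 0.88, False, True, True, True, True),
    Target("TARGET_D", 0.79, True, True, True, True, True),
]


def shortlist(targets):
    """Keep targets that pass every filter, ranked by disease association."""
    passing = [t for t in targets if all(getattr(t, f) for f in FILTERS)]
    return sorted(passing, key=lambda t: t.disease_score, reverse=True)


selected = shortlist(candidates)
print([t.name for t in selected])
```

In this toy data, the highest-scoring candidate is dropped because it is not novel, which mirrors how a strong disease association alone is not enough to make the final list.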
“Cyclin-dependent kinase 20 (CDK20) was finally selected as our initial target to work on due to its strong disease association, limited experimental structure information, and shortage of approved drugs or clinical compounds in the context of any disease during the last three years,” the researchers wrote.
To obtain a predicted structure for the CDK20 protein, the researchers used the AlphaFold Protein Structure Database, an open-access database with over 200 million protein structure predictions developed by DeepMind and the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). The AlphaFold Protein Structure Database contains the proteomes of 48 organisms, including human, mouse, rat, worm, yeast, zebrafish, corn, rice, and bacteria. The term proteome is a combination of the words protein and genome. A proteome refers to the full complement of proteins that can be expressed by the genetic material of an organism, or in a particular cell type at a particular time under defined environmental conditions.
The sequence of a protein's amino acids determines its three-dimensional (3D) shape and function. AlphaFold is AI software created by Alphabet-owned DeepMind Technologies, based in London, that predicts the 3D structure of a protein from its one-dimensional amino acid sequence. At inference, AlphaFold takes amino acid sequences, templates, and multiple sequence alignments as input and outputs atom coordinates, per-residue confidence scores, and a distogram, which is a histogram of predicted distances between pairs of amino acid residues. In 2020, AlphaFold made history by solving a 50-year-old grand challenge in biology known as the protein-folding problem with its highly accurate protein structure predictions.
The scientists used the AlphaFold-predicted CDK20 structure and Chemistry42, a fully automated machine-learning platform, to generate 8,918 molecules, of which seven were selected for synthesis and biological testing. Of the seven, compound ISM042-2-001 looked the most promising in a CDK20 kinase binding assay. A second round of compound generation, synthesis, and testing yielded a more potent hit molecule with nanomolar potency, called ISM042-2-048.
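As a rough illustration of the distogram idea, the distance between every pair of residues can be sorted into bins. The sketch below is a simplification under stated assumptions: the coordinates are toy values rather than a real protein, the bin width and bin count are arbitrary choices, and AlphaFold itself predicts a probability distribution over distance bins for each residue pair rather than computing bins from known coordinates.

```python
import math

# Hypothetical 3D positions (in angstroms) for five residues; toy data only.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]


def binned_distance_matrix(points, bin_width=2.0, n_bins=6):
    """Return an N x N matrix holding the distance-bin index of each residue pair.

    Bin k covers distances from k * bin_width up to (k + 1) * bin_width;
    the last bin also collects every longer distance.
    """
    n = len(points)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            distance = math.dist(points[i], points[j])
            matrix[i][j] = min(int(distance // bin_width), n_bins - 1)
    return matrix


bins = binned_distance_matrix(coords)
for row in bins:
    print(row)
```

Each row shows how far one residue sits from all the others, in bin units. Residues that are close in 3D space land in low-numbered bins even when they are far apart in the sequence, which is the kind of signal a structure predictor has to capture.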
“Empowered by Chemistry42 and AlphaFold predicted protein structures, it took us only 30 days to discover our first hit,” the researchers wrote.
“I never imagined that we would go so far by ourselves without a pharma partner,” said Zhavoronkov. “And unlike most other companies in the field, we make our platform available for licensing by others. You can think of it as selling AI software. I am also happy to report that 10 out of the top 20 pharmaceutical companies now have used our platform. Having Pharma.AI platform in-house is now a sign of big pharma's competence in AI. But we are not stopping here. We started new research initiatives in quantum computers and also launched the first sixth-generation fully-robotic target discovery lab.”
Copyright © 2023 Cami Rosso. All rights reserved.