I view the applications of AI in unlocking key insights about what goes on in our bodies, what causes diseases, and discovering or designing new treatments one of the most promising areas for AI to make an impact in healthcare. There is so much data to analyze to figure out what is really going on that only AI can help us in this endeavor. Our genes, microbiomes, proteins, RNA, other molecules, and a whole lot of other things represent billions and trillions of data that need to be analyzed. I can assure you that our minds are not capable of uncovering these relationships and extracting the key insights we need. Also, our existing lab-based research or analytic tools will take a long time to do this. That is how we’ve been able to make progress so far but we want to move faster and learn more.
For AI to help us to understand the causes of diseases, it will need access to huge amounts of data that are currently either inaccessible or don’t exist yet. But, everyday we are measuring more things and collecting more data. Starting off with historical data from past experiments, this process can be initiated with AI models to try to identify relationships and patterns in that data. AI platforms are cutting down on discovery time by mining medical data—including omics data, scientific literature, and clinical trials data—to identify new drug targets and predict optimal drug designs. They can lower pre-discovery costs by up to 90% and help us to better understand drug mechanisms and the structures of diseases.
Built on language models, AI-powered generators such as ChatGPT can be made to read the codes that make up the building blocks of the human body. Research has found that it’s possible to develop a generalizable program— a “genomic language model”—that could be applied to a variety of different tasks, instead of requiring scientists to build fit-for-purpose AIs to chase answers for each major biological question. For example, Nvidia has been working with the synthetic biology company Evozyne to build a large language model focused on constructing never-before-seen proteins.
Putting genomics front and center could help us to better understand the biology of diseases, which would drive efficiencies and cut research and development costs while allowing us to bring targeted drugs to patients in record time. Used in combination with genomics, AI could help pharma companies to develop new drugs for rare diseases. The rarer a disease is, the smaller the market is and so the less likely it is to have been addressed. Big pharma is hesitant to take on the high development costs for new drugs if there’s no sign of a return on investment.
Data from previous experiments with targets or molecules has often been collected in a way that makes it hard to analyze. That means that the experiment results have been collected in narrative format, meaning unstructured data. Breakthroughs with large language models show that these models could effectively digest medical literature and unstructured data from previous experiments and help researchers to find information and results from previous research, identifying hidden insights and helping to design new experiments. One of the most promising applications for the new models is to generate protein amino acid sequences and three-dimensional structure from text prompts. Such a model could condition its generation on desired functional properties.
Much remains to be done in this area but generative AI can accelerate progress in this area and help us realize the much anticipated benefits of AI in drug discovery.