Chemistry and Materials Science Seminar: Language Models and Their Impact on Chemistry

Dr. Mayk Caldas
Postdoctoral researcher, White Lab Research Group

Chemical Engineering Department, University of Rochester

Large language models (LLMs), deep neural networks with billions of parameters, are changing how scientific research is done, particularly in chemistry. Research has traditionally followed a trial-and-error strategy that demands extensive time and resources: the drug development process, for instance, requires an investment of roughly 15 years and $1 billion. The space of possible chemical compounds is far too vast to explore this way, which reveals the need for a more efficient strategy. Data-driven methods offer one route to accelerating discovery in chemistry. LLMs go a step further by requiring only natural language as input, letting researchers interact with complex data in plain language and simplifying and speeding up research processes. Their applications extend to property prediction, molecule optimization, and efficient knowledge retrieval, demonstrating that language can effectively represent chemical data.

Our research shows that language models can predict blood-brain barrier permeation and solubility together with uncertainty estimates. We deployed this model in an open web application to improve usability and reproducibility. In a second line of work, we show how natural language descriptions of chemical procedures can be used to optimize their outcomes. These examples illustrate how LLMs are not only reshaping the approach to chemical research but also significantly reducing the time and resources required for scientific breakthroughs. Our vision is that LLMs represent a significant step forward in how science is done and that language is the future of chemical representation.

Intended Audience:
Undergraduates, graduate students, experts, and anyone with an interest in the topic.

To request an interpreter, please visit

Nathan Eddingsaas
Event Snapshot
When and Where
November 21, 2023
12:30 pm - 1:45 pm
Room/Location: 2305

This is an RIT Only Event
