Multiple ChatGPT instances combine to figure out chemistry

The lab’s empty because everyone’s relaxing in the park while the AI does their work.

Despite rapid advances in artificial intelligence, AIs are nowhere close to being ready to replace humans at doing science. But that doesn’t mean they can’t help automate some of the drudgery out of the daily grind of scientific experimentation. For example, a few years back, researchers put an AI in control of automated lab equipment and taught it to exhaustively catalog all the reactions that can occur among a set of starting materials.

While useful, that still required a lot of researcher intervention to train the system in the first place. A group at Carnegie Mellon University has now figured out how to get an AI system to teach itself to do chemistry. The system requires a set of three AI instances, each specialized for different operations. But once the system is set up and supplied with raw materials, you just have to tell it what type of reaction you want done, and it’ll figure out the rest.

An AI trinity

The researchers indicate that they were interested in understanding what capacities large language models (LLMs) can bring to the scientific endeavor. So all of the AI systems used in this work are LLMs, mostly GPT-3.5 and GPT-4, although some others—Claude 1.3 and Falcon-40B-Instruct—were tested as well. (GPT-4 and Claude 1.3 performed the best.) But rather than using a single system to handle all aspects of the chemistry, the researchers set up distinct instances to cooperate in a division of labor, calling the combined system “Coscientist.”

The three systems they used are:

Web searcher. This has two main capabilities. One is to use Google’s search API to find pages that might be worth ingesting for the information they contain. The second is to ingest those pages and extract information from them—think of that as similar to the context of the earlier portions of a conversation that ChatGPT can maintain to inform its later answers. The researchers could track where this module was spending its time, and about half the places it visited were Wikipedia pages. The top five sites it visited included the journals published by the American Chemical Society and the Royal Society of Chemistry.
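
The paper’s search module isn’t reproduced here, but the loop it describes is straightforward: query a search API, fetch the promising pages, and boil them down to text the model can hold in context. Below is a minimal sketch of that idea in Python; the Google Custom Search parameters are real, but the llm() helper is a purely hypothetical stand-in for whatever GPT-4 prompt the actual module uses.

    import requests
    from bs4 import BeautifulSoup

    def llm(prompt: str) -> str:
        # Hypothetical stand-in for a call to GPT-4 or a similar model.
        raise NotImplementedError

    def web_search(query: str, api_key: str, cse_id: str, n: int = 3) -> list[str]:
        # Google's Custom Search JSON API returns candidate URLs for a query.
        resp = requests.get(
            "https://www.googleapis.com/customsearch/v1",
            params={"key": api_key, "cx": cse_id, "q": query, "num": n},
            timeout=30,
        )
        resp.raise_for_status()
        return [item["link"] for item in resp.json().get("items", [])]

    def ingest(url: str) -> str:
        # Fetch a page, strip it to plain text, and have the model summarize the
        # chemistry-relevant parts so they fit into later prompts.
        html = requests.get(url, timeout=30).text
        text = BeautifulSoup(html, "html.parser").get_text(separator=" ", strip=True)
        return llm("Summarize the chemistry-relevant details of this page:\n" + text[:20000])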

Documentation searcher. Think of this as the RTFM instance. The AI was going to be given control of various lab automation equipment, like robotic fluid handlers and such, often controlled via either specialized commands or something like a Python API. This AI instance was given access to all the manuals for this equipment, allowing it to figure out how to control it.
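
How the documentation searcher indexes those manuals isn’t spelled out here, so the sketch below uses deliberately crude keyword-overlap scoring over manual chunks as a stand-in; treat the function names and the chunking scheme as assumptions, not the paper’s approach.

    def split_manual(manual_text: str, chunk_size: int = 1200) -> list[str]:
        # Break a long equipment manual into overlapping chunks small enough to
        # hand to an LLM alongside a question.
        step = chunk_size // 2
        return [manual_text[i:i + chunk_size] for i in range(0, len(manual_text), step)]

    def find_relevant_docs(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
        # Score each chunk by how many query words it shares; the real system
        # presumably does something smarter, but the shape of the task is the same.
        query_terms = set(query.lower().split())
        overlap = lambda chunk: len(query_terms & set(chunk.lower().split()))
        return sorted(chunks, key=overlap, reverse=True)[:top_k]

The best-matching chunks can then be handed to the planner so it can compose a valid command for the hardware.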

Planner. The planner can issue commands to both of the other two AI instances and process their responses. It has access to a Python sandbox to execute code, allowing it to perform calculations. It also has access to the automated lab equipment, allowing it to actually perform and analyze experiments. So you can think of the planner as the portion of the system that has to act like a chemist, learning from the literature and attempting to use equipment to implement what it has learned.

The planner is also able to determine when software errors occur (either in its Python scripts or in its attempts to control the automated hardware), allowing it to correct its mistakes.
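
In other words, the planner acts as a dispatch loop: decide whether the next step needs a literature search, a documentation lookup, a calculation, or a hardware action, then route the request to the right module and fold the result back into its context. The sketch below is one way such a loop could be wired up; llm(), web_searcher(), doc_searcher(), and run_on_hardware() are hypothetical placeholders, not interfaces from the paper.

    import contextlib
    import io

    # Hypothetical placeholders for the modules described above.
    def llm(prompt: str) -> str: raise NotImplementedError
    def web_searcher(query: str) -> str: raise NotImplementedError
    def doc_searcher(query: str) -> str: raise NotImplementedError
    def run_on_hardware(command: str) -> str: raise NotImplementedError

    def run_python(code: str) -> str:
        # Execute planner-generated Python and capture its output; a production
        # sandbox would be far more isolated than this.
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, {}, {})
            return buffer.getvalue()
        except Exception as exc:
            return f"ERROR: {exc!r}"  # Fed back so the planner can fix its own code.

    def planner_step(goal: str, history: list[str]) -> str:
        # Ask the planner model what to do next, then route the request.
        decision = llm(
            f"Goal: {goal}\nHistory so far: {history}\n"
            "Reply with one of SEARCH/DOCS/PYTHON/HARDWARE, a colon, then the payload."
        )
        action, _, payload = decision.partition(":")
        dispatch = {
            "SEARCH": web_searcher,
            "DOCS": doc_searcher,
            "PYTHON": run_python,
            "HARDWARE": run_on_hardware,
        }
        return dispatch.get(action.strip(), lambda _: "Unrecognized action")(payload.strip())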

Putting the system to use

Initially, the system was asked to synthesize a number of chemicals such as acetaminophen and ibuprofen, confirming that it could generally figure out a viable synthesis after searching the web and scientific literature. So the question was whether the system could figure out the hardware it had access to well enough to put that conceptual ability to work.

To start with something simple, the researchers used a standard sample plate, which holds a bunch of small wells arranged in a rectangular grid. The system was asked to fill in squares, diagonal stripes, or other patterns using various colored liquids and managed to do so effectively.
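
Drawing a pattern like that reduces to picking the right well coordinates and issuing transfer commands to the liquid handler. Here’s a sketch for a diagonal stripe on a standard 96-well plate (rows A–H, columns 1–12); the transfer() call is a hypothetical placeholder, since the article doesn’t specify the robot’s actual API.

    import string

    ROWS = string.ascii_uppercase[:8]   # rows A-H on a 96-well plate
    COLS = range(1, 13)                 # columns 1-12

    def diagonal_wells() -> list[str]:
        # Wells along the main diagonal: A1, B2, ..., H8.
        return [f"{row}{col}" for row, col in zip(ROWS, COLS)]

    def transfer(source: str, destination_well: str, volume_ul: float) -> None:
        # Hypothetical liquid-handler command; a real robot exposes its own API.
        print(f"TRANSFER {volume_ul} uL from {source} to {destination_well}")

    for well in diagonal_wells():
        transfer("red_dye_reservoir", well, 50.0)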

Moving on from that, they placed three different colored solutions at random locations in the grid of wells; the system was asked to identify which wells were what color. On its own, Coscientist didn’t know how to do this. But when given a prompt that reminded it that the different colors would show different absorption spectra, it used a spectrograph it had access to and was able to identify the different colors.
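
The physics the prompt nudged it toward is basic: a colored solution absorbs most strongly in the complementary region of the visible spectrum, so the wavelength where absorbance peaks identifies the dye. The sketch below classifies wells from made-up readings; the wavelength cutoffs and numbers are illustrative assumptions, not data from the paper.

    def classify_color(spectrum: dict[int, float]) -> str:
        # Find the wavelength (in nm) of maximum absorbance; the solution appears
        # as the complement of the light it removes.
        peak = max(spectrum, key=spectrum.get)
        if peak < 480:
            return "yellow"   # absorbs violet/blue light
        if peak < 580:
            return "red"      # absorbs green light
        return "blue"         # absorbs orange/red light

    # Illustrative absorbance readings at a few wavelengths for three wells.
    readings = {
        "B3": {430: 0.90, 520: 0.10, 630: 0.05},
        "D7": {430: 0.10, 520: 0.80, 630: 0.10},
        "F1": {430: 0.05, 520: 0.20, 630: 0.70},
    }
    for well, spectrum in readings.items():
        print(well, classify_color(spectrum))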

With the basic command and control seemingly functioning, the researchers decided to try some chemistry. They provided a sample plate with wells filled with simple chemicals, catalysts, and the like, and asked it to perform a specific chemical reaction. Coscientist got the chemistry right from the start, but its attempts to run the synthesis failed because it sent an invalid command to the hardware that heats and stirs the reactions. That sent it back to the documentation searcher, allowing it to correct the problem and run the reactions.
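
That recovery behavior amounts to a retry loop: send the command, and if the hardware rejects it, hand the error message plus the relevant manual excerpt back to the model and ask for a corrected command. A sketch of the pattern, with llm(), doc_searcher(), and the device’s send() method all standing in as hypothetical placeholders:

    def llm(prompt: str) -> str: raise NotImplementedError          # hypothetical model call
    def doc_searcher(query: str) -> str: raise NotImplementedError  # hypothetical docs lookup

    class HardwareError(Exception):
        pass

    def send_with_retry(device, command: str, max_attempts: int = 3) -> str:
        for _ in range(max_attempts):
            try:
                return device.send(command)  # hypothetical device interface
            except HardwareError as err:
                # Pull the relevant manual section and ask the model for a fixed command.
                docs = doc_searcher(f"correct syntax for: {command}")
                command = llm(
                    f"The command {command!r} failed with error: {err}\n"
                    f"Relevant documentation:\n{docs}\n"
                    "Reply with a corrected command only."
                )
        raise RuntimeError("Could not produce a command the hardware accepts")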

And it worked. Spectral signatures of the desired products were present in the reaction mixture, and their presence was confirmed by chromatography.

Optimization

With basic reactions working, the researchers then asked the system to improve the efficiency of the reaction—they presented the optimization process as a game where the score would go up with the reaction’s yield.

The system made some bad guesses in the first round of test reactions but quickly zeroed in on better yields. The researchers also found that they could avoid the bad choices in the first round by providing Coscientist with information about the yields generated by a handful of random starting mixtures. This implies that it doesn’t matter where Coscientist gets its information—either from reactions it runs or from some external information source—it is able to incorporate the information into its planning.
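
Framed as a game, the loop is easy to picture: keep a running table of conditions and measured yields, show it to the planner, and ask what to try next. The sketch below is one possible structure; llm() and run_reaction() are hypothetical placeholders, and seeding the history with a few random runs mirrors the trick the researchers used to avoid bad first-round guesses.

    import json

    def llm(prompt: str) -> str: raise NotImplementedError              # hypothetical model call
    def run_reaction(conditions: dict) -> float: raise NotImplementedError  # returns measured yield

    def optimize(seed_history: list[tuple[dict, float]], rounds: int = 10) -> tuple[dict, float]:
        # seed_history can hold yields from random starting mixtures or from
        # reactions the system already ran; the planner treats both the same way.
        history = list(seed_history)
        for _ in range(rounds):
            table = "\n".join(f"{json.dumps(c)} -> yield {y:.1%}" for c, y in history)
            proposal = llm(
                "You are playing a game where your score is the reaction yield.\n"
                f"Results so far:\n{table}\n"
                "Reply with a JSON object of reaction conditions to try next."
            )
            conditions = json.loads(proposal)
            history.append((conditions, run_reaction(conditions)))
        return max(history, key=lambda pair: pair[1])  # best conditions found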

The researchers conclude that Coscientist has several notable capabilities:

  • Planning chemical synthesis using public information
  • Navigating and processing technical manuals for complicated hardware
  • Using that knowledge to control a range of laboratory equipment
  • Integrating these hardware-handling capabilities into a lab workflow
  • Analyzing its own reactions and using that information to design improved reaction conditions

In a lot of ways, this sounds like the experience a student might have in the first year of graduate school. Ideally, the grad student will progress beyond that. But maybe GPT-5 will be able to as well.

More seriously, the structure of Coscientist, which relies on the interaction of a number of specialized systems, is similar to how brains operate. Obviously, the brain’s specialized systems are capable of a much wider range of activities, and there are a lot more of them. But it may be that this sort of structure is critical for enabling more complicated behavior.

That said, the researchers themselves are concerned about some of Coscientist’s capabilities. There are a lot of chemicals (think nerve gases) that we don’t want to see made easier to synthesize. And figuring out how to tell GPT instances not to do something remains an ongoing challenge.

Nature, 2023. DOI: 10.1038/s41586-023-06792-0

https://arstechnica.com/?p=1992320