'Robotic Scientist' will run experiments too complex for humans
Lipson and graduate student Michael Schmidt have already demonstrated the system's ability to derive natural laws of motion from observations of a physical system. The new work focuses on biology, where there are often hundreds of interacting variables. "Many systems in biology are too complex to analyze manually," Schmidt said. "There may be new things we haven't found because they're ugly and complex, but to the computer they're obvious."
Unlike current drug tests that look for the drug itself or its breakdown products, the new approach will search for traces of previous use. Preliminary experiments suggest that drugs like alcohol and cocaine bring about changes in the metabolism of cells that might change the chemicals the cells secrete in response to certain stimuli. Detecting those secretions could make a test that's harder to fool, and information on past use could be valuable in choosing the best treatment for a drug abuser.
The quest for the new test is a collaboration among Cornell, Vanderbilt and Duke universities and the National Institute on Drug Abuse of the National Institutes of Health, which has provided $2.7 million in stimulus money from the American Recovery and Reinvestment Act (ARRA) to fund the project. It combines nanotechnology to isolate and manipulate a small number of immune-system cells called leukocytes, computer-controlled equipment to infuse the cells with various chemicals and analyze proteins and other materials they secrete in response, and Lipson and Schmidt's AI system to interpret the results of an experiment and direct the apparatus to conduct new experiments.
Vanderbilt scientists will feed leukocytes from the blood of rats and mice addicted to cocaine or alcohol into their analytical apparatus for comparison to "control" cells from non-addicted animals. A high-performance parallel computer at Cornell will remotely control the apparatus at Vanderbilt.
Given the results of the first, hand-operated experiment, the computer will randomly generate many sets of rules that might explain the relationship between the inputs and outputs. It will then run simulations using these rules to see if the results fit the data. The ones that come closest will be tweaked and run again, repeating until only the best remain. There will be several sets of rules because, Schmidt said, at the beginning there is very little data and many possible explanations for the results. So the computer will then evolve new experiments that create the most disagreement between predictions of competing candidate rules.
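The generate, test and tweak cycle described above can be illustrated with a toy sketch. This is not the Eureqa code: here a "rule" is assumed to be nothing more than a pair of numbers (a, b) standing for y = a*x + b, and the function names, population sizes and mutation scale are all hypothetical choices made for illustration.

```python
import random

def predict(rule, x):
    """Assumption: a candidate rule is a pair (a, b) meaning y = a*x + b."""
    a, b = rule
    return a * x + b

def error(rule, data):
    """Sum of squared differences between predictions and observations."""
    return sum((predict(rule, x) - y) ** 2 for x, y in data)

def mutate(rule, rng, scale=0.1):
    """Tweak a rule slightly by adding small Gaussian noise to each number."""
    a, b = rule
    return (a + rng.gauss(0, scale), b + rng.gauss(0, scale))

def evolve(data, rng, population_size=50, keep=10, generations=100):
    """Return the `keep` best-fitting rules after repeated selection and tweaking."""
    # Start from randomly generated rules.
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the rules that come closest to the data ...
        population.sort(key=lambda r: error(r, data))
        survivors = population[:keep]
        # ... and refill the population with tweaked copies of them.
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - keep)]
        population = survivors + children
    population.sort(key=lambda r: error(r, data))
    return population[:keep]

# Made-up observations that happen to follow y = 2x + 1.
observations = [(0, 1), (1, 3), (2, 5)]
best_rules = evolve(observations, random.Random(0))
```

Because several rules survive each round, the result is a small set of competing explanations rather than a single answer, which is what makes the next step (choosing experiments that separate them) possible.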
"We can add a certain nutrient, or a little more of this or less of that," Lipson explained. "New data will refute some of the models. Some models will die out, some will be supported and spawn off even better models. Processing the results of one experiment and sending back instructions for the next should take about two minutes. We might conduct hundreds of experiments, gradually zeroing in on the truth."
What should emerge at the end is a set of input conditions that produce a clear signature of exposure to a particular drug.
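The idea of choosing the experiment on which competing rules disagree most, and then letting new data eliminate some of them, can be sketched as follows. This is a simplified illustration under stated assumptions: the three candidate models, the list of possible inputs and the tolerance are invented, and the sketch simply checks every candidate input rather than evolving experiments as the researchers describe.

```python
from statistics import pvariance

def disagreement(models, x):
    """Spread (population variance) of the models' predictions for input x."""
    return pvariance([model(x) for model in models])

def pick_experiment(models, candidate_inputs):
    """Choose the input whose predicted outcome the models disagree on most."""
    return max(candidate_inputs, key=lambda x: disagreement(models, x))

def prune(models, x, observed, tolerance):
    """Drop models whose prediction for x is refuted by the observed result."""
    return [m for m in models if abs(m(x) - observed) <= tolerance]

# Three hypothetical competing explanations; all agree at x = 2.
models = [lambda x: 2 * x, lambda x: x ** 2, lambda x: x + 2]
next_input = pick_experiment(models, [0, 1, 2, 3, 4])

# Pretend the apparatus ran the experiment and the true behaviour was x squared.
observed = next_input ** 2
survivors = prune(models, next_input, observed, tolerance=0.5)
```

An input on which every model predicts the same outcome (x = 2 in this toy case) teaches the system nothing, so it is never chosen; the most informative experiment is the one that can refute the most candidates at once.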
The Eureqa software is freely available online. "We are looking for other collaborations where automated experimentation can be useful," Lipson said.