A dish of neurons may have taught itself to play Pong (badly)
One of the more exciting developments in AI has been algorithms that can teach themselves the rules of a system. Early game-playing algorithms had to be given the basics of a game. But newer versions don't need that: they simply need a system that keeps track of some reward, like a score, and they can figure out which actions maximize it without a formal description of the game's rules.
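To make that idea concrete, here's a minimal sketch of reward-driven learning, in the spirit of (but far simpler than) the algorithms described above. The agent knows nothing about the "game" except the noisy score each action returns; the action values, step count, and noise level are all illustrative assumptions, not anything from the paper.

```python
import random

def train_agent(rewards, steps=2000, eps=0.1, seed=0):
    """Epsilon-greedy agent: learns which action maximizes a reward
    signal without ever seeing the rules that generate it."""
    rng = random.Random(seed)
    n = len(rewards)
    estimates = [0.0] * n   # running estimate of each action's payoff
    counts = [0] * n
    for _ in range(steps):
        # explore occasionally; otherwise exploit the best-known action
        if rng.random() < eps:
            a = rng.randrange(n)
        else:
            a = max(range(n), key=lambda i: estimates[i])
        r = rewards[a] + rng.gauss(0, 0.1)  # noisy score from the "game"
        counts[a] += 1
        estimates[a] += (r - estimates[a]) / counts[a]  # incremental mean
    return max(range(n), key=lambda i: estimates[i])

# Three hidden action values; the agent should discover that action 1 pays best.
best = train_agent([0.2, 0.8, 0.5])
```

The point is that the score is the only interface: swap in any environment that returns a number, and the same loop applies.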
A paper published in the journal Neuron takes this a step further by using actual neurons grown in a dish full of electrodes. This adds a level of complication, as there was no way to know what neurons would actually find rewarding. The fact that the system seems to have worked may tell us something about how neurons can self-organize their responses to the outside world.
Say hello to DishBrain
The researchers behind the new work, who were primarily based in Melbourne, Australia, call their system DishBrain. And it's based on, yes, a dish with a set of electrodes on the floor. When neurons are grown in the dish, these electrodes can do two things: sense the activity of the neurons above them or stimulate those neurons. The electrodes are large relative to the size of neurons, so both the sensing and the stimulation (which can be thought of as reading and writing information, respectively) involve a small population of neurons rather than a single one.
Beyond that, it's a standard culture dish, meaning a variety of cell types can be grown in it; for some control experiments, the researchers used cells that don't respond to electrical signals. For the main experiments, the researchers tested two types of neurons: some dissected from mouse embryos, and others produced by inducing human stem cells to form neurons. In both cases, as seen in other experiments, the neurons spontaneously formed connections with each other, creating networks with ongoing activity.
While the hardware is completely flexible, the researchers configured it as part of a closed-loop system with a computer controller. In this configuration, electrodes in a couple of regions of the dish were defined as providing output from the neurons to the computer; they're collectively termed the motor region, since they control the system's responses.
Another eight regions were designated to receive input in the form of stimulation by the electrodes, which act a bit like a sensory area of the brain. The computer could also use these electrodes to provide feedback to the system, which we’ll get into below.
Collectively, these provide everything necessary for a neural network to learn what's going on in the computer environment. The motor electrodes allow the neurons to alter the behavior of the environment, and the sensory ones receive both input on the state of the environment and a signal that indicates whether the network's actions were successful. The system is generic enough that all sorts of environments could be set up in the computer portion of the experiment: pretty much anything where simple inputs alter the environment.
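The loop described above can be sketched in code. Everything here is a hypothetical stand-in: `StubDish` fakes the multielectrode hardware, the encoding of the game state into stimulation is invented for illustration, and the real system's interfaces are surely far more involved.

```python
import random

class StubDish:
    """Stand-in for the multielectrode array. Real hardware would read
    spiking activity from the motor regions and deliver electrical
    stimulation to the sensory regions; this stub just fakes both."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.last_stim = 0.0

    def stimulate(self, pattern):
        # "write": deliver a stimulation pattern to the sensory regions
        self.last_stim = sum(pattern)

    def read_motor(self):
        # "read": activity levels of two motor regions (stubbed as
        # loosely tracking recent stimulation plus noise)
        return [self.last_stim + self.rng.random(), self.rng.random()]

def run_closed_loop(dish, steps=50):
    """One computer-side cycle: encode game state -> stimulate ->
    read motor activity -> move the paddle -> advance the game."""
    paddle, ball_y = 0.5, 0.5
    for _ in range(steps):
        # 1. encode the game state as a stimulation pattern (assumed encoding)
        dish.stimulate([ball_y, paddle])
        # 2. read the motor regions and turn their activity into a move
        up, down = dish.read_motor()
        paddle += 0.05 if up > down else -0.05
        paddle = min(1.0, max(0.0, paddle))
        # 3. advance the environment (toy ball physics)
        ball_y = (ball_y + 0.1) % 1.0
    return paddle

final = run_closed_loop(StubDish())
```

The design point is the closed loop itself: the neurons only ever see stimulation patterns and only ever act through the motor readout, so any environment that fits that interface could be dropped in.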
The researchers chose Pong.