Sega golden oldie repackaged as a research testbed
Reinforcement learning is an area of machine learning that tries to teach an agent specific behaviours in a fixed environment. The agent is programmed to explore its environment and experiment with different actions, and whenever it makes a good move it is rewarded. Since it’s been programmed to try to maximise its score, the idea is that it should improve and learn how to complete a specific task over time.
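To make that loop concrete, here is a minimal sketch of tabular Q-learning, one classic reinforcement learning method, on a toy problem. Everything in it is an assumption made for illustration: the six-cell corridor, the `step`, `train` and `greedy_policy` helpers, and the hyperparameters are hypothetical and have nothing to do with the Sonic contest environments.

```python
import random

# Hypothetical toy environment: a corridor of N_STATES cells. The agent starts
# in cell 0 and is rewarded only when it reaches the rightmost cell.
N_STATES = 6
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: try a random action.
                a = rng.randrange(len(ACTIONS))
            else:
                # Exploit: take the best-known action, breaking ties randomly.
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate towards reward plus discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal cell."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent starts out wandering at random, but because each reward is propagated back through the table, the learned policy ends up stepping right from every cell. Agents that play video games typically swap the table for a neural network, since the number of possible screens is far too large to enumerate.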
Reinforcement learning is often explored in old-school video games like Doom, Pac-Man, or Q*bert. Now, it’s time to bring Sonic the Hedgehog back.
OpenAI has released Gym Retro, a new platform made up of 58 specific scenarios or “save states” from the games: Sonic the Hedgehog, Sonic the Hedgehog 2, and Sonic 3 & Knuckles.
These mini challenges are designed to test each competitor’s RL algorithm on levels it has never seen before. During the training process any environments or datasets can be used, but for the testing phase the agent is only allowed 18 hours on each level.
For humans, 18 hours in one go is certainly too long to spend on any level in Sonic the Hedgehog, but machines need the extra time because the task is actually pretty difficult for them.
Transfer learning is very challenging for neural networks, Julian Togelius, an associate professor at New York University researching AI and games who helped advise OpenAI on this project, explained to The Register.
“When we play games, we pick things up really fast. But neural networks learn very brittle representations, and essentially overfit to specific scenarios. So they aren’t really good at generalising to new ones.”
Researchers have previously used the Arcade Learning Environment (ALE), a platform based on Atari 2600 games, to benchmark progress. Togelius said the new Gym Retro environment is a “big step forward.”
ALE is a very limited environment. The neural networks used to play the games hold far more data in memory, in the form of their parameters, than the games themselves contain. That means the networks can essentially just memorise how to play them.
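A rough back-of-the-envelope calculation shows the scale of that mismatch. The sketch below makes two assumptions that do not come from Togelius: the network is laid out like the widely used DQN convolutional architecture for Atari (four stacked 84x84 frames in, three convolutional layers, one 512-unit hidden layer, 18 actions out), and the game is a 4 KB cartridge, a common size for the Atari 2600. The helper names are made up for this example.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases for a square convolutional layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus biases for a fully connected layer."""
    return n_in * n_out + n_out


def dqn_param_count(n_actions=18):
    """Parameter count for the assumed DQN-style network."""
    return sum([
        conv_params(8, 4, 32),          # 8x8 filters over 4 stacked frames
        conv_params(4, 32, 64),         # 4x4 filters
        conv_params(3, 64, 64),         # 3x3 filters
        dense_params(7 * 7 * 64, 512),  # flattened 7x7x64 feature map
        dense_params(512, n_actions),   # one output per joystick action
    ])


ROM_BYTES = 4 * 1024   # assumption: a 4 KB Atari 2600 cartridge
BYTES_PER_PARAM = 4    # assumption: parameters stored as 32-bit floats

if __name__ == "__main__":
    params = dqn_param_count()
    network_bytes = params * BYTES_PER_PARAM
    print(f"parameters:    {params:,}")
    print(f"network bytes: {network_bytes:,}")
    print(f"ratio to ROM:  {network_bytes / ROM_BYTES:,.0f}x")
```

Under those assumptions the network holds roughly 1.7 million parameters, which is more than a thousand times the size of the cartridge it is learning to play.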
Gym Retro is a little more challenging but there is still a long way to go, Togelius said. Although the agents are being challenged, they’re still only learning behaviours that are really applicable to the same game. Also, the Sonic the Hedgehog games don’t have that many levels to start with, and the training and testing levels are pretty similar to one another.
John Schulman, a researcher at OpenAI, agreed that AI can’t transfer between different games yet.
“Transfer learning for RL is still in its infancy. Transferring between different games is still a bit too hard, but transferring between different levels is about right for the current state of the art. Like Super Mario, Sonic includes many levels, including a variety of visuals and physics,” he told El Reg.
The competition runs from 5 April to 5 June. Researchers at OpenAI have released some baseline results they achieved using a range of algorithms to describe the benchmarks in more detail.
“We’re interested to see what works the best. We hope to see everything from handcrafted bots to model-based RL to learning from demonstrations to DQN-like algorithms,” Schulman said.
You can register for the competition here. ®