Training a neural network isn’t easy. Even when a network is simple to implement, it can take hours to get it ready, even with plenty of computing power. OpenAI researchers may have a better solution: forgetting many of the usual rules. They’ve developed an evolution strategy (no, it doesn’t relate much to biological evolution) that promises more powerful AI systems. Rather than use standard reinforcement learning, they treat the problem as a “black box,” forgetting that an environment and a neural network are even involved. It’s all about optimizing a given function in isolation and sharing the results as necessary.
The system starts with random parameters, makes many guesses by tweaking them slightly, and then shifts its follow-up guesses to favor the more successful candidates, gradually homing in on a good answer. In black-box terms, a million numbers (the network’s parameters) may go in, but just one number (the total reward) comes out.
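To make the guess-and-tweak loop concrete, here is a minimal sketch in plain Python. It is not OpenAI's code: the function names, the toy reward, and every setting (population size, noise scale, learning rate) are assumptions chosen for illustration, and a real system would score each guess by running a neural network in an environment rather than calling a three-number toy function.

```python
import random

def evolution_strategy(f, dim, iterations=300, population=50,
                       sigma=0.1, learning_rate=0.01, seed=0):
    """Maximize f, treated as a black box, by nudging parameters toward
    the random perturbations that scored best. All defaults are assumed."""
    rng = random.Random(seed)
    # Start from random parameters.
    theta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        # Make a batch of guesses: Gaussian perturbations of theta.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(population)]
        rewards = [f([t + sigma * n for t, n in zip(theta, noise)])
                   for noise in noises]
        # Standardize rewards so the step does not depend on reward scale.
        mean = sum(rewards) / population
        std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5
        if std == 0.0:
            continue  # every guess scored the same; nothing to learn
        weights = [(r - mean) / std for r in rewards]
        # Shift theta toward perturbations with above-average reward.
        for i in range(dim):
            step = sum(w * noise[i] for w, noise in zip(weights, noises))
            theta[i] += learning_rate / (population * sigma) * step
    return theta

TARGET = [0.5, 0.1, -0.3]  # hypothetical "ideal" parameters

def toy_reward(params):
    # Stand-in for total reward: higher is better, best at TARGET.
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

solution = evolution_strategy(toy_reward, dim=3)
print(solution)  # should land close to TARGET
```

Note that the loop never looks inside `toy_reward`. It only sees one score per guess, which is what makes the approach a black box.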
It sounds a bit mysterious, but the benefits are easy to understand. The technique eliminates a lot of the traditional cruft in training neural networks, making the code both easier to implement and roughly two to three times faster. And because “workers” in this scheme only need to share tiny bits of data with each other, the method scales gracefully as you throw more processor cores at a problem. In tests, a large system with 1,440 cores could train a simulated humanoid to walk in 10 minutes versus 10 hours for a typical setup, and even a “lowly” 720-core system could do in 1 hour what a 32-core system would take a full day to accomplish.
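One way workers can get away with sharing so little is to exchange only a random seed and a score, then regenerate each other's guesses locally from the seed. The sketch below illustrates that idea in plain Python under assumed names and settings (`DIM`, `SIGMA`, `worker_evaluate`, `apply_update` are all hypothetical); it runs the "workers" one after another in a single process rather than across real machines.

```python
import random

DIM = 1000    # hypothetical parameter count
SIGMA = 0.1   # hypothetical noise scale

def noise_from_seed(seed):
    # Anyone who knows the seed can regenerate the same noise vector.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]

def worker_evaluate(theta, seed, reward_fn):
    """One worker's job: perturb, score, and report just (seed, reward)."""
    noise = noise_from_seed(seed)
    candidate = [t + SIGMA * n for t, n in zip(theta, noise)]
    return (seed, reward_fn(candidate))

def apply_update(theta, messages, learning_rate=0.01):
    """Rebuild every guess from its seed and apply the same update."""
    rewards = [reward for _, reward in messages]
    mean = sum(rewards) / len(rewards)
    # Fall back to 1.0 if all rewards are identical (std of zero).
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    new_theta = list(theta)
    for seed, reward in messages:
        scale = learning_rate / (len(messages) * SIGMA) * (reward - mean) / std
        for i, n in enumerate(noise_from_seed(seed)):
            new_theta[i] += scale * n
    return new_theta

def toy_reward(params):
    # Stand-in for total reward: best when every parameter equals 1.
    return -sum((p - 1.0) ** 2 for p in params)

theta = [0.0] * DIM
# Each message is two numbers, no matter how large DIM is.
messages = [worker_evaluate(theta, seed, toy_reward) for seed in range(8)]
# Two workers applying the same messages arrive at identical parameters.
theta_on_worker_a = apply_update(theta, messages)
theta_on_worker_b = apply_update(theta, messages)
print(theta_on_worker_a == theta_on_worker_b)
```

In this sketch each worker sends two numbers per guess instead of a thousand parameters, so the communication cost stays flat as the network grows.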
There’s a long way to go before you see the black box approach used in real-world AI. However, the practical implications are clear: neural network operators could spend more time actually using their systems instead of training them. And as computers get ever faster, it becomes more likely that this kind of learning could effectively happen in real time. You could eventually see robots that are very quick to adapt to new tasks and learn from mistakes.