Researchers from the University of California, Berkeley have developed a robotic learning technology that enables robots to imagine the future consequences of their actions.
Using this technology, robots can reportedly work out how to manipulate objects they have never encountered before.
In the future, this technology could help self-driving cars anticipate events on the road and produce more intelligent robotic assistants in homes. The initial prototype, however, focuses on learning simple manual skills entirely from autonomous play.
Using this technology, called visual foresight, the robots can predict what their cameras will see if they perform a sequence of movements.
These robotic imaginations are still relatively simple, with predictions reaching only several seconds into the future, but they are enough for the robot to figure out how to move objects around on a table without disturbing obstacles.
Crucially, the robot can learn to perform these tasks without any help from humans or prior knowledge about physics, its environment or what the objects are.
That’s because the visual imagination is learned entirely from scratch from unattended and unsupervised exploration, where the robot plays with objects on a table.
After this play phase, the robot builds a predictive model of the world, and can use this model to manipulate new objects that it has not seen before.
Sergey Levine, assistant professor in Berkeley’s Department of Electrical Engineering and Computer Sciences, said: “In the same way that we can imagine how our actions will move the objects in our environment, this method can enable a robot to visualise how different behaviours will affect the world around it. This can enable intelligent planning of highly flexible skills in complex real-world situations.”
At the core of this system is a deep learning technology based on convolutional recurrent video prediction, or dynamic neural advection (DNA). DNA-based models predict how pixels in an image will move from one frame to the next based on the robot’s actions.
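The idea of predicting where pixels move can be illustrated with a small sketch. It is not the researchers' implementation: in a real DNA-style model a neural network would produce the per-pixel motion weights from the current image and the robot's action, whereas here the weights are written by hand, and the names `advect` and `shift_right_kernels` are invented for this example.

```python
import numpy as np

def advect(frame, kernels):
    """Predict the next frame by moving pixels of the current one.

    frame:   (H, W) greyscale image.
    kernels: (H, W, K, K) weights; for each output pixel, a distribution
             over the K x K neighbourhood of the previous frame.
    """
    H, W = frame.shape
    K = kernels.shape[2]
    r = K // 2
    padded = np.pad(frame, r, mode="edge")
    nxt = np.zeros_like(frame, dtype=float)
    # Each output pixel is the weighted average of nearby input pixels.
    for dy in range(K):
        for dx in range(K):
            nxt += kernels[:, :, dy, dx] * padded[dy:dy + H, dx:dx + W]
    return nxt

def shift_right_kernels(H, W, K=3):
    # Hand-written stand-in for a network output: every pixel takes
    # its value from its left neighbour, so the image shifts right.
    kernels = np.zeros((H, W, K, K))
    kernels[:, :, K // 2, K // 2 - 1] = 1.0
    return kernels

frame = np.zeros((5, 5))
frame[2, 1] = 1.0                      # a single bright "object" pixel
next_frame = advect(frame, shift_right_kernels(5, 5))
```

In this toy case the bright pixel moves one column to the right, which is the kind of motion a learned model would have to infer from the robot's push.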
Recent improvements to this class of models, as well as greatly improved planning capabilities, have enabled robotic control based on video prediction to perform increasingly complex tasks, such as sliding toys around obstacles and repositioning multiple objects.
With the new technology, a robot pushes objects on a table, then uses the learned prediction model to choose motions that will move an object to a desired location.
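Choosing motions with a prediction model can be sketched as a simple search: sample many candidate action sequences, predict where the object ends up for each, and keep the sequence that lands closest to the goal. The sketch below assumes a plain random-sampling planner and a hypothetical `toy_model` in which the object simply moves by the commanded push; the real system predicts camera images rather than coordinates, and the article does not describe its planner in detail.

```python
import numpy as np

def toy_model(position, action):
    # Hypothetical stand-in for the learned predictor:
    # the pushed object moves by exactly the commanded amount.
    return position + action

def plan(start, goal, horizon=5, n_samples=500, seed=0):
    rng = np.random.default_rng(seed)
    # Candidate action sequences: n_samples plans of `horizon` 2-D pushes.
    candidates = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, 2))
    positions = np.tile(np.asarray(start, dtype=float), (n_samples, 1))
    # Roll every candidate plan forward through the model.
    for t in range(horizon):
        positions = toy_model(positions, candidates[:, t])
    # Score each plan by how far the object ends up from the goal.
    costs = np.linalg.norm(positions - np.asarray(goal, dtype=float), axis=1)
    best = int(np.argmin(costs))
    return candidates[best], costs[best]

actions, cost = plan(start=(0.0, 0.0), goal=(2.0, 1.0))
```

A robot using this scheme would typically execute only the first action of the best plan, observe the result, and plan again.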
The Berkeley scientists are continuing to research control through video prediction, focusing on further improving video prediction and prediction-based control.
They are also developing more sophisticated methods by which robots can collect more focused video data for complex tasks such as picking and placing objects, manipulating soft and deformable objects such as cloth or rope, and assembly.