MIT researchers help robots follow voice commands

One of MIT CSAIL’s projects aims to give voice recognition a sense of context, drawing on episodic memory to help robots follow voice commands.


A CSAIL project is working to provide context to voice recognition to help robots follow voice commands – image courtesy of MIT.

Researchers at the Massachusetts Institute of Technology’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are working on new ways to employ robot memory in potential industrial scenarios.

In the academic white paper “Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context”, CSAIL authors presented an Amazon Alexa-like system dubbed “ComText” (for “commands in context”).

ComText is intended to enable robots to understand a wide range of voice commands requiring contextual knowledge about objects and their environments.

Robotics scientist Rohan Paul explained: “Where humans understand the world as a collection of objects and people and abstract concepts, machines view it as pixels, point-clouds and 3D maps generated from sensors.

“This semantic gap means that, for robots to understand what we want them to do, they need a much richer representation of what we do and say.”

Declarative memory — the recall of concepts, facts, dates — includes semantic memory (general facts) and episodic memory (personal facts). Most approaches to robot learning have focused only on semantic memory.

ComText is designed to observe a range of visuals and natural language to glean ‘episodic memory’ about an object’s size, shape, position, type, and even whether it belongs to someone; from this knowledge base, it can then reason, infer meaning, and respond to commands.
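
To make the distinction concrete, here is a minimal toy sketch of how the two kinds of memory could combine to resolve a command such as “pick up my snack”. It is not the ComText model from the paper; the class name `ToyMemory`, its methods, and the sample objects and owner are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    # Semantic memory: general facts, here a map from object type to category.
    semantic: dict = field(default_factory=dict)
    # Episodic memory: facts accrued from past observations and statements.
    episodic: list = field(default_factory=list)

    def observe(self, obj, **facts):
        """Record what was seen or said about one specific object."""
        self.episodic.append({"object": obj, **facts})

    def resolve(self, category, owner):
        """Return objects whose type is in `category` and that belong to `owner`."""
        matches = []
        for event in self.episodic:
            is_category = self.semantic.get(event.get("type")) == category
            if is_category and event.get("owner") == owner:
                matches.append(event["object"])
        return matches


# General knowledge (assumed): a cracker box is a snack, a wrench is a tool.
memory = ToyMemory(semantic={"cracker box": "snack", "wrench": "tool"})

# Episodic facts (assumed), e.g. after a user says "the box I put down is mine".
memory.observe("box_1", type="cracker box", owner="alice")
memory.observe("box_2", type="cracker box", owner=None)
memory.observe("wrench_1", type="wrench", owner="alice")

# "Pick up my snack", spoken by alice: combine both kinds of memory.
targets = memory.resolve(category="snack", owner="alice")
print(targets)
```

In this sketch, neither memory is enough on its own: the semantic map says which types count as snacks, while only the episodic record says which particular box belongs to the speaker.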

Research scientist Andrei Barbu added: “The main contribution is this idea that robots should have different kinds of memory, just like people.

“We have the first mathematical formulation to address this issue, and we’re exploring how these two types of memory play and work off of each other.”

The research has been described as an important step towards building robots that can interact much more naturally with people. In particular, it could enable robots to better understand the names used to identify objects, and to interpret instructions that use those names so they can better perform user-driven tasks.
