It is important for a robot to be able to interpret natural language commands given by a human. In this paper, we consider performing a sequence of mobile manipulation tasks with instructions described in natural language. Given a new environment, even a simple task such as boiling water would be performed quite differently depending on the presence, location and state of the objects. We start by collecting a dataset of task descriptions in free-form natural language and the corresponding grounded task-logs of the tasks performed in an online robot simulator. We then build a library of verb-environment instructions that represents the possible instructions for each verb in that environment; these may or may not be valid for a different environment and task context. We present a model that takes into account the variations in natural language and the ambiguities in grounding them to robotic instructions with appropriate environment context and task constraints. Our model also handles incomplete or noisy natural language instructions. It is based on an energy function that encodes such properties in a form isomorphic to a conditional random field. We evaluate our model on tasks given in a robotic simulator and show that it outperforms the state of the art, achieving 61.8% accuracy. We also demonstrate a grounded robotic instruction sequence on a PR2 robot using the Learning from Demonstration approach.