When mapping a natural language instruction to a sequence of actions, it is often useful to identify sub-tasks in the instruction. Such a sub-task segmentation, however, is not necessarily provided in the training data. We present the A2LCTC (Action-to-Language Connectionist Temporal Classification) algorithm, which automatically discovers a sub-task segmentation of an action sequence. A2LCTC does not require annotations of the correct sub-task segments; it learns to find them from pairs of instructions and action sequences in a weakly supervised manner. We experiment with the ALFRED dataset and show that A2LCTC accurately finds the sub-task structure. Using the discovered sub-task segments, we also train agents on the downstream task and show empirically that our algorithm improves their performance.
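The abstract does not describe how A2LCTC computes its alignments, so the following is only a minimal sketch of the general idea behind CTC-style monotonic alignment, not the authors' algorithm. It assumes a hypothetical score matrix `scores[t][k]` (a log-score for assigning action `t` to sub-task `k`), assumes sub-tasks occur in instruction order with at least one action each, and takes the single best path, whereas CTC training sums over all monotonic paths. The function name and the toy scores are illustrative assumptions.

```python
from typing import List

NEG_INF = float("-inf")


def best_monotonic_segmentation(scores: List[List[float]]) -> List[int]:
    """Return one sub-task index per action under a monotonic alignment.

    scores[t][k] is an assumed log-score for putting action t in sub-task k.
    The result is non-decreasing, starts at 0, ends at K-1, and uses every
    sub-task at least once.
    """
    num_actions = len(scores)
    num_subtasks = len(scores[0])
    if num_actions < num_subtasks:
        raise ValueError("need at least one action per sub-task")

    # best[t][k]: best total score of a path ending with action t in sub-task k.
    best = [[NEG_INF] * num_subtasks for _ in range(num_actions)]
    # back[t][k]: sub-task of action t-1 on that best path.
    back = [[0] * num_subtasks for _ in range(num_actions)]
    best[0][0] = scores[0][0]  # the first action must open the first sub-task

    for t in range(1, num_actions):
        for k in range(num_subtasks):
            stay = best[t - 1][k]  # remain in the same sub-task
            advance = best[t - 1][k - 1] if k > 0 else NEG_INF  # start the next one
            if advance > stay:
                best[t][k] = advance + scores[t][k]
                back[t][k] = k - 1
            else:
                best[t][k] = stay + scores[t][k]
                back[t][k] = k

    # Trace back from the last action, which must close the last sub-task.
    labels = [0] * num_actions
    k = num_subtasks - 1
    for t in range(num_actions - 1, -1, -1):
        labels[t] = k
        k = back[t][k]
    return labels


# Toy example: five actions, two sub-tasks; the scores favour a boundary after action 1.
toy_scores = [[0.0, -5.0], [0.0, -5.0], [-5.0, 0.0], [-5.0, 0.0], [-5.0, 0.0]]
print(best_monotonic_segmentation(toy_scores))  # [0, 0, 1, 1, 1]
```

In a weakly supervised setting such scores would come from a learned model rather than being given, and the segment boundaries are read off wherever the label index increases.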
Ri, R., Hou, Y., Marinescu, R., & Kishimoto, A. (2022). Finding Sub-task Structure with Natural Language Instruction. In LNLS 2022 - 1st Workshop on Learning with Natural Language Supervision, Proceedings of the Workshop (pp. 1–9). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.lnls-1.1