Situated mapping of sequential instructions to actions with single-step reward observation

Alane Suhr; Yoav Artzi

Conference Proceedings, Open Access

ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (2018) 1 2072-2082

DOI: 10.18653/v1/p18-1193

23 Citations

133 Readers

Abstract

We propose a learning approach for mapping context-dependent sequential instructions to actions. We address the problem of discourse and state dependencies with an attention-based model that considers both the history of the interaction and the state of the world. To train from start and goal states without access to demonstrations, we propose SESTRA, a learning algorithm that takes advantage of single-step reward observations and immediate expected reward maximization. We evaluate on the SCONE domains, and show absolute accuracy improvements of 9.8%-25.3% across the domains over approaches that use high-level logical representations.
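To make "immediate expected reward maximization" concrete, the following is a minimal sketch, not the authors' SESTRA implementation. It assumes a single state, a softmax policy over three hypothetical actions, and made-up reward values that are observed for every action at that state (the single-step reward observation). All names and numbers here are illustrative assumptions.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of action scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_reward(logits, rewards):
    # Immediate expected reward: sum over actions of pi(a) * r(a).
    probs = softmax(logits)
    return sum(p * r for p, r in zip(probs, rewards))

def ascent_step(logits, rewards, lr=0.5):
    # Exact gradient of the expected reward w.r.t. each logit:
    # d/d logit_k = pi(k) * (r(k) - expected reward).
    probs = softmax(logits)
    baseline = sum(p * r for p, r in zip(probs, rewards))
    return [l + lr * p * (r - baseline)
            for l, p, r in zip(logits, probs, rewards)]

# Hypothetical single-step rewards for three actions at one state.
rewards = [-1.0, 1.0, -0.5]
logits = [0.0, 0.0, 0.0]  # uniform initial policy

before = expected_reward(logits, rewards)
for _ in range(50):
    logits = ascent_step(logits, rewards)
after = expected_reward(logits, rewards)
probs = softmax(logits)
```

Because the reward of every action is observed at the state, the update uses the full expectation rather than a sampled return, which is the contextual-bandit flavor the abstract alludes to. The paper's actual model conditions on the instruction, the interaction history, and the world state, none of which this toy includes.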

Cite

Suhr, A., & Artzi, Y. (2018). Situated mapping of sequential instructions to actions with single-step reward observation. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (Vol. 1, pp. 2072–2082). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-1193
