Abstract
Multimedia procedural texts, such as instructions and manuals with pictures, help people share how-to knowledge. In this paper, we propose a method for generating a procedural text from a photo sequence, allowing users to obtain a multimedia procedural text. We propose a single embedding space for both images and text, which makes it possible to interconnect them and to select appropriate words to describe a photo. We implemented our method and tested it on cooking instructions, i.e., recipes. Various experimental results showed that our method outperforms standard baselines.
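To make the idea of a shared image-text embedding space concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a photo and each candidate word have already been mapped to vectors in the same space, and it selects the words closest to the photo by cosine similarity. The function names, the toy vectors, and the use of cosine similarity are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_words(image_vec, word_vecs, k=2):
    """Return the k words whose embeddings are most similar to the image embedding."""
    ranked = sorted(
        word_vecs,
        key=lambda word: cosine(image_vec, word_vecs[word]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings; a real system would learn these from paired
# photos and recipe steps.
word_vecs = {
    "cut": [0.9, 0.1, 0.0],
    "onion": [0.8, 0.3, 0.1],
    "boil": [0.0, 0.2, 0.9],
}
photo_vec = [1.0, 0.2, 0.0]  # assumed embedding of a photo of an onion being cut

top_words = select_words(photo_vec, word_vecs, k=2)
print(top_words)
```

In a generation pipeline, words selected this way could be passed to a text decoder as hints for the step that describes the photo; how the paper actually uses them is not stated in the abstract.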
Cite
Nishimura, T., Hashimoto, A., & Mori, S. (2019). Procedural text generation from a photo sequence. In INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference (pp. 409–414). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w19-8650