Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows

Abstract

We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the visual result of each cooking action in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. Each state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-modal relation. With our dataset, one can explore a range of applications, from multimodal commonsense reasoning to procedural text generation.
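To make the two annotation layers described above concrete, the following is a minimal Python sketch of how such records might be organised. It is an illustration only: all class and field names (ImagePair, FlowArc, Recipe, before, after, and so on) are hypothetical and not taken from the released dataset, and attaching each image pair to an arc is an assumption about the grounding granularity.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class ImagePair:
        """Visual state change of an object: images before and after a cooking action."""
        before: str  # path to the pre-action image (hypothetical field)
        after: str   # path to the post-action image (hypothetical field)

    @dataclass
    class FlowArc:
        """One directed arc in the recipe flow graph (r-FG)."""
        src: int                 # index of the source recipe step
        dst: int                 # index of the destination recipe step
        label: str               # dependency label between the two steps
        state_change: ImagePair  # image pair grounded in this arc (cross-modal relation)

    @dataclass
    class Recipe:
        steps: List[str]    # the recipe text, one instruction per step
        arcs: List[FlowArc] # the workflow of the recipe as an r-FG

Under this sketch, a retrieval or generation model would read the steps and arcs for the textual workflow and use the grounded image pairs as visual supervision for action results.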

Citation (APA)

Shirai, K., Hashimoto, A., Nishimura, T., Kameko, H., Kurita, S., Ushiku, Y., & Mori, S. (2022). Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows. In Proceedings - International Conference on Computational Linguistics, COLING (Vol. 29, pp. 3570–3577). Association for Computational Linguistics (ACL). https://doi.org/10.5715/jnlp.30.1042
