In building a video retrieval system, the crucial point is how to provide effective access to video content. This paper focuses on Japanese cooking instruction utterances and describes a method for analyzing their structure, which leads to a summary of the video. We detect the hierarchical structure of video content using linguistic and visual information. We found that integrating visual information improves the detection of task units compared with using linguistic information alone. © Springer-Verlag 2004.
Shibata, T., Tachiki, M., Kawahara, D., Okamoto, M., Kurohashi, S., & Nishida, T. (2004). Structural Analysis of Instruction Utterances Using Linguistic and Visual Information. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3213, 393–400. https://doi.org/10.1007/978-3-540-30132-5_57