Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness

MacKenzie Leake; Hijung Valentina Shin; Joy O. Kim; Maneesh Agrawala

Conference Proceedings | Open Access

Conference on Human Factors in Computing Systems - Proceedings (2020)

DOI: 10.1145/3313831.3376519

25 Citations

25 Readers

Abstract

We present a system that automatically transforms text articles into audio-visual slideshows by leveraging the notion of word concreteness, which measures how strongly a word or phrase is related to some perceptible concept. In a formative study we learn that people not only prefer such audio-visual slideshows but find that the content is easier to understand compared to text articles or text articles augmented with images. We use word concreteness to select search terms and find images relevant to the text. Then, based on the distribution of concrete words and the grammatical structure of an article, we time-align selected images with audio narration obtained through text-to-speech to produce audio-visual slideshows. In a user evaluation we find that our concreteness-based algorithm selects images that are highly relevant to the text. The quality of our slideshows is comparable to slideshows produced manually using standard video editing tools, and people strongly prefer our slideshows to those generated using a simple keyword-search based approach.
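The search-term selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon `TOY_CONCRETENESS`, its scores, the 1-to-5 scale, the `threshold` value, and the function name `select_search_terms` are all assumptions made for illustration. The idea shown is only that words rated as highly concrete are kept as candidate image-search terms, while abstract words are skipped.

```python
import re

# Hypothetical concreteness scores on a 1 (abstract) to 5 (concrete) scale.
# These values are illustrative placeholders, not taken from any published lexicon.
TOY_CONCRETENESS = {
    "dog": 4.9,
    "bridge": 4.8,
    "river": 4.9,
    "idea": 1.6,
    "policy": 2.0,
    "freedom": 1.5,
}


def select_search_terms(sentence, lexicon=TOY_CONCRETENESS, threshold=4.0):
    """Return the distinct words whose concreteness score meets the threshold.

    Words are returned in order of first appearance. Words missing from the
    lexicon are treated as abstract (score 0.0) and are never selected.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    seen = set()
    terms = []
    for word in words:
        if word in seen:
            continue
        if lexicon.get(word, 0.0) >= threshold:
            terms.append(word)
            seen.add(word)
    return terms


if __name__ == "__main__":
    sentence = "The policy debate over the bridge across the river continues."
    print(select_search_terms(sentence))  # prints ['bridge', 'river']
```

A real system would presumably draw scores from a full concreteness lexicon and handle multi-word phrases; the abstract does not specify those details, so they are left out here.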

Citation (APA)

Leake, M., Shin, H. V., Kim, J. O., & Agrawala, M. (2020). Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness. In Conference on Human Factors in Computing Systems - Proceedings. Association for Computing Machinery. https://doi.org/10.1145/3313831.3376519
