With the prevalence of video sharing, there is an increasing demand for automatic video digestion, such as highlight detection. Recently, platforms with crowd-sourced time-sync video comments have emerged worldwide, providing a good opportunity for highlight detection. However, this task is non-trivial: (1) time-sync comments often lag behind their corresponding shots; (2) time-sync comments are semantically sparse and noisy; (3) determining which shots are highlights is highly subjective. The present paper tackles these challenges by proposing a framework that (1) uses concept-mapped lexical chains for lag-calibration; (2) models video highlights based on comment intensity and the combined emotion and concept concentration of each shot; (3) summarizes each detected highlight using an improved SumBasic with emotion and concept mapping. Experiments on large real-world datasets show that both our highlight detection method and our summarization method outperform other benchmarks by considerable margins.
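To make step (3) concrete, below is a minimal sketch of SumBasic-style summarization with emotion/concept weighting over a shot's time-sync comments. The `emotion_concept_lexicon` input, the `boost` factor, and the specific scoring details are assumptions for illustration; the core loop (score by average token probability, pick the best comment, square the probabilities of used tokens) is standard SumBasic, and the paper's improved variant may differ.

```python
from collections import Counter

def sumbasic_summarize(comments, emotion_concept_lexicon, boost=2.0, max_picks=3):
    """Select representative comments for one detected highlight (sketch).

    comments: list of tokenized comments (each a list of str tokens).
    emotion_concept_lexicon: set of tokens mapped to an emotion or concept
        (an assumed input; the paper derives this mapping automatically).
    """
    # Unigram probabilities over all comments attached to the shot.
    counts = Counter(token for comment in comments for token in comment)
    total = sum(counts.values())
    prob = {token: n / total for token, n in counts.items()}

    # Assumed heuristic: up-weight tokens carrying emotion/concept signal.
    for token in prob:
        if token in emotion_concept_lexicon:
            prob[token] *= boost

    summary, remaining = [], list(comments)
    while remaining and len(summary) < max_picks:
        # Score a comment by the mean probability of its tokens.
        best = max(
            remaining,
            key=lambda c: sum(prob.get(t, 0.0) for t in c) / max(len(c), 1),
        )
        summary.append(best)
        remaining.remove(best)
        # Standard SumBasic redundancy update: square the probability of
        # every token already used, so later picks favor new content.
        for token in set(best):
            prob[token] = prob[token] ** 2
    return summary
```

The probability-squaring update is what keeps successive picks from repeating the same high-frequency words; the lexicon boost simply biases selection toward emotionally or conceptually salient comments before that update takes effect.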
CITATION STYLE
Ping, Q., & Chen, C. (2017). Video highlights detection and summarization with lag-calibration based on concept-emotion mapping of crowd-sourced time-sync comments. In EMNLP 2017 - Workshop on New Frontiers in Summarization, NFiS 2017 - Workshop Proceedings (pp. 1–14). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-4501