Live video comments, or "danmu", are an emerging social feature on Asian online video platforms. These time-synchronous comments are overlaid on the video playback and uniquely enrich the viewing experience, engaging hundreds of millions of users in rich community discussions. The presence of danmu comments has become a determining factor for video popularity. Recent work has proposed a model to automatically generate comments, but very little work has so far considered the problem of where to insert the comments in the video timeline. In this work, we propose to address both the what and the where of automatic danmu generation by jointly predicting the content of the danmu comment to be generated and its optimal insertion point in the video timeline. Our model exploits the video's visual content, subtitles, audio signals, and any existing surrounding comments in one unified architecture, and can handle both videos that are already heavily commented and videos that have no comments yet. Experiments show that our proposed unified framework generally outperforms state-of-the-art comment generation methods.
Wu, H., Jones, G. J. F., & Pitie, F. (2021). Knowing Where and What to Write in Automated Live Video Comments: A Unified Multi-Task Approach. In ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction (pp. 619–627). Association for Computing Machinery, Inc. https://doi.org/10.1145/3462244.3479942