Automatic video editing is implemented in a purely data-driven manner. It involves at least two steps: selecting the most valuable footage in terms of visual quality and the importance of the action filmed, and cutting that footage into a brief, coherent visual story that is interesting to watch. Visual semantic and aesthetic features are extracted by an ImageNet-trained convolutional neural network, and the editing controller is trained by an imitation learning algorithm. As a result, at test time the controller shows signs of observing basic cinematography editing rules learned from a corpus of motion picture masterpieces.
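The idea of training an editing controller by imitation can be illustrated with a minimal behavior-cloning sketch. This is not the paper's implementation: the abstract does not specify the imitation learning algorithm, the feature dimensionality, or the action space. Here the feature vectors are random stand-ins for CNN features, the "expert" is a hypothetical linear rule producing binary cut/keep decisions, and the controller is a plain logistic regression; all names are assumptions made for illustration.

```python
import numpy as np


def train_behavior_cloning(features, expert_actions, lr=0.5, epochs=500):
    """Fit a logistic-regression policy that imitates expert cut/keep labels.

    features: (n, d) array, stand-in for per-shot CNN features (assumption).
    expert_actions: (n,) array of 0/1 labels from a hypothetical expert editor.
    """
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        logits = features @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        # Gradient of the mean cross-entropy loss with respect to the logits.
        grad = probs - expert_actions
        w -= lr * (features.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def policy(features, w, b):
    """Return 1 (keep the shot) or 0 (cut it) for each feature vector."""
    return (features @ w + b > 0).astype(int)


# Synthetic demonstration data: random features and a hypothetical expert rule.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))
expert_w = rng.normal(size=8)
expert_actions = (features @ expert_w > 0).astype(float)

w, b = train_behavior_cloning(features, expert_actions)
accuracy = (policy(features, w, b) == expert_actions).mean()
print(f"agreement with expert on training data: {accuracy:.2f}")
```

In a real pipeline the synthetic features would be replaced by activations of a pretrained network computed on video frames, and the expert labels by editing decisions recovered from professionally edited films; the supervised imitation step itself would keep the same shape.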
CITATION STYLE
Podlesnyy, S. (2020). Towards Data-Driven Automatic Video Editing. In Advances in Intelligent Systems and Computing (Vol. 1074, pp. 361–368). Springer. https://doi.org/10.1007/978-3-030-32456-8_39