We present a convolutional autoencoder that enables high-fidelity volumetric reconstruction of human performance from multi-view video comprising only a small set of camera views. Our method yields end-to-end reconstruction error comparable to that of a probabilistic visual hull computed from significantly more viewpoints (double or more). We exploit a deep prior learned implicitly by the autoencoder, which is trained on a dataset of view-ablated multi-view video footage spanning a wide range of subjects and actions. This opens up the possibility of high-end volumetric performance capture in on-set and prosumer scenarios where time or cost prohibit a high witness-camera count.
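The core idea, encoding a coarse reconstruction down to a compact bottleneck and decoding it back to full resolution, can be sketched in miniature. The snippet below is an illustrative NumPy forward pass only, not the paper's architecture: the grid size, kernel size, stride-2 downsampling, and nearest-neighbour upsampling (as a stand-in for a learned transposed convolution) are all assumptions.

```python
import numpy as np

def conv2d(x, w, stride=2):
    # Valid strided convolution with ReLU on a single-channel slice.
    # x: (H, W) input, w: (k, k) kernel (illustrative sizes only).
    k = w.shape[0]
    out_h = (x.shape[0] - k) // stride + 1
    out_w = (x.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.maximum((patch * w).sum(), 0.0)
    return out

def upsample(x, factor=2):
    # Nearest-neighbour upsampling; a learned decoder would use
    # transposed convolutions instead.
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

rng = np.random.default_rng(0)
grid = rng.random((32, 32))              # hypothetical 32x32 occupancy slice
w_enc = rng.standard_normal((2, 2)) * 0.1

latent = conv2d(grid, w_enc, stride=2)   # 16x16 bottleneck
recon = upsample(latent, factor=2)       # decoded back to 32x32
print(latent.shape, recon.shape)
```

In the paper's setting, the learned decoder weights, rather than a fixed upsampling rule, carry the deep prior that compensates for the missing camera views.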
Gilbert, A., Volino, M., Collomosse, J., & Hilton, A. (2018). Volumetric Performance Capture from Minimal Camera Viewpoints. In Lecture Notes in Computer Science (Vol. 11215, pp. 591–607). Springer. https://doi.org/10.1007/978-3-030-01252-6_35