Text-to-3D Generative AI on Mobile Devices: Measurements and Optimizations

Abstract

Emerging generative models can create 3D objects from text prompts. However, deploying these models on mobile devices is challenging due to resource constraints and user demand for real-time performance. We take a first step towards understanding the bottlenecks by performing a measurement study of three recent text-to-3D generative models (Point-E, Shap-E, and CLIP-Mesh) in terms of their runtime GPU memory usage, latency, and synthesis quality. We investigate the effectiveness of quantization and distillation techniques to overcome these challenges by speeding up inference execution, potentially at the expense of quality. We find that the Shap-E model is promising for mobile deployment, but requires further optimization in its bottleneck diffusion step for real-time performance, as well as reduced memory usage and load times. Further work is needed on custom optimizations for generative text-to-3D models, including targeting specific metrics at each computation stage, efficient representations of 3D objects, and adaptive network and system support for resource-hungry models.
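To make the quantization idea concrete: post-training quantization shrinks a model by storing weights as low-precision integers plus per-tensor scale metadata, trading a small amount of accuracy for memory and speed. The sketch below shows affine int8 quantization in the abstract, as a minimal illustration of the general technique the paper evaluates; the function names, weight values, and rounding scheme are illustrative assumptions, not details from the paper or from any specific framework.

```python
# Minimal sketch of post-training affine int8 quantization, the general
# technique the paper evaluates for on-device inference. All names and
# values here are illustrative, not taken from the paper.

def quantize_int8(weights):
    """Map float weights to int8 values plus (scale, zero_point) metadata."""
    w_min, w_max = min(weights), max(weights)
    # One scale step per int8 level; guard against constant tensors.
    scale = (w_max - w_min) / 255.0 or 1.0
    zero_point = round(-128 - w_min / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float weights from the int8 representation."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.42, 0.0, 0.13, 0.97, -1.0]
q, scale, zp = quantize_int8(weights)
recovered = dequantize_int8(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(weights, recovered))
# int8 storage is 4x smaller than float32, and the per-weight
# reconstruction error stays within about one scale step.
```

In practice a framework would apply this per tensor (or per channel) across the model's weights, which is where the memory and load-time savings the paper measures come from; the quality cost shows up as exactly this kind of rounding error accumulated through the network.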

Citation (APA)

Zhang, X., Li, Z., Oymak, S., & Chen, J. (2023). Text-to-3D Generative AI on Mobile Devices: Measurements and Optimizations. In Proceedings of the 2023 Workshop on Emerging Multimedia Systems (EMS 2023) (pp. 8–14). Association for Computing Machinery. https://doi.org/10.1145/3609395.3610594
