Text to 3D scene generation with rich lexical grounding


Abstract

The ability to map descriptions of scenes to 3D geometric representations has many applications in areas such as art, education, and robotics. However, prior work on the text to 3D scene generation task has used manually specified object categories and language that identifies them. We introduce a dataset of 3D scenes annotated with natural language descriptions and learn from this data how to ground textual descriptions to physical objects. Our method successfully grounds a variety of lexical terms to concrete referents, and we show quantitatively that it improves 3D scene generation over prior purely rule-based methods. We evaluate the fidelity and plausibility of 3D scenes generated with our grounding approach through human judgments. To ease evaluation on this task, we also introduce an automated metric that strongly correlates with human judgments.

Citation (APA)

Chang, A., Monroe, W., Savva, M., Potts, C., & Manning, C. D. (2015). Text to 3D scene generation with rich lexical grounding. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 1, pp. 53–62). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-1006
