A vast amount of data on the Internet can be used to develop machine learning for robots and AI agents. Using this unorganized data, however, usually requires a pre-collected database, which is time-consuming and expensive to build. This paper proposes a framework for collecting the names of items required to perform a task, using text and image data available on the Internet without relying on any dictionary or pre-made database. We demonstrate a method that uses text data acquired from Google Search to estimate term frequency-inverse document frequency (TF-IDF) values for verifying the relation between a word and a task, and then uses image classification to identify words that are likely to be item-names. We compare the results of measuring words’ item-name likelihood under various image classification settings. Finally, we demonstrate that our framework can discover more than 45% of the desired item-names on three example tasks.
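The TF-IDF step mentioned above can be illustrated with a minimal sketch. It assumes plain whitespace tokenization and the textbook weighting tf × log(N / df); the abstract does not state which TF-IDF variant or preprocessing the authors use, and the function name `tf_idf` and the sample snippets are hypothetical stand-ins for text retrieved from a search engine.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: score} dict per document (textbook TF-IDF)."""
    # Assumption: lowercase + whitespace split; real text needs better tokenization.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # Term frequency times inverse document frequency.
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical snippets; in the paper's setting these would come from search results.
docs = [
    "pour water into the kettle",
    "grind the coffee beans",
    "boil the water",
]
scores = tf_idf(docs)

# Rank the words of the first snippet by score, highest first.
ranked = sorted(scores[0], key=scores[0].get, reverse=True)
print(ranked)
```

Words that occur in every snippet (here "the") receive a score of zero, while words concentrated in few snippets score higher, which is the property that makes TF-IDF usable as a word-to-task relevance signal.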
Thaipumi, P., & Hasegawa, O. (2019). Task-related item-name discovery using text and image data from the internet. In Advances in Intelligent Systems and Computing (Vol. 751, pp. 49–61). Springer Verlag. https://doi.org/10.1007/978-3-319-78452-6_6