Nowadays, the possibilities offered by state-of-The-Art deep neural networks allow the creation of systems capable of recognizing and indexing visual content with very high accuracy. Performance of these systems relies on the availability of high quality training sets, containing a large number of examples (e.g. million), in addition to the the machine learning tools themselves. For several applications, very good training sets can be obtained, for example, crawling (noisily) annotated images from the internet, or by analyzing user interaction (e.g.: on social networks). However, there are several applications for which high quality training sets are not easy to be obtained/created. Consider, as an example, a security scenario where one wants to automatically detect rarely occurring threatening events. In this respect, recently, researchers investigated the possibility of using a visual virtual environment, capable of artificially generating controllable and photo-realistic contents, to create training sets for applications with little available training images. We explored this idea to generate synthetic photo-realistic training sets to train classifiers to recognize the proper use of individual safety equipment (e.g.: worker protection helmets, high-visibility vests, ear protection devices) during risky human activities. Then, we performed domain adaptation to real images by using a very small image data set of real-world photographs. We show that training with the generated synthetic training set and using the domain adaptation step is an effective solution to address applications for which no training sets exist.
Di Benedetto, M., Meloni, E., Amato, G., Falchi, F., & Gennaro, C. (2019). Learning Safety Equipment Detection using Virtual Worlds. In Proceedings - International Workshop on Content-Based Multimedia Indexing (Vol. 2019-September). IEEE Computer Society. https://doi.org/10.1109/CBMI.2019.8877466