Abstract
Objective: Multi-task learning aims to improve learning efficiency and prediction accuracy by tackling multiple tasks jointly, under the assumption that generic features are learned before task-specific ones. Multi-task learning has been applied to a variety of computer vision problems, including object detection and tracking, object recognition, person identification, and facial attribute classification. The worldwide digitization of artwork has brought art research into the scope of computer vision and further facilitated cultural heritage preservation. Automatic artwork analysis addresses art style, painting content, or attribute-oriented analysis for art research. Our multi-task learning application for automatic art analysis builds on historical, social, and artistic information. Existing multi-task joint learning methods train multiple tasks with a weighted sum of losses whose weights are labor-intensive and time-consuming to tune by hand. Our method provides art classification and art retrieval tools for digital art museum applications, helping researchers to understand the connotations of art more deeply and to advance traditional cultural heritage research. Method: We propose a multi-objective learning method based on Bayesian theory. Guided by the Bayesian analysis, we exploit the correlation between tasks and introduce task clustering to constrain the model. We then formulate a multi-task loss function by maximizing the Gaussian likelihood derived from homoscedastic, task-dependent uncertainty in Bayesian modeling. Result: To address both art classification and art retrieval, we adopt the SemArt dataset, a recent multi-modal benchmark for understanding the semantic essence of art; it was designed for cross-modal retrieval of art paintings and can be readily adapted for art painting classification.
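The uncertainty-based loss weighting described in the method can be sketched in plain Python. This is a minimal illustration of the standard homoscedastic-uncertainty formulation, L = Σᵢ exp(−sᵢ)·Lᵢ + sᵢ with sᵢ = log σᵢ², where the learnable log-variances sᵢ replace hand-tuned weights; the function name and interface are our own assumptions, not the paper's implementation:

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses via homoscedastic-uncertainty weighting.

    task_losses: list of per-task loss values L_i.
    log_vars:    list of learnable parameters s_i = log(sigma_i^2),
                 one per task; larger s_i down-weights the task while
                 the +s_i term penalizes ignoring it entirely.
    """
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += math.exp(-s) * loss + s
    return total
```

In practice the `log_vars` would be trainable model parameters updated by gradient descent alongside the network weights, so the task weighting adapts during training instead of being fixed.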
The dataset contains 21,384 art painting images, randomly split into training, validation, and test sets of 19,244, 1,069, and 1,069 samples, respectively. We first conduct art classification experiments on SemArt and evaluate performance by classification accuracy, i.e., the proportion of correctly predicted paintings among all paintings in the test set. The art classification results demonstrate that our adaptive multi-task learning model outperforms previous multi-task learning models in which the weight of each task is fixed. For example, on the "Timeframe" classification task, the improvement is about 4.43% over the previous model. Moreover, previous models are constrained by requiring two forward-backward passes to compute the task-specific weights. The art classification results also validate the importance of the weighting constraints introduced in our model. We then evaluate our model on cross-modal art retrieval tasks. Experiments follow the Text2Art challenge evaluation, in which paintings are ranked by their similarity to a given text, and vice versa. The rankings are evaluated by median rank and recall at K, with K set to 1, 5, and 10 on the test set. Median rank is the median of the ranking positions of the relevant items over all samples, whereas recall at K is the proportion of samples whose relevant item appears in the top K positions of the ranking. Compared with the most recent knowledge-graph-based model on the author attribute, the improvement is about 9.91% on average, which is consistent with the classification results. Finally, we compare our model with human evaluators.
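These two retrieval metrics can be illustrated with a short Python sketch; the function names are hypothetical, and ranks are assumed to be the 1-based positions of each query's relevant item in the returned ranking:

```python
def recall_at_k(ranks, k):
    """Proportion of queries whose relevant item appears in the top-k positions."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def median_rank(ranks):
    """Median of the relevant items' ranking positions (lower is better)."""
    ordered = sorted(ranks)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2
```

For example, with relevant-item ranks `[1, 3, 12, 2, 7]` over five queries, recall at 5 is 3/5 = 0.6 and the median rank is 3.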
Given an artistic text containing a comment, title, author, type, school, and timeframe, participants are asked to pick the most appropriate painting out of a collection of 10 images. The task has two difficulty levels: in the easy level, the 10 painting images are randomly selected from the test set, whereas in the difficult level, the 10 images share the same attribute category (e.g., portraits, landscapes). Each participant completes the task for 100 artistic texts at each level. Performance is reported as the proportion of correct responses over all responses. The results show that our model's accuracy is close to that of the human evaluators. Conclusion: We propose an adaptive multi-task learning method that weights multiple loss functions based on Bayesian theory for automatic art analysis tasks. We conduct several experiments on a publicly available art dataset, covering both art classification and art retrieval challenges.
Yang, B., Xiang, X., Kong, W., Shi, Y., & Yao, J. (2022). Automatic art analysis based on adaptive multi-task learning. Journal of Image and Graphics, 27(4), 1226–1237. https://doi.org/10.11834/jig.200648