Abstract
A common approach to jointly learning multiple tasks with a shared structure is to optimize the model over a combined landscape of multiple sub-costs. However, the gradients derived from the individual sub-costs often conflict on cost plateaus, resulting in a subpar optimum. In this work, we shed light on this gradient-conflict challenge and propose a solution named Cost-Out, which randomly drops sub-costs at each iteration. We provide theoretical and empirical evidence for the escaping pressure induced by the Cost-Out mechanism. Although the method is simple, empirical results indicate that it can enhance performance on multi-task learning problems, including two-digit image classification on pairs sampled from the MNIST dataset and machine translation between English and French, Spanish, and German on the WMT14 datasets.
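To make the mechanism concrete, the following is a minimal, hypothetical PyTorch sketch of one Cost-Out training step, written from the abstract's description alone. The function name cost_out_step, the keep probability, and the rule of retaining one random sub-cost when all are dropped are illustrative assumptions, not details taken from the paper.

import torch

def cost_out_step(model, optimizer, batches, loss_fns, keep_prob=0.5):
    """One Cost-Out step: each sub-cost is kept with probability
    keep_prob and dropped otherwise, resampled every iteration."""
    optimizer.zero_grad()
    # Independent Bernoulli keep-mask over the sub-costs.
    mask = torch.bernoulli(torch.full((len(loss_fns),), keep_prob))
    if mask.sum() == 0:
        # Assumption: retain one random sub-cost so the step is never empty.
        mask[torch.randint(len(loss_fns), (1,))] = 1.0
    total = None
    for keep, (x, y), loss_fn in zip(mask, batches, loss_fns):
        if keep:  # keep is a 0./1. scalar tensor; truthy when 1.
            term = loss_fn(model(x), y)
            total = term if total is None else total + term
    total.backward()  # gradients come only from the surviving sub-costs
    optimizer.step()

Because the mask is resampled per iteration, each update follows the gradient of a random subset of sub-costs rather than their fixed sum, which is the source of the escaping pressure the abstract describes.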
Citation
Woo, S., Kim, K., Noh, J., Shin, J. H., & Na, S. H. (2021). Revisiting dropout: Escaping pressure for training neural networks with multiple costs. Electronics (Switzerland), 10(9). https://doi.org/10.3390/electronics10090989