Neural architecture search (NAS) has advanced significantly in recent years, but most NAS systems restrict the search to the architecture of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learning both intra-cell and inter-cell architectures, which we call ESS. To obtain better search results, we design a joint learning method that performs intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it significantly outperforms a strong baseline on the PTB and WikiText data, achieving a new state of the art on PTB. Moreover, the learned architectures transfer well to other systems: for example, they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and on the CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.
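As a rough illustration of the DARTS-style continuous relaxation this kind of differentiable NAS builds on, the sketch below mixes candidate operations on both intra-cell and inter-cell edges via softmax-weighted architecture parameters, so a single backward pass updates both sets of parameters (the "joint learning" idea at a high level). The candidate operation set, the class names (MixedEdge, ExtendedSearchCell), and the chain-structured cell are illustrative assumptions, not the authors' actual implementation or operation choices.

import torch
import torch.nn as nn
import torch.nn.functional as F

# A toy candidate-operation set for one edge; the paper's actual set differs.
CANDIDATE_OPS = {
    "identity": lambda dim: nn.Identity(),
    "tanh":     lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.Tanh()),
    "relu":     lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),
    "sigmoid":  lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid()),
}

class MixedEdge(nn.Module):
    """One edge of the search graph: a softmax-weighted sum of candidate
    operations (the DARTS-style continuous relaxation)."""
    def __init__(self, dim):
        super().__init__()
        self.ops = nn.ModuleList(build(dim) for build in CANDIDATE_OPS.values())
        # Architecture parameters (alpha), trained alongside the model weights.
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=-1)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

class ExtendedSearchCell(nn.Module):
    """Sketch of an extended search space: intra-cell edges connect nodes
    inside the cell, while an inter-cell edge decides how the previous
    cell's output feeds into the current cell."""
    def __init__(self, dim, num_nodes=3):
        super().__init__()
        self.inter_edge = MixedEdge(dim)   # searched inter-cell connection
        self.intra_edges = nn.ModuleList(  # searched intra-cell connections
            MixedEdge(dim) for _ in range(num_nodes))

    def forward(self, x, prev_cell_out):
        # Inter-cell: searched transformation of the previous cell's output.
        state = x + self.inter_edge(prev_cell_out)
        # Intra-cell: a simple chain of searched edges (real cells use a DAG).
        for edge in self.intra_edges:
            state = edge(state)
        return state

Because the intra-cell and inter-cell alpha parameters sit in the same computation graph, gradient descent on a language-modeling loss optimizes both at once; after search, each edge would typically be discretized to its highest-weighted operation.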
Citation:
Li, Y., Hu, C., Zhang, Y., Xu, N., Jiang, Y., Xiao, T., … Li, C. (2020). Learning architectures from an extended search space for language modeling. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 6629–6639). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.592