Teacher Guided Neural Architecture Search for Face Recognition

Abstract

Knowledge distillation is an effective tool for compressing large pre-trained convolutional neural networks (CNNs), or ensembles of them, into models deployable on mobile and embedded devices. However, under a given FLOPs or latency budget, existing methods rely on hand-crafted heuristics to predefine the target student network for distillation, which may be sub-optimal because exploring a powerful student in the large design space requires considerable effort. In this paper, we develop a novel teacher-guided neural architecture search method that directly searches for a student network with flexible channel and layer sizes. Specifically, we define the search space over the number of channels/layers, which is sampled from a probability distribution learned by minimizing the search objective of the student network. The size with the maximum probability in each distribution serves as the final width and depth of the searched student network. Extensive experiments on a variety of face recognition benchmarks demonstrate the superiority of our method over state-of-the-art alternatives.
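To make the distribution-based size search concrete, below is a minimal PyTorch sketch. It is not the authors' implementation: the candidate size lists, the Gumbel-softmax straight-through relaxation, and the toy resource-only loss are all assumptions standing in for the paper's actual sampling scheme and search objective (which would include the teacher-guided distillation loss).

```python
import torch
import torch.nn.functional as F

# Hypothetical candidate sizes; the specific values are illustrative,
# not taken from the paper.
CHANNEL_CHOICES = [16, 32, 48, 64]   # candidate widths for one layer
LAYER_CHOICES = [2, 3, 4]            # candidate depths for one stage

# Learnable logits parameterize a categorical distribution over each
# size choice; minimizing the search loss updates them by gradient.
channel_logits = torch.zeros(len(CHANNEL_CHOICES), requires_grad=True)
layer_logits = torch.zeros(len(LAYER_CHOICES), requires_grad=True)

def sample_size(logits, choices, tau=1.0):
    """Sample a size via the Gumbel-softmax relaxation (hard=True gives a
    one-hot sample with a straight-through gradient). This is one common
    way to sample discrete sizes differentiably; the paper's sampling
    scheme may differ."""
    one_hot = F.gumbel_softmax(logits, tau=tau, hard=True)
    idx = int(one_hot.argmax())
    return choices[idx], one_hot

optimizer = torch.optim.Adam([channel_logits, layer_logits], lr=0.1)

for step in range(100):
    width, w_onehot = sample_size(channel_logits, CHANNEL_CHOICES)
    depth, d_onehot = sample_size(layer_logits, LAYER_CHOICES)

    # Placeholder objective: in the real method this would combine the
    # sampled student's distillation/task loss with a FLOPs or latency
    # penalty. Here a toy cost that prefers small sizes keeps the
    # sketch self-contained and runnable.
    loss = (w_onehot * torch.tensor(CHANNEL_CHOICES, dtype=torch.float)).sum() \
         + (d_onehot * torch.tensor(LAYER_CHOICES, dtype=torch.float)).sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# As described in the abstract: the most probable size in each learned
# distribution becomes the final width/depth of the student network.
final_width = CHANNEL_CHOICES[int(channel_logits.argmax())]
final_depth = LAYER_CHOICES[int(layer_logits.argmax())]
print(f"searched width={final_width}, depth={final_depth}")
```

In the actual method, the search objective would be evaluated by running the sampled student on training data under the teacher's guidance; only the final argmax step above directly mirrors the abstract's description of selecting the searched width and depth.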

Citation (APA)

Wang, X. (2021). Teacher Guided Neural Architecture Search for Face Recognition. In 35th AAAI Conference on Artificial Intelligence, AAAI 2021 (Vol. 4A, pp. 2817–2825). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v35i4.16387
