Raise to speak: An accurate, low-power detector for activating voice assistants on smartwatches

Shiwen Zhao; Heri Nieto; Krishna Sridhar; Sethu Raman; Brandt Westing; Roman Holenstein; Brandon Newendorp; Tim Paek; Carlos Guestrin; Shawn Scully; Minwoo Jeong; Mike Bastian; Kevin Lynch

Conference Proceedings (Open Access)

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2019) 2736-2744

DOI: 10.1145/3292500.3330761

Abstract

The two most common ways to activate intelligent voice assistants (IVAs) are button presses and trigger phrases. This paper describes a new way to invoke IVAs on smartwatches: simply raise your hand and speak naturally. To achieve this experience, we designed an accurate, low-power detector that works across a wide range of environments and activity scenarios with minimal impact on battery life, memory footprint, and processor utilization. The raise to speak (RTS) detector consists of four main components: an on-device gesture convolutional neural network (CNN) that uses accelerometer data to detect specific poses; an on-device speech CNN to detect proximal human speech; a policy model to combine signals from the motion and speech detectors; and an off-device false trigger mitigation (FTM) system to reduce unintentional invocations triggered by the on-device detector. The majority of the detector's components run on-device to preserve user privacy. The RTS detector was released in watchOS 5.0 and is running on millions of devices worldwide.
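
To make the role of the policy model concrete, here is a minimal sketch of how per-frame scores from a gesture detector and a speech detector might be combined into a single trigger decision. The abstract does not specify the actual policy, so everything here is an assumption for illustration: the `Frame` structure, the `policy_triggers` function, the threshold values, and the rule itself (fire when speech is detected shortly after a raise pose) are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One time step of detector outputs (hypothetical structure)."""
    gesture_score: float  # stand-in for the gesture CNN's pose confidence, in [0, 1]
    speech_score: float   # stand-in for the speech CNN's proximal-speech confidence, in [0, 1]


def policy_triggers(frames, gesture_threshold=0.8, speech_threshold=0.7, max_gap=3):
    """Return the index of the first frame at which the policy fires, or None.

    Illustrative rule (an assumption, not the paper's policy model):
    fire when the speech score crosses its threshold no more than
    `max_gap` frames after the most recent frame whose gesture score
    crossed its threshold.
    """
    last_gesture = None  # index of the most recent confident raise pose
    for i, frame in enumerate(frames):
        if frame.gesture_score >= gesture_threshold:
            last_gesture = i
        if (
            last_gesture is not None
            and i - last_gesture <= max_gap
            and frame.speech_score >= speech_threshold
        ):
            return i
    return None


# A raise pose at frame 1 followed by speech at frame 3 fires the policy.
raise_then_speak = [Frame(0.1, 0.0), Frame(0.9, 0.1), Frame(0.3, 0.2), Frame(0.2, 0.9)]
# Speech without a preceding raise pose does not.
speech_only = [Frame(0.1, 0.9), Frame(0.2, 0.95)]
```

In the system the abstract describes, a positive on-device decision of this kind would still pass through the off-device FTM stage before the assistant is invoked; that stage is not modeled here.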

Cite

Zhao, S., Nieto, H., Sridhar, K., Raman, S., Westing, B., Holenstein, R., … Lynch, K. (2019). Raise to speak: An accurate, low-power detector for activating voice assistants on smartwatches. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 2736–2744). Association for Computing Machinery. https://doi.org/10.1145/3292500.3330761
