This paper presents a method for ascertaining the relevance of inputs in vision-based tasks by exploiting temporal coherence and predictability. In contrast to the tasks explored in many previous relevance experiments, the class of tasks examined in this study is one in which relevance is a time-varying function of the previous and current inputs. The method proposed in this paper dynamically allocates relevance to inputs by using expectations of their future values. As a model of the task is learned, the model is simultaneously extended to create task-specific predictions of the future values of inputs. Inputs that are not relevant, and therefore not accounted for in the model, will not be predicted accurately. These inputs can be de-emphasized and, in turn, a new, improved model of the task created. The techniques presented in this paper have been successfully applied to the vision-based autonomous control of a land vehicle, vision-based hand tracking in cluttered scenes, and the detection of faults in the plasma-etch step of semiconductor wafer fabrication.
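The core loop described above, predicting each input's future value and de-emphasizing inputs whose predictions fail, can be illustrated with a minimal sketch. This is an illustrative assumption, not the authors' exact mechanism: the function name `relevance_weights`, the squared-error measure, and the exponential mapping from error to relevance are all hypothetical choices standing in for the task-specific prediction network described in the paper.

```python
import numpy as np

def relevance_weights(predicted, actual, tau=0.1):
    """Hypothetical relevance estimator: inputs whose future values the
    task model predicts poorly receive low relevance weights."""
    # Per-input squared prediction error
    err = (predicted - actual) ** 2
    # Low error -> high relevance, via an exponential kernel (assumed form)
    w = np.exp(-err / tau)
    # Normalize so the best-predicted input gets weight 1.0
    return w / w.max()

# Usage: attenuate inputs before the next model update.
actual = np.array([0.9, 0.1, 0.5])
predicted = np.array([0.88, 0.7, 0.52])  # the second input is poorly predicted
w = relevance_weights(predicted, actual)
attenuated = w * actual                  # poorly predicted input is de-emphasized
```

In this sketch the poorly predicted input receives a near-zero weight, so a model retrained on the attenuated inputs would rely less on it, mirroring the iterative refinement the abstract describes.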
Baluja, S., & Pomerleau, D. (1997). Dynamic relevance: vision-based focus of attention using artificial neural networks. Artificial Intelligence, 97(1–2), 381–395. https://doi.org/10.1016/s0004-3702(97)00065-9