Abstract
While several tools exist for training and evaluating narrow machine learning (ML) algorithms, their design generally does not follow a particular or explicit evaluation methodology or theory. Inversely so for more general learners, where many evaluation methodologies and frameworks have been suggested, but few specific tools exist. In this paper we introduce a new framework for broad evaluation of artificial intelligence (AI) learners, and a new tool that builds on this methodology. The platform, called SAGE (Simulator for Autonomy & Generality Evaluation), works for training and evaluation of a broad range of systems and allows detailed comparison between narrow and general ML and AI. It provides a variety of tuning and task construction options, allowing isolation of single parameters across complexity dimensions. SAGE is aimed at helping AI researchers map out and compare strengths and weaknesses of divergent approaches. Our hope is that it can help deepen understanding of the various tasks we want AI systems to do and the relationship between their composition, complexity, and difficulty for various AI systems, as well as contribute to building a clearer research road map for the field. This paper provides an overview of the framework and presents results of an early use case.
Author supplied keywords
Cite
CITATION STYLE
Eberding, L. M., Thórisson, K. R., Sheikhlar, A., & Andrason, S. P. (2020). Sage: task-environment platform for evaluating a broad range of ai learners. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12177 LNAI, pp. 72–82). Springer. https://doi.org/10.1007/978-3-030-52152-3_8
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.