Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda

Abstract

In this chapter, we discuss a host of technical problems that we think AI scientists could work on to ensure that the creation of smarter-than-human machine intelligence has a positive impact. Although such systems may be decades away, it is prudent to begin research early: the technical challenges involved in safety and reliability work appear formidable and uniquely consequential. Our technical agenda discusses three broad categories of research where we think foundational work today could make it easier in the future to develop superintelligent systems that are reliably aligned with human interests:

1. Highly reliable agent designs: how to ensure that we build the right system.
2. Error tolerance: how to ensure that the inevitable flaws are manageable and correctable.
3. Value specification: how to ensure that the system is pursuing the right sorts of objectives.

Since little is known about the design or implementation details of such systems, the research described in this chapter focuses on formal agent foundations for AI alignment research: developing the basic conceptual tools and theory that are most likely to be useful for engineering robustly beneficial systems in the future.

Cite (APA)

Soares, N., & Fallenstein, B. (2017). Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda. In The Frontiers Collection (pp. 103–125). Springer. https://doi.org/10.1007/978-3-662-54033-6_5
