Bias Optimality

  • Lewis M
  • Puterman M

Abstract

The long-run average reward, or gain, has received considerable attention in the literature as an optimality criterion. For many practical models, however, the gain has the undesirable property of being underselective; that is, there may be several gain optimal policies. After finding the set of policies that achieve the primary objective of maximizing the long-run average reward, one might search within that set for the policy that maximizes the "short-run" reward. This reward, called the bias, aids in distinguishing among multiple gain optimal policies. This chapter focuses on establishing the usefulness of the bias in distinguishing among multiple gain optimal policies, computing it, and demonstrating the implicit discounting captured by bias on recurrent states.
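
As a rough illustration of the underselectiveness described above, the following sketch evaluates two fixed policies on a hypothetical two-state model that is not taken from the chapter. It assumes an aperiodic chain, for which the bias can be written as the sum over t of (P^t r - g), and it truncates that sum at a finite horizon. The function name `gain_and_bias`, the transition matrix, and the reward vectors are all illustrative assumptions.

```python
def gain_and_bias(P, r, horizon=200):
    """Approximate the gain g and bias h of one fixed policy.

    Assumes an aperiodic chain, so that P^t r converges to g and
    h = sum_{t>=0} (P^t r - g). The sum is truncated at `horizon`.
    """
    n = len(r)
    stepwise = []            # stepwise[t][s] = expected reward at step t from state s
    v = list(r)
    for _ in range(horizon):
        stepwise.append(v)
        v = [sum(P[s][j] * v[j] for j in range(n)) for s in range(n)]
    g = v                    # P^horizon r, used as the approximate gain
    h = [sum(step[s] - g[s] for step in stepwise) for s in range(n)]
    return g, h


# Hypothetical model: state 0 moves to state 1, and state 1 stays put
# and earns 1 per step. The two policies differ only in the one-time
# reward collected in state 0.
P = [[0.0, 1.0],
     [0.0, 1.0]]
r_a = [5.0, 1.0]             # policy A collects 5 before leaving state 0
r_b = [0.0, 1.0]             # policy B collects nothing in state 0

g_a, h_a = gain_and_bias(P, r_a)
g_b, h_b = gain_and_bias(P, r_b)
print(g_a, h_a)
print(g_b, h_b)
```

Under these assumptions both policies have the same gain in every state, so the gain cannot rank them, while the bias in state 0 is larger for policy A, which is the sense in which the bias breaks the tie.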

Cite

Lewis, M. E., & Puterman, M. L. (2002). Bias Optimality (pp. 89–111). https://doi.org/10.1007/978-1-4615-0805-2_3
