Artificial agents learning human fairness
- ISBN: 9780981738116
Steven de Jong
MICC, Maastricht University,
The Netherlands
steven.dejong@micc.unimaas.nl
Karl Tuyls
Faculty of Industrial Design,
Eindhoven Technical
University, The Netherlands
ktuyls@gmail.com
Katja Verbeeck
Katholieke Hogeschool St.
Lieven, Gent, Belgium
katja.verbeeck@kahosl.be
ABSTRACT
Recent advances in technology allow multi-agent systems to be
deployed in cooperation with or as a service for humans. Typically,
those systems are designed assuming individually rational agents,
according to the principles of classical game theory. However,
research in the field of behavioral economics has shown that humans
are not purely self-interested: they strongly care about fairness.
Therefore, multi-agent systems that fail to take fairness into
account may not be sufficiently aligned with human expectations and
may not reach intended goals. In this paper, we present a
computational model for achieving fairness in adaptive multi-agent systems.
The model uses a combination of Continuous Action Learning
Automata and the Homo Egualis utility function. The novel
contribution of our work is that this function is used in an explicit,
computational manner. We show that results obtained by agents using this
model are compatible with experimental and analytical results on
human fairness, obtained in the field of behavioral economics.
Categories and Subject Descriptors
I.2.6 [Learning]; I.2.11 [Distributed Artificial Intelligence]; J.4
[Social and Behavioral Sciences]
General Terms
Algorithms, Design, Human Factors
Keywords
Fairness, Homo Egualis, Reinforcement Learning
1. INTRODUCTION
Modeling agents for a multi-agent system requires a thorough
understanding of the type and form of interactions with the environ-
ment and other agents in the system, including any humans. Since
many multi-agent systems are designed to interact with humans or
to operate on behalf of them, for instance in bargaining [12, 36],
resource distribution [10] and aircraft deicing [24], agents’ behav-
ior should often be aligned with human expectations. Otherwise,
agents may fail to reach their goals.
Cite as: Artificial agents learning human fairness, Steven de Jong, Karl
Tuyls and Katja Verbeeck, Proc. of 7th Int. Conf. on Autonomous
Agents and Multiagent Systems (AAMAS 2008), Padgham, Parkes,
Müller and Parsons (eds.), May 12-16, 2008, Estoril, Portugal, pp. 863-870.
Copyright © 2008, International Foundation for Autonomous Agents and
Multiagent Systems (www.ifaamas.org). All rights reserved.
Usually, multi-agent systems are designed according to the
principles of a standard game-theoretical model, i.e., assuming
individual rationality. However, recently, this strong assumption has been
relaxed in various ways, for instance by including well-known
concepts such as bounded rationality [42] and social welfare [7, 8].
Research in the field of behavioral economics shows us that hu-
mans are not purely rational and self-interested; their decisions are
often based on considerations about others [4, 17, 18]. Therefore,
multi-agent systems using only standard game-theoretical princi-
ples risk being insufficiently aligned with human expectations and
may not obtain satisfactory payoffs. Prime examples known from
(evolutionary) game theory include games such as the Ultimatum
Game [17], in which purely rational players usually obtain a very
low payoff, and games such as the Public Goods Game [17, 41] or
the Traveler’s Dilemma [2], in which humans can actually obtain a
higher payoff by failing to find the rational solution, i.e., the Nash
equilibrium. More generally speaking, fairness may be important
in any problem domain in which the allocation of limited resources
plays an important role [7], as in the examples mentioned above.
Thus, designers of a variety of multi-agent systems should take
the human conception of fairness into account. If the motivations
behind human fairness are sufficiently understood and modeled, the
same motivations can be transferred to multi-agent systems. More
precisely, descriptive models of human fairness may be used as
a basis for prescriptive or computational models, used to control
agents in multi-agent systems in a way that guarantees alignment
with human expectations. This interesting track of research ties
in with the descriptive agenda formulated by Shoham [40] and the
objectives of evolutionary game theory [18, 44].
In this paper, we show that it is possible for multi-agent systems
to explicitly represent and utilize human fairness. We use a de-
scriptive model of human fairness called Homo Egualis [17] and
introduce this model into an adaptive multi-agent system driven by
Continuous Action Learning Automata. In contrast to earlier work
[46], in which agents were inspired by the Homo Egualis model to
obtain a fair distribution of limited resources, we use the model in
a direct, computational manner, to obtain the best possible align-
ment with human behavior. We study the concrete behavior of our
computational model in two game settings (more precisely, the Ul-
timatum and Nash Bargaining Game, extended for more players),
both of which represent common bargaining situations. We then
determine whether we can find and maintain solutions as calcu-
lated by behavioral economists – i.e., fair solutions that tie in with
human behavior.
In the remainder of this paper, we first discuss work in the area
of descriptive models of human fairness. Then, we look at compu-
tational or prescriptive modeling of fairness, first outlining existing
work in this area, then discussing the games we are looking at in
more detail, and finally presenting our own methodology. The
paper continues with a set of experiments, after which we discuss
the results in detail and conclude.
As early as the 1950s, researchers began investigating fairness, for
instance in the Nash Bargaining Game [27]. Recently, research in
behavioral economics and evolutionary game theory has examined
human behavior in various games, such as the Ultimatum Game
and the Public Goods Game (e.g., [3, 17]). In comparison to the
fair outcomes reached by human players, standard game-theoretical
models predict a very selfish (and suboptimal) outcome in these
games. The current state of the art describes and models three main
motivations for human fairness.
Inequity aversion. In [17], this is defined as follows: “Inequity
aversion means that people resist inequitable outcomes; i.e., they
are willing to give up some material payoff to move in the direction
of more equitable outcomes”. To model inequity aversion, an ex-
tension of the classical game theoretic actor is introduced, named
Homo Egualis [17, 18]. Homo Egualis agents are driven by the
following utility function:
$$u_i = x_i - \frac{\alpha_i}{n-1} \sum_{x_j > x_i} (x_j - x_i) - \frac{\beta_i}{n-1} \sum_{x_i > x_j} (x_i - x_j) \qquad (1)$$
Here, $u_i$ is the utility of agent $i \in \{1, 2, \ldots, n\}$. This utility is
calculated based on agent i's own payoff, $x_i$, and two terms related
to considerations on how this payoff compares to the payoffs $x_j$
of other agents j: every agent i experiences a negative influence
on its utility for other agents j that have a higher payoff as well as
other agents that have a lower payoff. Thus, given its own payoff
$x_i$, agent i obtains a maximum utility $u_i$ if $\forall j : x_j = x_i$.
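Eq. (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the argument names and the payoff values are hypothetical, and for simplicity a single alpha and beta are passed for the agent under consideration.

```python
from typing import Sequence


def homo_egualis_utility(payoffs: Sequence[float], i: int,
                         alpha: float, beta: float) -> float:
    """Utility of agent i under Eq. (1), given the payoffs of all n agents."""
    n = len(payoffs)
    if n < 2:
        raise ValueError("Eq. (1) needs at least two agents")
    x_i = payoffs[i]
    # First sum: total gap to the agents that earn more than agent i.
    envy = sum(x_j - x_i for x_j in payoffs if x_j > x_i)
    # Second sum: total gap to the agents that earn less than agent i.
    guilt = sum(x_i - x_j for x_j in payoffs if x_i > x_j)
    return x_i - alpha * envy / (n - 1) - beta * guilt / (n - 1)


# Hypothetical payoffs for three agents; the middle agent is penalized twice.
print(homo_egualis_utility([2.0, 5.0, 8.0], i=1, alpha=0.5, beta=0.25))  # 3.875
# With equal payoffs both sums vanish and the utility equals the payoff.
print(homo_egualis_utility([5.0, 5.0, 5.0], i=1, alpha=0.5, beta=0.25))  # 5.0
```

In the first call, the middle agent is 3 behind one agent and 3 ahead of another, so its utility is 5 - 0.5 * 3 / 2 - 0.25 * 3 / 2 = 3.875.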
Research with human subjects provides strong evidence that humans
care more about inequity when doing worse than when doing
better in society [17]. Thus, in general, $\alpha_i > \beta_i$ is chosen. Moreover,
the $\beta_i$-parameter must be in the interval [0, 1]: for $\beta_i < 0$,
agents would be striving for inequity, and for $\beta_i > 1$, they would be
willing to “burn” some of their payoff in order to reduce inequity,
since simply reducing their payoff (without giving it to someone
else) already increases their utility value.
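The upper bound on $\beta_i$ can be made concrete with a short worked case. As an illustrative assumption (not a case singled out in the text), suppose agent i earns strictly more than every other agent. The first sum in Eq. (1) is then empty and the second sum runs over all n - 1 other agents:

```latex
\[
u_i = x_i - \frac{\beta_i}{n-1} \sum_{j \neq i} (x_i - x_j)
\quad\Longrightarrow\quad
\frac{\partial u_i}{\partial x_i} = 1 - \frac{\beta_i}{n-1}\,(n-1) = 1 - \beta_i .
\]
```

For $\beta_i > 1$ this slope is negative: as long as agent i stays ahead of the others, throwing away part of its own payoff raises its utility even though nobody else gains.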
The Homo Egualis utility function has been shown to adequately
describe human behavior in various games, including the Ultima-
tum Game [17] and the Public Goods Game [9]. However, it should
be noted that there are also experiments in which human behavior
is not adequately captured by a utility model that is exclusively
based on inequity aversion and material interest [6]. Subjects may
also be motivated by additional information they may have about
each other, and by reciprocity: they become less cooperative in the
presence of defectors and sometimes punish unfair behavior. This
leads to two other models, viz. priority awareness and reciprocal
fairness, which will be outlined below.
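As a rough illustration of how Eq. (1) can account for the rejection of low offers, the sketch below applies it to the standard two-player Ultimatum Game. The rules are assumed here rather than taken from this section: a proposer offers a fraction of a pie of size 1, and the responder either accepts (both receive their shares) or rejects (both receive 0). The function names and parameter values are hypothetical.

```python
def accept_utility(offer: float, alpha: float, beta: float) -> float:
    """Responder's Eq. (1) utility (n = 2) when accepting `offer` from a pie of 1."""
    own, other = offer, 1.0 - offer
    envy = max(other - own, 0.0)   # proposer keeps more than the responder gets
    guilt = max(own - other, 0.0)  # responder gets more than the proposer keeps
    return own - alpha * envy - beta * guilt


def responder_accepts(offer: float, alpha: float, beta: float) -> bool:
    """Accept iff accepting is at least as good as rejecting (both get 0, utility 0)."""
    return accept_utility(offer, alpha, beta) >= 0.0


# A purely self-interested responder (alpha = beta = 0) accepts a low offer.
print(responder_accepts(0.1, alpha=0.0, beta=0.0))   # True
# An inequity-averse responder turns down the same offer.
print(responder_accepts(0.1, alpha=0.5, beta=0.25))  # False
```

Under these assumptions, a responder facing an offer of at most half the pie accepts only if the offer is at least alpha / (1 + 2 * alpha); for alpha = 0.5 this threshold is 0.25.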
Priority awareness. In [11], the relation between priorities and
fairness is studied. Experiments with human subjects show that
priorities matter strongly. For instance, priority mail is more expen-
sive than regular mail and should therefore be delivered sooner. To
examine the human response in such situations, an additional pa-
rameter is introduced in the two-player Ultimatum Game, denoting
the fact that one of the players is substantially more wealthy than
the other one – i.e., one player has a higher priority in receiving the
money at stake. It turns out that humans tend to give less money
to more wealthy opponents and accept less money from poor
opponents, and the other way around. This behavior is captured in a
descriptive model called priority awareness.
Reciprocal fairness. The most important limitation of the inequity-averse
and priority-aware models is that they do not explicitly explain how fair
behavior evolves with repeated interactions between agents [17]. For
instance, a group of people repeatedly playing the same game may start by
playing in an individually rational manner, but for some reason may end
up playing in a fair, cooperative manner. Reciprocal fairness models aim
to answer the questions of why and how this happens. The main idea is
that humans cooperate because of direct and indirect reciprocity: direct
means that a person is nice to someone else because he expects something
in return from this other person, and indirect means that a person is nice
to someone else because he expects to obtain something from a third
person. It turns out that the opposite, i.e., punishing someone who is
nasty, has an even greater effect on cooperation [41]. However, punishing
may be costly, and thus it would be individually rational to punish only
when we are sure to encounter the object of punishment again. Once again,
humans do not select the individually rational solution: even in one-shot
interactions, they consistently apply punishment if this is allowed. Since
this is clearly not of direct benefit to the punisher, this phenomenon is
referred to as altruistic punishment (see, e.g., [15, 16, 47]).
Interestingly, the question thus seems to shift from ‘why do people
cooperate?’ to ‘why do people perform costly punishment?’. Various
explanations have been analyzed from the perspective of evolutionary game
theory [18]. For instance, many researchers argue that altruistic
punishment only pays off when the reputation of the players somehow
becomes known to everyone [14, 25]. There are also alternative
explanations such as volunteering [20, 21], fair intentions [13] or the
topology of the network of interaction [38].
Although reciprocal fairness and priority awareness are interest-
ing descriptive models, our current work focuses on constructing a
computational model based on inequity aversion, since this model
can already explain many aspects of human behavior in bargaining
situations, our main topic of interest [9, 17].
3. COMPUTATIONAL FAIRNESS
In this section, we first discuss related work in computational
modeling of fairness. Then, we describe the games under study
and analyze their rational and fair solutions. Finally, we outline the
methodology for the design of our learning agents.
3.1 Related work
Here we discuss some contributions to prescriptive modeling of
human fairness. Many of these contributions were originally in-
tended to be descriptive, but were immediately verified in adaptive
agent systems and are thus also computational.
Cooperation in multi-agent games. Various researchers study
fairness using multi-agent games and claim that fairness (or, al-
ternatively, altruistic punishment) is achieved using internal agent
mechanisms such as reputation. To support this claim, the behav-
ior of agents driven by such systems is analyzed, mostly from the
perspective of evolutionary game theory [18]. In many papers, it
is shown that reputation can indeed increase cooperation [13, 14,
29, 32]. In addition to studies being performed on internal mecha-
nisms of agents, there are also studies focusing on external factors
that may lead to fairness. Most notably, researchers argue that hu-
mans do not interact on a random basis, as traditionally assumed by
population dynamics; instead, human interactions, like many other
natural phenomena, seem to be organized in scale-free or small-
world networks [38]. Moreover, humans are able to adjust their
social ties: in case they interact with a person they turn out not to
like, they may refuse to interact with this person again [37]. Indeed,
both ideas increase cooperation in adaptive multi-agent systems.