The Influence Limiter: Provably Manipulation-Resistant Recommender Systems

by Paul Resnick and Rahul Sami
Work (2007)

Abstract

An attacker can draw attention to items that don't deserve that attention by manipulating recommender systems. We describe an influence-limiting algorithm that can turn existing recommender systems into manipulation-resistant systems. Honest reporting is the optimal strategy for raters who wish to maximize their influence. If an attacker can create only a bounded number of shills, the attacker can mislead only a small amount. However, the system eventually makes full use of information from honest, informative raters. We describe both the influence limits and the information loss incurred due to those limits in terms of information-theoretic concepts of loss functions and entropies.

Available from portal.acm.org

Paul Resnick
University of Michigan
School of Information
presnick@umich.edu
Rahul Sami
University of Michigan
School of Information
rsami@umich.edu
Categories and Subject Descriptors
I.2.6 [Computing Methodologies]: Artificial Intelligence—
Learning
General Terms
Algorithms, Reliability
Keywords
Recommender systems, manipulation-resistance, shilling
1. INTRODUCTION
Content posted on the Internet is not of uniform quality, nor is it equally interesting to different audiences. Recommender systems guide people to items they are likely to like, based on their own and other people’s subjective reactions. We will refer to people’s opinions generically as ratings, whether users explicitly enter them in the form of ratings or tags, or whether the system infers them from implicit behavioral indicators such as purchases, read times, bookmarks, or links.
Authors and other parties often want to direct attention to particular items. Google, Yahoo!, and others channel this into a multi-billion dollar advertising marketplace. But to the extent that people rely on recommender systems of various kinds to guide their attention, there are also natural incentives for promoters to manipulate the recommendations. An attacker may rate strategically rather than honestly and may introduce multiple entities, sometimes called sybils, to rate on behalf of the attacker.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
RecSys’07, October 19–20, 2007, Minneapolis, Minnesota, USA.
Copyright 2007 ACM 978-1-59593-730-8/07/0010 ...$5.00.
We offer a manipulation-resistance algorithm, called the Influence Limiter, that can be overlaid on existing recommender algorithms. Consider the predictions about whether a particular target person will like various items. Each rater begins with a very low but non-zero reputation score. The current reputation limits the influence she¹ can have on the prediction for the next item. Eventually, the target person indicates whether he likes the item, and the raters who contributed to predicting whether the target would like it gain or lose reputation. The more that a rating implies a change in the prediction for the target, the greater the potential change in the rater’s reputation score. A rater whose rating simply goes along with the prediction as it already stands will have no impact and thus get no change in her reputation.
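The reputation mechanics described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm the paper specifies in Section 3: the function names, the 0.5 prior, the use of quadratic loss, the rule that caps influence at min(1, reputation), and the linear blending of predictions are all assumptions made for the example.

```python
def quadratic_loss(label: int, prediction: float) -> float:
    """Squared error between a 0/1 label and a probability-like prediction."""
    return (label - prediction) ** 2


def run_item(implied_predictions, reputations, label, prior=0.5):
    """Process one item for a single target (illustrative sketch only).

    implied_predictions: list of (rater_id, q) pairs in arrival order, where q
        is the prediction the underlying recommender would make after that
        rater's rating is taken into account.
    reputations: dict mapping rater_id to current reputation (updated in place).
    label: 1 if the target later reports liking the item, else 0.
    Returns the influence-limited prediction after all ratings.
    """
    limited = prior
    steps = []
    for rater, q in implied_predictions:
        beta = min(1.0, reputations[rater])       # assumed influence limit
        before = limited
        limited = (1 - beta) * before + beta * q  # move only part of the way to q
        steps.append((rater, beta, before, q))

    # Once the target's reaction is known, each rater gains or loses reputation
    # in proportion to how much her rating would have reduced the loss.
    for rater, beta, before, q in steps:
        gain = quadratic_loss(label, before) - quadratic_loss(label, q)
        reputations[rater] += beta * gain
    return limited
```

Under these assumptions, a rater with reputation 0.1 who pushes the prediction from 0.5 toward 0.9 moves it only to 0.54; if the target then likes the item, her reputation grows to 0.124. A rater whose implied prediction equals the current one leaves both the prediction and her reputation unchanged.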
The Influence Limiter has several desirable properties. First, in order to maximize the expected reputation score of a single rater endowed with some information about the target’s likely response to the items, the optimal strategy is to induce predictions that accurately reveal that rater’s information about the items. If the underlying recommendation algorithm is making optimal use of ratings, this implies that entering honest ratings is optimal. An important special case is that a rater who has not interacted with an item, and therefore has no information about the target’s likely response to it, can only lose reputation in expectation by giving a rating.
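The intuition behind this first property can be checked numerically. The sketch below assumes that a rater's reputation change is proportional to the reduction in quadratic loss she causes; that choice of loss, and the helper names `expected_loss` and `expected_gain`, are assumptions for illustration rather than definitions taken from the paper.

```python
def expected_loss(true_prob: float, prediction: float) -> float:
    """Expected quadratic loss when the target likes the item with true_prob."""
    return true_prob * (1 - prediction) ** 2 + (1 - true_prob) * prediction ** 2


def expected_gain(true_prob: float, before: float, after: float) -> float:
    """Expected loss reduction from moving the prediction from before to after."""
    return expected_loss(true_prob, before) - expected_loss(true_prob, after)


# An informed rater who believes the true probability is 0.7 does best by
# inducing exactly that prediction (search over a grid of candidate reports).
best_report = max(range(101), key=lambda k: expected_gain(0.7, 0.5, k / 100)) / 100

# An uninformed rater has no belief beyond the current prediction (0.3 here),
# so any move away from it has a negative expected gain.
uninformed_move = expected_gain(0.3, 0.3, 0.8)
```

With quadratic loss the expected gain of moving from the true probability p to any other value q works out to -(p - q)^2, which is why `uninformed_move` is negative and why the grid search settles on the rater's actual belief.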
Second, the actual reputation score of any rater is always positive and is bounded above by an information-theoretic measure of the actual improvement that rater has made in the predictions for the target. It is not possible to prevent all manipulations of the predictions for particular items: a rater who provides good information on all other items but strategically provides bad information on one item is indistinguishable from a rater who simply has an unusual opinion on the item in question. Our algorithm does, however, ensure that no rater can negatively impact the overall set of recommendations for a target by more than a tiny amount. Moreover, if the item being manipulated is unlikely to be of interest to the target, later raters may provide information that corrects the prediction on that item before the target is affected by the recommendation.
¹ We refer generically to raters as female and the target for predictions as male.
Finally, our algorithm limits the amount of damage that can be done with sybils. For example, if one rater provides bad information about an item in order to increase the reputation of another rater who later corrects that bad information, the expected sum of the reputations of the two raters does not increase. Thus, while it may be possible to transfer reputation among sybils, it is not possible to increase the total reputation of the raters that a person controls. We presume that the recommender system imposes some minimal cost (or inconvenience) on the creation of rater entities. Thus, there is some bound (say, 1,000) on the number of sybils one person can create without it being too costly and without being detected. The initial reputation of each rater is set low enough that the total reputation of this bounded number of raters is still relatively small.
To further motivate our manipulation-resistance algorithm, consider some approaches to manipulating conventional recommender systems. One threat is a cloning attack. For example, in a recommender system that asks each rater to report movie ratings on a 1-5 scale, the attacker simply reports the same ratings as some other rater, except for a single item to be manipulated. Most recommender systems do not take into account the order of ratings, and thus the attacker will have just as much effect on predictions for the last item as the rater who was copied. In a nearest-neighbor recommender algorithm, the attacker can even just clone the ratings of the target; the attacker will then be the nearest neighbor of the target. The cloning attack can be made more difficult by hiding the actual rating vectors of raters, but significant information about others’ ratings will leak out in the content of the recommendations, and sophisticated attackers will be able to create influential rating profiles through approximate cloning. Our approach thwarts the cloning attack by adding reputation only when a rater improves the prediction made for some target. Unless the rater moves the recommendation from where it was before she provided her information, the rater can neither gain nor lose reputation.
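The cloning attack on a nearest-neighbor recommender is easy to reproduce on toy data. The sketch below is a deliberately simplified single-nearest-neighbor predictor written for this illustration; the distance measure, the function names, and the movie ratings are all hypothetical and do not correspond to any particular production system.

```python
def mean_abs_diff(a: dict, b: dict) -> float:
    """Mean absolute rating difference over items both profiles have rated."""
    common = set(a) & set(b)
    if not common:
        return float("inf")
    return sum(abs(a[i] - b[i]) for i in common) / len(common)


def nearest_neighbor_prediction(target: dict, others: dict, item: str):
    """Predict the target's rating for item by copying the closest rater's rating."""
    candidates = {name: r for name, r in others.items() if item in r}
    best = min(candidates, key=lambda name: mean_abs_diff(target, candidates[name]))
    return best, candidates[best][item]


target = {"m1": 5, "m2": 1, "m3": 4}
others = {"honest": {"m1": 4, "m2": 2, "m3": 4, "m4": 1}}
before_attack = nearest_neighbor_prediction(target, others, "m4")

# The attacker copies the target's own ratings and adds a high rating
# for the item being promoted.
attacker = dict(target)
attacker["m4"] = 5
others["attacker"] = attacker
after_attack = nearest_neighbor_prediction(target, others, "m4")
```

Because the cloned profile is at distance zero from the target, it displaces the honest neighbor and the prediction for `m4` flips from 1 to 5. Nothing in this toy predictor depends on when the ratings arrived, which is the property the cloning attack exploits.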
A second threat to conventional recommender systems comes from random profile flooding. An attacker creates a large number of sybils that provide random reports except on the item or items to be manipulated. By chance, some sybils may appear to have provided useful information in their random reports. These sybils are used to impact the prediction for an item being manipulated. Our algorithm thwarts this attack by making the probability of gaining sufficient credibility through random reports very low, so low that an attacker gains less influence in expectation from its sybils making random guesses than from simply transferring the initial credibility of all its sybils to a main identity. Moreover, any sybil profile that does happen to gain a high credibility score has, by chance, moved the predictions for the target in a useful way, thus compensating for the lost utility from the subsequent manipulation.
The paper begins with an exploration of related work in Section 1.1. Section 2 presents a model of the recommending process. Section 3 presents our algorithm. Section 4 provides formal statements of its manipulation-resistance properties. Section 5 presents information-theoretic bounds on the information loss due to influence limits. Section 6 discusses limitations and possible extensions.
1.1 Related Work
The possibility of sybil attacks on recommendation systems has been noted by O’Mahony et al. [22] and Lam and Riedl [15], who use the term “shilling attack”. Through simulation, Lam and Riedl study versions of the cloning and random profile attacks on different recommender algorithms, and note that the effectiveness of the attack varies depending on the algorithm used. However, they do not address the development of a provably attack-resistant algorithm.
Several authors have suggested using statistical metrics on ratings to distinguish “attack” identities from “regular” identities, and eliminate the former [7, 23, 18]. Mobasher et al. [20] survey this literature and classify attack strategies. This approach is likely to lead to an arms race in which shillers employ increasingly sophisticated patterns of attack. To avoid this, our approach does not rely on identifying particular attack identities or specific attack strategies. O’Donovan and Smyth [21] suggest using accuracy information from multiple targets to judge credibility; it would be interesting to see if our scheme can be extended in this way.
Dellarocas [8] provides an algorithm that bounds the damage that attackers can do when they collectively provide less than half the ratings in the system and the honest ratings are normally distributed. Our approach succeeds much more generally, even in situations where only a tiny fraction of the ratings are honest, at the expense of greater information loss during the startup phase when raters are not yet credible.
Herlocker et al. [13] study a modification of a nearest-neighbor recommender algorithm that does not count a rater as a near neighbor until it has rated sufficiently many items in common with a target rater. This is similar in spirit to our influence-limiting approach, but we provide a limiting process that is grounded in information theory and provably resistant to manipulation.
We use proper scoring rules to elicit honest ratings. These were pioneered in the context of forecasting objective events like weather patterns [4]; Miller et al. [19] noted that scoring rules could be adapted to the recommendation setting by treating the target’s rating as an objective outcome. Hanson [11] developed the market scoring rule as a mechanism for information markets. In this mechanism, a trader is rewarded with the score difference between her prediction and the previous prediction. This relative scoring rewards the first provider of information, and forms an essential component of our approach. We develop additional machinery to handle strategies involving sybils and bankruptcy.
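The relative scoring idea behind the market scoring rule can be illustrated with a short sketch. It assumes the logarithmic scoring rule, one common proper scoring rule, and a binary outcome; the function names and the sequence of predictions are hypothetical and chosen only for this example.

```python
import math


def log_score(prediction: float, outcome: int) -> float:
    """Logarithmic score of a probability prediction for a binary outcome."""
    prob_of_outcome = prediction if outcome == 1 else 1.0 - prediction
    return math.log(prob_of_outcome)


def market_scoring_rewards(initial: float, predictions, outcome: int):
    """Reward each trader with her score minus the score of the prior prediction."""
    rewards = []
    previous = initial
    for p in predictions:
        rewards.append(log_score(p, outcome) - log_score(previous, outcome))
        previous = p
    return rewards


# Three traders move the prediction 0.5 -> 0.8 -> 0.8 -> 0.6; the outcome is 1.
rewards = market_scoring_rewards(0.5, [0.8, 0.8, 0.6], outcome=1)
total = sum(rewards)
```

The first trader, who brings the new information, earns a positive reward. The second, who repeats the standing prediction, earns exactly zero, and the third, who moves the prediction in the wrong direction, loses. The rewards telescope, so `total` equals the score of the final prediction minus the score of the initial one regardless of how many identities the movement is split across.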
Bhattacharjee and Goel [3] suggest sharing the revenue of a ranking system with the raters. Using techniques similar to market scoring rules to determine the revenue shares, they argue that an attack on the system would be costly. In contrast, we do not require any real money transactions, and prove bounds on the damage that sybils can do.
We use the error score change as a natural measure of performance of the system and damage to the system. Rashid et al. [25] propose several other algorithm-independent measures of rater influence; unlike error score change, their measures do not consider the dynamic order of ratings. Separately, Rashid et al. [24] use entropy to analyze a different problem: choosing which items to ask new users to rate.
The literature on bounded-regret online learning deals with combining predictions from multiple forecasters and proving worst-case bounds on the error relative to the best predictor that could be chosen in hindsight (see Cesa-Bianchi