Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model
- ISSN: 03617742
- ISBN: 9781605581934
- DOI: 10.1145/1401890.1401944
- PubMed: 2428061
Abstract
Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborative Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The two most successful approaches to CF are latent factor models, which directly profile both users and products, and neighborhood models, which analyze similarities between products or users. In this work we introduce some innovations to both approaches. The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model. Further accuracy improvements are achieved by extending the models to exploit both explicit and implicit feedback from the users. The methods are tested on the Netflix data. Results are better than those previously published on that dataset. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task.
Yehuda Koren
AT&T Labs Research
180 Park Ave, Florham Park, NJ 07932
yehuda@research.att.com
Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications - Data Mining
General Terms
Algorithms
Keywords
collaborative filtering, recommender systems
1. INTRODUCTION
Modern consumers are inundated with choices. Electronic retailers
and content providers offer a huge selection of products, with
unprecedented opportunities to meet a variety of special needs and
tastes. Matching consumers with the most appropriate products is not
trivial, yet it is key to enhancing user satisfaction and loyalty.
This emphasizes the prominence of recommender systems, which provide
personalized recommendations for products that suit a user’s taste
[1]. Internet leaders like Amazon, Google, Netflix, TiVo and Yahoo
are increasingly adopting such recommenders.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
KDD’08, August 24-27, 2008, Las Vegas, Nevada, USA.
Copyright 2008 ACM 978-1-60558-193-4/08/08 ...$5.00.
Recommender systems are often based on Collaborative Filtering
(CF) [10], which relies only on past user behavior (e.g., their
previous transactions or product ratings) and does not require the
creation of explicit profiles. Notably, CF techniques require no
domain knowledge and avoid the need for extensive data collection.
In addition, relying directly on user behavior allows uncovering
complex and unexpected patterns that would be difficult or
impossible to profile using known data attributes. As a consequence,
CF attracted much attention in the past decade, resulting in
significant progress and being adopted by some successful commercial
systems, including Amazon [15], TiVo and Netflix.
In order to establish recommendations, CF systems need to com-
pare fundamentally different objects: items against users. There are
two primary approaches to facilitate such a comparison, which con-
stitute the two main disciplines of CF: the neighborhood approach
and latent factor models.
Neighborhood methods are centered on computing the relation-
ships between items or, alternatively, between users. An item-
oriented approach evaluates the preference of a user to an item
based on ratings of similar items by the same user. In a sense,
these methods transform users to the item space by viewing them
as baskets of rated items. This way, we no longer need to compare
users to items, but rather directly relate items to items.
Latent factor models, such as Singular Value Decomposition (SVD),
comprise an alternative approach by transforming both items and
users to the same latent factor space, thus making them directly
comparable. The latent space tries to explain ratings by characteriz-
ing both products and users on factors automatically inferred from
user feedback. For example, when the products are movies, fac-
tors might measure obvious dimensions such as comedy vs. drama,
amount of action, or orientation to children; less well defined
dimensions such as depth of character development or quirkiness;
or completely uninterpretable dimensions.
The CF field has enjoyed a surge of interest since October 2006,
when the Netflix Prize competition [5] commenced. Netflix released
a dataset containing 100 million movie ratings and challenged the
research community to develop algorithms that could beat the
accuracy of its recommendation system, Cinematch. A lesson that we
learnt through this competition is that the neighborhood and latent
factor approaches address quite different levels of structure in
the data, so neither of them is optimal on its own [3].
Neighborhood models are most effective at detecting very localized
relationships. They rely on a few significant neighborhood
relations, often ignoring the vast majority of ratings by a user.
Consequently, these methods are unable to capture the totality of weak
signals encompassed in all of a user’s ratings. Latent factor models
are generally effective at estimating overall structure that relates si-
multaneously to most or all items. However, these models are poor
at detecting strong associations among a small set of closely related
items, precisely where neighborhood models do best.
In this work we suggest a combined model that improves prediction
accuracy by capitalizing on the advantages of both the neighborhood
and latent factor approaches. To the best of our knowledge, this is
the first time that a single model has integrated the two approaches.
In fact, some past works (e.g., [2, 4]) recognized the utility of
combining those approaches. However, they suggested post-processing
the factorization results, rather than a unified model where
neighborhood and factor information are considered symmetrically.
Another lesson learnt from the Netflix Prize competition is the
importance of integrating different forms of user input into the
models [3]. Recommender systems rely on different types of input.
Most convenient is the high quality explicit feedback, which
includes explicit input by users regarding their interest in products.
For example, Netflix collects star ratings for movies and TiVo users
indicate their preferences for TV shows by hitting thumbs-up/down
buttons. However, explicit feedback is not always available. Thus,
recommenders can infer user preferences from the more abundant
implicit feedback, which indirectly reflects opinion through
observing user behavior [16]. Types of implicit feedback include purchase
history, browsing history, search patterns, or even mouse move-
ments. For example, a user that purchased many books by the same
author probably likes that author. Our main focus is on cases where
explicit feedback is available. Nonetheless, we recognize the im-
portance of implicit feedback, which can illuminate users that did
not provide enough explicit feedback. Hence, our models integrate
explicit and implicit feedback.
The structure of the rest of the paper is as follows. We start
with preliminaries and related work in Sec. 2. Then, we describe
a new, more accurate neighborhood model in Sec. 3. The new
model is based on an optimization framework that allows smooth
integration with latent factor models, and also inclusion of implicit
user feedback. Section 4 revisits SVD-based latent factor models
while introducing useful extensions. These extensions include a
factor model that allows explaining the reasoning behind recom-
mendations. Such explainability is important for practical systems
[11, 23] and known to be problematic with latent factor models.
The methods introduced in Sec. 3-4 are linked together in Sec. 5,
through a model that integrates neighborhood and factor models
within a single framework. Relevant experimental results are
presented within each section. In addition, we suggest a new
methodology to evaluate the effectiveness of the models, as
described in Sec. 6, with encouraging results.
2. PRELIMINARIES
We reserve special indexing letters for distinguishing users from
items: for users $u, v$, and for items $i, j$. A rating $r_{ui}$ indicates
the preference by user $u$ for item $i$, where high values mean stronger
preference. For example, values can be integers ranging from 1
(star) indicating no interest to 5 (stars) indicating a strong interest.
We distinguish predicted ratings from known ones by using the
notation $\hat{r}_{ui}$ for the predicted value of $r_{ui}$. The $(u, i)$ pairs
for which $r_{ui}$ is known are stored in the set
$\mathcal{K} = \{(u, i) \mid r_{ui} \text{ is known}\}$.
Usually the vast majority of ratings are unknown. For example, in
the Netflix data 99% of the possible ratings are missing. In order
to combat overfitting the sparse rating data, models are regularized
so estimates are shrunk towards baseline defaults. Regularization
is controlled by constants which are denoted as $\lambda_1, \lambda_2, \ldots$
Exact values of these constants are determined by cross validation.
As they grow, regularization becomes heavier.
2.1 Baseline estimates
Typical CF data exhibit large user and item effects, i.e., systematic
tendencies for some users to give higher ratings than others,
and for some items to receive higher ratings than others. It is
customary to adjust the data by accounting for these effects, which we
encapsulate within the baseline estimates. Denote by $\mu$ the overall
average rating. A baseline estimate for an unknown rating $r_{ui}$ is
denoted by $b_{ui}$ and accounts for the user and item effects:
$$b_{ui} = \mu + b_u + b_i \tag{1}$$
The parameters $b_u$ and $b_i$ indicate the observed deviations of user
$u$ and item $i$, respectively, from the average. For example, suppose
that we want a baseline estimate for the rating of the movie Titanic
by user Joe. Now, say that the average rating over all movies, $\mu$, is
3.7 stars. Furthermore, Titanic is better than an average movie, so it
tends to be rated 0.5 stars above the average. On the other hand, Joe
is a critical user, who tends to rate 0.3 stars lower than the average.
Thus, the baseline estimate for Titanic’s rating by Joe would be 3.9
stars by calculating $3.7 - 0.3 + 0.5$. In order to estimate $b_u$ and $b_i$
one can solve the least squares problem:
$$\min_{b_*} \sum_{(u,i) \in \mathcal{K}} (r_{ui} - \mu - b_u - b_i)^2 + \lambda_1 \Big( \sum_u b_u^2 + \sum_i b_i^2 \Big)$$
Here, the first term $\sum_{(u,i) \in \mathcal{K}} (r_{ui} - \mu - b_u - b_i)^2$ strives to find
$b_u$’s and $b_i$’s that fit the given ratings. The regularizing term
$\lambda_1 \big( \sum_u b_u^2 + \sum_i b_i^2 \big)$ avoids overfitting by penalizing the
magnitudes of the parameters.
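
As an illustration only, the regularized least squares problem above can be minimized by alternating closed-form updates. With the item terms held fixed, setting the derivative with respect to $b_u$ to zero gives $b_u = \sum_i (r_{ui} - \mu - b_i) / (\lambda_1 + n_u)$, where $n_u$ is the number of ratings by user $u$, and symmetrically for $b_i$. The text does not prescribe a solver, so this alternating scheme is an assumption. The names `fit_baselines` and `baseline`, the triple-list input format, and the defaults for `lam1` and `n_sweeps` are likewise hypothetical choices.

```python
from collections import defaultdict


def fit_baselines(ratings, lam1=1.0, n_sweeps=20):
    """Estimate mu, b_u and b_i from known ratings.

    ratings: list of (user, item, rating) triples, i.e. the set K.
    lam1:    regularization constant (hypothetical default).
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)

    by_user = defaultdict(list)  # user -> [(item, rating), ...]
    by_item = defaultdict(list)  # item -> [(user, rating), ...]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))

    b_u = defaultdict(float)  # all deviations start at zero
    b_i = defaultdict(float)
    for _ in range(n_sweeps):
        # Exact minimizer over each b_u with the b_i held fixed.
        for u, rated in by_user.items():
            resid = sum(r - mu - b_i[i] for i, r in rated)
            b_u[u] = resid / (lam1 + len(rated))
        # Exact minimizer over each b_i with the b_u held fixed.
        for i, raters in by_item.items():
            resid = sum(r - mu - b_u[u] for u, r in raters)
            b_i[i] = resid / (lam1 + len(raters))
    return mu, dict(b_u), dict(b_i)


def baseline(mu, b_u, b_i, u, i):
    """Baseline estimate b_ui = mu + b_u + b_i; unseen users/items get 0."""
    return mu + b_u.get(u, 0.0) + b_i.get(i, 0.0)


# The Titanic example from the text: 3.7 - 0.3 + 0.5, about 3.9 stars.
joe_titanic = baseline(3.7, {"joe": -0.3}, {"titanic": 0.5}, "joe", "titanic")
```

Larger values of `lam1` pull every deviation towards zero, and the effect is strongest for users and items with few ratings, which matches the shrinkage role of the regularizing term.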
2.2 Neighborhood models
The most common approach to CF is based on neighborhood
models. Its original form, which was shared by virtually all earlier
CF systems, is user-oriented; see [12] for a good analysis. Such
user-oriented methods estimate unknown ratings based on recorded
ratings of like-minded users. Later, an analogous item-oriented
approach [15, 21] became popular. In those methods, a rating is
estimated using known ratings made by the same user on similar
items. Better scalability and improved accuracy make the item-
oriented approach more favorable in many cases [2, 21, 22]. In
addition, item-oriented methods are more amenable to explaining
the reasoning behind predictions. This is because users are famil-
iar with items previously preferred by them, but do not know those
allegedly like-minded users. Thus, our focus is on item-oriented
approaches, but parallel techniques can be developed in a user-
oriented fashion, by switching the roles of users and items.
Central to most item-oriented approaches is a similarity measure
between items. Frequently, it is based on the Pearson correlation
coefficient, $\rho_{ij}$, which measures the tendency of users to rate items
$i$ and $j$ similarly. Since many ratings are unknown, it is expected
that some items share only a handful of common raters. Computation
of the correlation coefficient is based only on the common user
support. Accordingly, similarities based on a greater user support
are more reliable. An appropriate similarity measure, denoted by
$s_{ij}$, would be a shrunk correlation coefficient:
$$s_{ij} \stackrel{\text{def}}{=} \frac{n_{ij}}{n_{ij} + \lambda_2} \rho_{ij} \tag{2}$$
The variable $n_{ij}$ denotes the number of users that rated both $i$ and
$j$. A typical value for $\lambda_2$ is 100. Notice that the literature suggests
additional alternatives for a similarity measure [21, 22].
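
A minimal sketch of the shrunk correlation coefficient follows. It assumes each item's ratings are held in a dict keyed by user, that the Pearson correlation is computed with means taken over the common raters only, and that an undefined correlation (fewer than two common raters, or zero variance) is reported as 0. None of these details are fixed by the text, and the function names are hypothetical.

```python
import math


def pearson_on_common(ratings_i, ratings_j):
    """Pearson correlation of two items over their common raters.

    ratings_i, ratings_j: dict mapping user -> rating for one item.
    Returns (rho_ij, n_ij), where n_ij is the number of common raters.
    """
    common = sorted(set(ratings_i) & set(ratings_j))
    n = len(common)
    if n < 2:
        return 0.0, n  # assumption: treat undefined correlation as 0
    xi = [ratings_i[u] for u in common]
    xj = [ratings_j[u] for u in common]
    mean_i = sum(xi) / n
    mean_j = sum(xj) / n
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(xi, xj))
    var_i = sum((a - mean_i) ** 2 for a in xi)
    var_j = sum((b - mean_j) ** 2 for b in xj)
    if var_i == 0 or var_j == 0:
        return 0.0, n  # assumption: constant ratings carry no signal
    return cov / math.sqrt(var_i * var_j), n


def shrunk_similarity(rho, n_ij, lam2=100.0):
    """Shrunk correlation s_ij = n_ij / (n_ij + lam2) * rho_ij."""
    return n_ij / (n_ij + lam2) * rho


# Same raw correlation, different support: more common raters, less shrinkage.
weak = shrunk_similarity(0.8, 10)      # 10 / 110 * 0.8
strong = shrunk_similarity(0.8, 1000)  # 1000 / 1100 * 0.8
```

With the typical value of 100 for `lam2`, a correlation backed by 100 common raters is halved, while one backed by only a handful of raters is shrunk almost to zero.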
Our goal is to predict $r_{ui}$, the unobserved rating by user $u$ for
item $i$. Using the similarity measure, we identify the $k$ items rated
by $u$ which are most similar to $i$. This set of $k$ neighbors is denoted
by $S^k(i;u)$. The predicted value of $r_{ui}$ is taken as a weighted
average of the ratings of neighboring items, while adjusting for user
and item effects through the baseline estimates:
$$\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in S^k(i;u)} s_{ij} (r_{uj} - b_{uj})}{\sum_{j \in S^k(i;u)} s_{ij}} \tag{3}$$
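
The prediction rule (3) can be sketched as below. The similarity and baseline functions are passed in as callables, and the name `predict_knn` and its signature are hypothetical. The sketch adds two conventions that the text does not state: only neighbors with positive similarity are kept, so the denominator stays positive, and the prediction falls back to the baseline $b_{ui}$ when no such neighbor exists.

```python
def predict_knn(u, i, user_ratings, sim, base, k=20):
    """Item-oriented neighborhood prediction following Eq. (3).

    user_ratings: dict mapping item j -> rating r_uj for user u.
    sim:          callable (i, j) -> similarity s_ij.
    base:         callable (u, j) -> baseline estimate b_uj.
    k:            neighborhood size (hypothetical default).
    """
    # Score every item rated by u against the target item i.
    scored = [(sim(i, j), j) for j in user_ratings if j != i]
    # Assumption: drop non-positive similarities before taking the top k.
    scored = [(s, j) for s, j in scored if s > 0]
    neighbors = sorted(scored, key=lambda t: t[0], reverse=True)[:k]

    denom = sum(s for s, _ in neighbors)
    if denom == 0:
        return base(u, i)  # assumption: no usable neighbors, use baseline
    num = sum(s * (user_ratings[j] - base(u, j)) for s, j in neighbors)
    return base(u, i) + num / denom


# Toy usage with made-up similarities and a flat baseline of 3.5 stars.
toy_sims = {("i", "j1"): 0.5, ("i", "j2"): 0.25, ("i", "j3"): -0.4}
toy_ratings = {"j1": 5.0, "j2": 3.0, "j3": 1.0}
prediction = predict_knn(
    "u", "i", toy_ratings,
    sim=lambda a, b: toy_sims[(a, b)],
    base=lambda user, item: 3.5,
    k=2,
)
```

Because the weights are normalized by their sum, the neighborhood term is an interpolation of the residuals $r_{uj} - b_{uj}$, so the prediction stays within the range spanned by the baseline-adjusted neighbor ratings.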