Piloted Search and Recommendation with Social Tag Cloud-Based Navigation
Cédric Mesnage
Faculty of Informatics
University of Lugano
cedric.mesnage@usi.ch
Mark Carman
Faculty of Informatics
University of Lugano
mark.carman@usi.ch
ABSTRACT
We investigate the generation of tag clouds using Bayesian
models and test the hypothesis that social network informa-
tion is better than overall popularity for ranking new and
relevant information. We propose three tag cloud genera-
tion models based on popularity, topics and social structure.
We conducted two user evaluations to compare the models
for search and recommendation of music with social network
data gathered from Last.fm. Our survey shows that
search with tag clouds is not practical whereas recommenda-
tion is promising. We report statistical results and compare
the performance of the models in generating tag clouds that
lead users to discover songs that they liked and were new to
them. We find statistically significant evidence at the 5%
significance level that the topic and social models outperform
the popular model.
1. INTRODUCTION
We investigate mechanisms to explore social network
information. Our current focus is to use contextual tag clouds
as a means to navigate through the data and control a
recommendation system.
Figure 1 shows the screen of the Web application we de-
veloped to evaluate our models. The goal is to find the
displayed track using the tag cloud. The tag cloud is gener-
ated according to a randomly selected model and the current
query. Participants in the evaluation can add terms to the
query by clicking on tags, which generates a new tag cloud
and changes the list of results. Once the track is found, the
user clicks on its title and goes to the next task.
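The query refinement loop described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the toy catalogue `TRACKS`, the conjunctive matching in `search`, and the co-occurrence count used to rank tags in `tag_cloud` are all assumptions made for the example (the count stands in for a popularity-style ranking).

```python
from collections import Counter

# Hypothetical toy catalogue: track title -> set of tags (not data from the paper).
TRACKS = {
    "Track A": {"rock", "indie", "guitar"},
    "Track B": {"rock", "classic"},
    "Track C": {"jazz", "piano"},
    "Track D": {"rock", "indie"},
}


def search(tracks, query):
    """Return titles of tracks carrying every tag in the query (assumed AND semantics)."""
    return [title for title, tags in tracks.items() if query <= tags]


def tag_cloud(tracks, query, size=5):
    """Rank the tags that co-occur with the query by how many matching tracks carry them."""
    counts = Counter()
    for tags in tracks.values():
        if query <= tags:
            # Tags already in the query are not offered again.
            counts.update(tags - query)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:size]]


# Each click adds a tag to the query, which narrows the results and regenerates the cloud.
query = set()
for clicked in ("rock", "indie"):
    query.add(clicked)
    results = search(TRACKS, query)
    cloud = tag_cloud(TRACKS, query)
```

Each click shrinks the result list and yields a cloud built only from the remaining tracks, which is the interaction pattern the search task relies on.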
Figure 2 shows the principle of our controlled recommendation
experiment. The participant sees a tag cloud; by
clicking a tag, she is recommended a song. Once the
song is rated, a new tag cloud is generated according to the
previously selected tags.
This paper is structured as follows. We first discuss
related work in the area of tag cloud-based navigation. We
then detail models for generating context-aware tag clouds
using both social network and topic modeling-based
approaches, which we have implemented in our prototype tag
cloud-based navigation system. We then describe the data
we have collected from the Last.fm online music social
network, and the evaluation, consisting of a pilot user study, a
user survey and a follow-up study.
Figure 1: Searching task.
WOMRAD 2010 Workshop on Music Recommendation and Discovery,
colocated with ACM RecSys 2010 (Barcelona, SPAIN)
Copyright ©. This is an open-access article distributed under the terms
of the Creative Commons Attribution License 3.0 Unported, which permits
unrestricted use, distribution, and reproduction in any medium, provided
the original author and source are credited.
2. RELATED WORK
2.1 Social Tagging and its Motivations
Research in social tagging is relatively recent with the first
tagging applications appearing in the late nineties [12]. The
system called Webtagger relied on a proxy to enable users
to share bookmarks and assign tags to them. The approach
was novel compared to storing bookmarks in the browser’s
folder in the sense that bookmarks were shared and belonged
to multiple categories (instead of being placed in a single
folder). The creators argued that hierarchical browsing was
tedious and frustrating when information is nested several
layers deep.
By 2004, social tagging was becoming increasingly popular,
initially on bookmarking
sites like Delicious and later on social media sharing
sites such as Flickr and YouTube. Research in social tagging
started with Hammond [7], who gave an overview of social
bookmarking tools, and was continued by Golder et al. [5],
who provided the first analysis of tagging as a process using
tag data from Delicious. They showed that tag data fol-
lows a power law distribution, gave a taxonomy of tagging
incentives, and looked at the convergence of tag descrip-
tions over time for resources on Delicious. The paper led
to the first workshop on tagging [21], where papers mainly
discussed tagging incentives, tagging applications (in museums
and enterprises), tag recommendation and knowledge
extraction. Following this workshop, research in tagging has
spread to various established areas, namely Web
search, social dynamics, the Semantic Web, information
retrieval, human-computer interaction and data mining.
Figure 2: Controlled recommendation task.
Sen et al. [19] examine factors that influence the way peo-
ple choose tags and the degree to which community mem-
bers share a vocabulary. The three factors they focus on are
personal tendency, community influence and the tag selec-
tion algorithm (used to recommend tags). Their study fo-
cuses on the MovieLens system that consists of user reviews
for movies. They categorize tags into three categories: fac-
tual, subjective and personal. They then divided users of
the system into four groups each with a different user inter-
face: the unshared group didn’t see any community tags; the
shared group saw random tags from their group; the popular
group saw the most popular tags; and the recommendation
group used a recommendation algorithm (that selected tags
most commonly applied to the target movie and to simi-
lar movies). They find that habit and investment influence
the users’ tag applications, while the community influences
a user’s personal vocabulary. The shared group produced
more subjective tags, while the popular and recommendation
groups produced more factual tags. The authors also
conducted a user survey in which they asked users whether
they thought tagging was useful for different tasks: self-
expression (50%), organizing (44%), learning (23%), finding
(27%), and decision support (21%).
Marlow et al. [14, 15] define a taxonomy of design aspects
of tagging systems that influence the content and useful-
ness of tags, namely tagging rights (who can tag), tagging
support (suggestion algorithms), aggregation model (bag or
set), resource type (web pages, images, etc.), source of con-
tent (participants, Web, etc.), resource connectivity (linked
or not), and social connectivity (linked or not). They also
propose aspects of user incentives expressing the different
motivations for tagging: future retrieval, contribution and
sharing, attracting attention, playing and competition,
self-presentation, and opinion expression.
Cattuto et al. [2, 1] perform an empirical study of tag
data from Delicious and find that the distribution of tags
over time follows a power law distribution. More specifically,
they find that the frequency of tags obeys Zipf’s law,
which is characteristic of self-organized communication systems
and is commonly observed in natural language data.
They reproduced the phenomenon using a stochastic model,
leading to a model of user behavior in collaborative tagging
systems.
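As a rough illustration of what obeying Zipf’s law means, the sketch below ranks tags by frequency and fits a straight line to log-frequency against log-rank; an exponent close to 1 indicates a Zipf-like distribution. The function name `zipf_exponent` and the toy data are assumptions made for this example, and a least-squares fit is only a simple diagnostic, not the analysis performed by Cattuto et al.

```python
import math
from collections import Counter


def zipf_exponent(tag_assignments):
    """Estimate alpha in f(r) ~ r^(-alpha) by least squares on the log-log rank-frequency plot."""
    counts = Counter(tag_assignments)
    if len(counts) < 2:
        raise ValueError("need at least two distinct tags to fit a slope")
    # Frequencies sorted from the most used tag (rank 1) to the least used.
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying distribution; report its magnitude.
    return -sxy / sxx


# Hypothetical toy data in which the tag of rank r is used exactly 720 / r times.
toy_assignments = []
for rank in range(1, 7):
    toy_assignments.extend(["tag%d" % rank] * (720 // rank))

alpha = zipf_exponent(toy_assignments)
```

On the toy data the frequencies are exactly inversely proportional to rank, so the estimated exponent is 1 up to floating-point error; real tag data would only follow the line approximately.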
2.2 Browsing with Tags
Fokker et al. [4] present a tool to navigate Wikipedia us-
ing tag clouds. Their approach enables the user to select
different views on the tag cloud, such as recent tags, popular
tags, personal tags or friends’ tags. They display related
tags when the user “mouses over” a tag in the cloud. They do
not, however, generate new contextually relevant tag clouds
when the user clicks on a tag.
In [16], Millen et al. investigate browsing behavior in
their Dogear social bookmarking application. The application
allows users to browse other people’s bookmark collections
by clicking on their username. They find that most
browsing activity on the web site is done by exploring
people’s bookmarks and then tags. They compare the
10 most browsed tags with the 10 most applied tags and
find a strong correlation. While their findings
do not show that tagging improves social navigation in
general, they do show that browsing tags helps users to
navigate the bookmark collections of others. Following on from
this, Ishikawa et al. [10] studied navigation efficiency
when browsing other users’ bookmarks. The idea is to
decide which user to browse first in order to discover
the desired information faster. While relevant to tag-based
navigation, this study does not deal with the problem of how
best to rank tags in order to improve cloud-based navigation
in general.
In [13], Li et al. propose various algorithms to browse
social annotations in a more efficient way. They extract
hierarchies from clusters and propose to browse social anno-
tations in a hierarchical manner. They also propose a way
to browse tags based on time. However, as discussed by Keller et al.
[12], a single taxonomy is not necessarily the best way to
navigate a corpus.
A more comprehensive study was performed by Sinclair
et al. [20] to examine the usefulness of tag clouds for
information seeking. They asked participants to perform
information-seeking tasks on a folksonomy-like dataset, providing
them with an interface consisting of a tag cloud and a search
box. The folksonomy was created by the same participants,
who were asked to tag ten articles at the beginning of the
study, leading to a small-scale folksonomy. The tag cloud
displayed 70 terms in alphabetical order, with the font
size of each term growing with the log of its frequency. The authors
give the following equation for the font size:
\[
\mathrm{TagSize} = 1 + C \, \frac{\log(f_i - f_{\min} + 1)}{\log(f_{\max} - f_{\min} + 1)} \tag{1}
\]
where $C$ corresponds to the maximum font size desired, $f_i$ to
the frequency of the tag to be displayed, and $f_{\min}$ and $f_{\max}$
to the minimum and maximum frequencies of the displayed
tags. Clicking on a tag in the cloud brings the user to a
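Equation (1) can be sketched in a few lines of Python. This is an illustrative reading of the formula rather than the code used by Sinclair et al.: the function names, the toy frequencies, and the fallback for the case where all displayed tags have the same frequency (where the ratio would be 0/0) are assumptions.

```python
import math


def tag_size(freq, f_min, f_max, max_font):
    """Font size from Equation (1): 1 + C * log(f_i - f_min + 1) / log(f_max - f_min + 1)."""
    if f_max == f_min:
        # Assumption: with a single distinct frequency the ratio is undefined,
        # so every tag falls back to the smallest size.
        return 1.0
    return 1.0 + max_font * math.log(freq - f_min + 1) / math.log(f_max - f_min + 1)


def cloud_sizes(tag_freqs, max_font=10):
    """Return (tag, size) pairs in alphabetical order, as in the cloud described above."""
    f_min = min(tag_freqs.values())
    f_max = max(tag_freqs.values())
    return [
        (tag, tag_size(freq, f_min, f_max, max_font))
        for tag, freq in sorted(tag_freqs.items())
    ]


# Hypothetical frequencies for three displayed tags.
sizes = cloud_sizes({"rock": 100, "jazz": 5, "indie": 40}, max_font=10)
```

Under this formula the least frequent displayed tag always gets size 1 and the most frequent gets size 1 + C, with the logarithm compressing the differences between very frequent tags.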



