Towards automatic extraction of event and place semantics from flickr tags

by Tye Rattenbury, Nathaniel Good, Mor Naaman
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval SIGIR 07 (2007)

Abstract

We describe an approach for extracting semantics of tags, unstructured text-labels assigned to resources on the Web, based on each tag's usage patterns. In particular, we focus on the problem of extracting place and event semantics for tags that are assigned to photos on Flickr, a popular photo sharing website that supports time and location (latitude/longitude) metadata. We analyze two methods inspired by well-known burst-analysis techniques and one novel method: Scale-structure Identification. We evaluate the methods on a subset of Flickr data, and show that our Scale-structure Identification method outperforms the existing techniques. The approach and methods described in this work can be used in other domains such as geo-annotated web pages, where text terms can be extracted and associated with usage patterns.

Tye Rattenbury∗, Nathaniel Good† and Mor Naaman
Yahoo! Research Berkeley
Berkeley, CA, USA
(tye, ngood, mor)@yahoo-inc.com
Categories and Subject Descriptors: H.1.m [MODELS AND PRINCIPLES]: Miscellaneous
General Terms: Algorithms, Measurement
Keywords: tagging systems, event identification, place identification, tag semantics, word semantics
1. INTRODUCTION
User-supplied “tags”, textual labels assigned to content, have been a powerful and useful feature in many social media and Web applications (e.g. Flickr, del.icio.us, Technorati). Tags usually manifest in the form of a freely-chosen, short list of keywords associated by a user with a resource such as a photo, web page, or blog entry. Unlike category- or ontology-based systems, tags result in unstructured knowledge – they have no a-priori semantics. However, it is precisely the unstructured nature of tags that enables their utility. For example, tags are probably easier to enter than picking categories from an ontology; tags allow for greater flexibility and variation; and tags may naturally evolve to reflect emergent properties of the data.
∗Also affiliated with UC Berkeley, Computer Science Dept.
†Also affiliated with UC Berkeley School of Information.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGIR’07, July 23–27, 2007, Amsterdam, The Netherlands.
Copyright 2007 ACM 978-1-59593-597-7/07/0007 ...$5.00.
The information challenge facing tagging systems is to extract structured knowledge from the unstructured set of tags. Despite the lack of ontology and semantics, patterns and trends emerge that could allow some structured information to be extracted from tag-based systems [11, 17, 23]. While complete semantic understanding of tags associated with individual resources is unlikely, the ability to assign some structure to tags and tag-based data will make tagging systems more useful.
Broadly, we are interested in the problem of identifying patterns in the distribution of tags over some domain; in this work we focus on spatial and temporal patterns. Specifically, we are looking at tags on Flickr [10], a popular photo-sharing web site that supports user-contributed tags and geo-referenced (or, geotagged) photos. Based on the temporal and spatial distributions of each tag’s usage, we attempt to automatically determine whether a tag corresponds to a “place” and/or “event” (see Section 3 for definitions). For example, the tag Bay Bridge¹ should be identified as a place, and SIGIR2007 should be identified as an event. Tag usage distributions are derived from associated photos’ metadata. While the correctness of the time and location metadata for each individual photo is suspect [5], in large numbers, trends and patterns can be reliably extracted and used [9, 14].
Extraction of event and place semantics can assist many different applications in the photo retrieval domain and beyond, including:
• improved image search through inferred query semantics;
• automated creation of place and event gazetteer data that can be used, for example, to improve web search by identifying relevant spatial regions and time spans for particular keywords;
• generation of photo collection visualizations by location and/or event/time;
• support for tag suggestions for photos (or other resources) based on location and time of capture;
• automated association of missing location/time metadata to photos, or other resources, based on tags or caption text.
In this work we do not apply our analysis to a specific application, but rather investigate the feasibility of automatically determining event/place semantics for Flickr tags.
¹We use this format to represent tags in the text.
SIGIR 2007 Proceedings Session 5: Image Retrieval
This paper represents, to our knowledge, the first attempt to extract place and event semantics for tags. Accordingly, we are exploring a number of possible methods. We introduce a new method tailored to event and place identification, Scale-structure Identification, and demonstrate how this method outperforms methods borrowed from other domains.
Furthermore, we note that our general approach to semantics extraction, and the methods we present as instantiations of this approach, can be applied to any information sources with temporal and spatial encodings from which we can extract textual terms – like GeoRSS blog data and geo-annotated web pages or Wikipedia articles. Additionally, the general approach of analyzing a distribution of occurrences over a domain (in our case space and time) to infer semantics could be extended to other metadata domains like color (hue/saturation), visual features, audio features, and text/semantic features.
To summarize, the contributions of this paper are:
• a generalizable approach for extracting tag semantics based on the distribution of individual tags;
• the modification, application, and analysis of existing methods to the problem of event and place identification for tag data;
• Scale-structure Identification – a new method for extracting patterns from usage data;
• a practical application of these methods to extract event and place semantics from tags associated with geotagged images on Flickr.
We formally define our problem in Section 3. Then we describe the methods (Section 4) and report on our evaluation (Section 5). We begin by reviewing the related work.
2. RELATED WORK
We address related work from a number of relevant research areas, including: event detection in time-stamped data such as web queries and personal photo collections; location-based analysis of spatially distributed data such as GPS positions, demographics information, or even information on the web; and analysis of tagging systems.
Many scientific domains have studied the general problem of time-based event detection. Time Series analysis techniques such as ARIMA [4, 18] analyze trends in time series data with the goals of (1) explaining spikes and valleys over various time windows and (2) producing future trend forecasts. In particular, our Naïve Scan methods (see Section 4) are similar to previous work on global event detection in web query logs [25] and access logs [13] where events are semantically defined as “bursts” (cf. [15]).
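As a rough illustration of the burst notion used in this line of work, the sketch below flags time bins whose usage count exceeds the mean count by a chosen number of standard deviations. It is not the Naïve Scan method of Section 4, nor the technique of [13, 15, 25]; the function name `find_bursts`, the one-day bin width, and the threshold `k` are assumptions made only for this example.

```python
from collections import Counter
from statistics import mean, pstdev


def find_bursts(timestamps, bin_seconds=86400, k=2.0):
    """Return the sorted bin indices whose counts exceed mean + k * stdev.

    timestamps: iterable of POSIX times (seconds) at which a term was used.
    bin_seconds and k are illustrative parameters, not values from the paper.
    """
    counts = Counter(int(t // bin_seconds) for t in timestamps)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    # Include empty bins so that quiet periods lower the baseline.
    series = [counts.get(b, 0) for b in range(first, last + 1)]
    threshold = mean(series) + k * pstdev(series)
    return [b for b in range(first, last + 1) if counts.get(b, 0) > threshold]
```

A term used at a steady rate yields no bursts under this rule, because no bin rises above the mean; a single day with many more uses than the rest is returned as a burst bin.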
More germane to this paper is the problem of event identification in personal photo collections [12, 20, 24]. A key characteristic of the personal photo collection domain is the general assumption of “a single camera”, which reduces event identification to a problem of temporal segmentation. Events are considered to be a single segment of time over which a single activity was taking place, providing a coherent, unifying context. Prior work on this problem has applied a number of techniques: some rely primarily on time [12], others use both locations and times [20, 21], and another looks at the text annotation associated with photos [24]. This type of event identification is different from ours since (1) we consider multi-person collections of photos and (2) we are interested in whether tags describe events, not whether a segment of time refers to a specific event for a specific person.
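To make the single-camera case concrete, temporal segmentation in its simplest form can be sketched as splitting a stream of capture times wherever the gap between consecutive photos exceeds a threshold. This is a minimal sketch under stated assumptions, not the technique of [12, 20, 24]; the function name `segment_by_gap` and the one-hour default for `max_gap` are hypothetical.

```python
def segment_by_gap(times, max_gap=3600):
    """Split capture times (in seconds) into candidate event segments.

    A new segment starts whenever the gap to the previous photo exceeds
    max_gap. The one-hour default is an assumed value for illustration.
    """
    segments = []
    for t in sorted(times):
        if segments and t - segments[-1][-1] <= max_gap:
            segments[-1].append(t)   # close enough: same segment
        else:
            segments.append([t])     # large gap (or first photo): new segment
    return segments
```

With several photographers contributing to one collection, as on Flickr, capture times from unrelated activities interleave, which is why this kind of per-collection segmentation does not carry over directly to the multi-person setting considered here.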
Related to event identification is the extraction of meaningful information from location-based data. Recent efforts in ubiquitous computing systems identify meaningful locations and places for GPS and other location tracking technologies [1]. In epidemiology, efforts to identify and localize disease outbreaks [16] are closely related to the place identification problem we address in this paper. Specifically, we borrow some techniques from the disease/outbreak analysis, where data is sparse and dependent on the underlying population statistics, as these two properties are echoed in our data for each tag.
More semantically-rich location analysis problems have been studied in the domain of web-based information retrieval. Specifically, the field of “GeoIR” has had two thrusts relevant to this paper. First, attempts were made (e.g. [2, 6, 8]) at extracting geographic information for a web page, based on the page links and network properties, as well as geographic terms that appear on the page. Our system described here could potentially help these systems by identifying additional geographic terms and defining their spatial scope. The second related research effort in GeoIR focused on extracting the scope of geographic terms or entities based on co-occurring text and derived latitude-longitude information [3, 22]. With geo-annotated photos and tags, as well as any system with direct location annotation, the potential exists not only to delineate known geographic terms, but also to identify new regions of interest based on the data.
Tagging systems in general have been of increasing research interest. Most of the prior research has looked at describing tagging systems [17], or studying trends and properties of various systems [11]. Some efforts have looked at extracting ontologies (or, structured knowledge) from tags [23] – a similar goal to ours, yet using co-occurrence and other text-based tools that could augment the methods analyzed in this paper.
More directly related to this paper are research efforts that analyzed Flickr tags (and other terms associated with Flickr photos) together with photo location and time metadata [9, 14]. These projects applied ad-hoc approaches to determine “important” tags within a given region of time [9] or space [14] based on inter-tag frequencies. However, no determination of the properties or semantics of specific tags was provided. Naaman et al. created spatial models for terms appearing in geo-referenced photograph labels [19], but did not detect the location properties of specific terms.
3. PROBLEM DEFINITION
In this section, we provide a formal definition of our data and research problem. Our dataset includes two basic elements: photos and tags. Each geotagged photo has, in addition to other metadata, an associated location and time. The location, ℓ_p, (consisting of latitude-longitude coordinates) associated with photo p generally marks where the photo was taken, but sometimes marks the location of the photographed object. The time, t_p, associated with photo p generally marks the photo capture time, but occasionally refers to the time the photo was uploaded to Flickr. Both location and time are recorded at high resolution (microseconds of degrees for location, seconds for time).
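The data described above can be pictured with a small sketch: a record per photo holding its location, time, and tags, and a helper that collects, for each tag, the points in space and time at which it was used. The `Photo` class, its field names, and `usage_by_tag` are hypothetical names introduced for illustration; they are not part of the paper or of any Flickr API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    """Hypothetical record for one geotagged photo p."""
    photo_id: str
    lat: float        # latitude component of the location l_p, in degrees
    lon: float        # longitude component of the location l_p, in degrees
    time: int         # time t_p, in seconds since the epoch
    tags: frozenset   # tags assigned to the photo


def usage_by_tag(photos):
    """Map each tag to the list of (lat, lon, time) points where it was used."""
    usage = defaultdict(list)
    for p in photos:
        for tag in p.tags:
            usage[tag].append((p.lat, p.lon, p.time))
    return dict(usage)
```

Each list returned by `usage_by_tag` is one tag's usage distribution over space and time, which is the kind of input a place or event identification method would examine.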