Connecting the dots: music metadata generation, schemas and applications
- ISBN: 0615248497
Abstract
With the ever-increasing amount of digitized music becoming available, metadata is a key driver for different music-related application domains. A service that combines different metadata sources should be aware of the existence of different schemas to store and exchange music metadata. The user of a metadata provider could benefit from knowledge about the metadata needs of different music application domains. In this paper, we present how we can compare the expressiveness and richness of a metadata schema for an application. To cope with different levels of granularity in metadata fields, we defined clusters of semantically related metadata fields. Similarly, application domains were defined to tackle the fine-grained functionality space in music applications. Next, we show to what extent music application domains and metadata schemas make use of the metadata field clusters. Finally, we link the metadata schemas with the application domains. A decision table is presented that assists the user of a metadata provider in choosing the right metadata schema for their application.
Nik Corthaut, Sten Govaerts, Katrien Verbert, Erik Duval
Katholieke Universiteit Leuven, Dept. Computer Science
{nik.corthaut, sten.govaerts, katrien.verbert, erik.duval}@cs.kuleuven.be
1. INTRODUCTION
Metadata plays an important role in MIR research. In
2007, we proposed a semi-automatic approach for the
generation of music metadata [2] in the rockanango
project [1]. In this approach, we bundled the power of
computational techniques, the wisdom of the crowds and
the experience of the music experts of Aristo Music
(www.aristomusic.com), the company with whom we
collaborate. This was needed because of the increasing
volume of music that needs annotation within a reasonable
amount of time and cost.
A diversity of available metadata web services can be
used to facilitate the annotation process. Amazon.com has
a substantial web API to retrieve metadata about their
products. Available music metadata for CDs includes
reviews from editors and customers, track listings, release
dates, genres, labels, popularity and similar items. Last.fm
(http://last.fm), the well-known social recommendation
music service, provides a web service to retrieve user,
artist, group and tag data. The All Music Guide is also a
very rich source of music metadata. MusicBrainz
(http://www.musicbrainz.org) is a community based music
metadata database used for CD recognition. It contains
artists, tracks, labels and releases. Discogs
(http://discogs.com) offers a neat web service with good
pointers to different production houses and publishers.
Another kind of web service is available that, given an
audio file, returns the results of different signal processing
algorithms. Echo Nest (http://analyze.echonest.com/)
recently released a web service that returns a set of
musical features, like timbre, pitch, and rhythm. It is
aiming at different applications, such as visualizations,
games and DJ-software. Furthermore, Echo Nest has been
developing software that ‘listens’ to music and tries to
predict which songs will become hits. The MIR Group of
the Vienna University of Technology
(http://www.ifs.tuwien.ac.at/mir/webservice/) also made a
web service available that returns a set of musical features
for a given song (e.g. rhythm patterns, statistical spectrum
descriptors and rhythm histograms) and allows the
training of self-organizing music maps. In summary,
these systems provide diverse and useful metadata,
although with a substantial amount of overlap.
Figure 1. Internals of the metadata framework.
We want to build a framework that makes use of the
strength of the different systems (see Figure 1). The input
is a musical object (MO) that consists of an audio file (the
signal) and possibly some initial metadata (MD). When it
enters the system, a federated request for metadata is sent
to the available generators. Generators can use available
a priori metadata to enhance predictions. To cope with
potentially conflicting results, [3] suggests different
conflict resolution approaches. The resulting metadata is
stored in an internal format. The generated metadata can
cover the full range from low-level features, over factual
data, e.g. the arrangement of the band, to affective
metadata, e.g. the mood, similar songs or popularity.
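The flow described above (federated request, then conflict resolution, then storage) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `MusicalObject` type, the three toy generators and their return values are hypothetical, and majority voting is only one simple stand-in for the conflict resolution approaches suggested in [3].

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types; the framework itself does not prescribe these names.
Metadata = Dict[str, str]


@dataclass
class MusicalObject:
    signal: bytes                                      # the audio file content
    metadata: Metadata = field(default_factory=dict)   # optional a priori metadata


# A generator maps a musical object to the metadata fields it can predict.
Generator = Callable[[MusicalObject], Metadata]


def federated_request(mo: MusicalObject,
                      generators: List[Generator]) -> Dict[str, List[str]]:
    """Ask every generator and collect all candidate values per field."""
    candidates: Dict[str, List[str]] = {}
    for generate in generators:
        for name, value in generate(mo).items():
            candidates.setdefault(name, []).append(value)
    return candidates


def resolve_by_majority(candidates: Dict[str, List[str]]) -> Metadata:
    """Keep the most frequent value per field; ties go to the first value seen."""
    return {name: Counter(values).most_common(1)[0][0]
            for name, values in candidates.items()}


# Toy generators with invented output, standing in for real web services.
def tag_reader(mo: MusicalObject) -> Metadata:
    # Uses a priori metadata when it is available.
    return {"genre": "rock", "artist": mo.metadata.get("artist", "unknown")}


def crowd_tags(mo: MusicalObject) -> Metadata:
    return {"genre": "rock", "mood": "happy"}


def signal_analysis(mo: MusicalObject) -> Metadata:
    return {"genre": "pop", "tempo_bpm": "120"}


mo = MusicalObject(signal=b"", metadata={"artist": "Example Artist"})
candidates = federated_request(mo, [tag_reader, crowd_tags, signal_analysis])
internal = resolve_by_majority(candidates)  # metadata in the internal format
```

In this sketch the conflicting genre predictions are settled by a two-to-one vote, while fields reported by a single generator pass through unchanged. A real resolver could weight generators by reliability instead.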
By offering the choice between different existing music
metadata formats (OF), we enable the reuse of existing
tools and parsers to handle the generated metadata. To be
able to store all the generated metadata we are looking for
a suitable metadata schema for internal use in the
metadata framework. In this paper we will select a set of
metadata standards relevant for the task, investigate in
which application domains they are useful and evaluate
their descriptive richness.
The description of large music works, e.g. the complete
opus of a composer, is beyond the scope of this paper.
Likewise, metadata schemas for metadata exchange (e.g.
METS [18]) or rights (e.g. ODRL [19]) are out of scope.
We focus on descriptive metadata schemas.
2. COMPARISON OF METADATA SCHEMAS
The goal of collecting metadata is to enable functionalities
in an application. To understand the usefulness of a
metadata schema in different cases, we will investigate
which metadata is involved in different use cases. This
will be the foundation for defining music application
domains and selecting a number of relevant metadata
schemas. For the sake of easier comparison, we introduce
a level of abstraction to the metadata fields by means of
field clustering. Three elements are introduced in this
section: the selected metadata standards, the application
domains and the categories of metadata fields. Comparing
these three elements then answers the question of which
metadata schemas to use when building an application that
offers a certain functionality.
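One way to picture the comparison is a small sketch in which schemas and application domains are both described by the field clusters they use. The cluster, schema and domain names below are invented placeholders, not results from this paper, and the overlap ratio is an assumed scoring rule chosen only for illustration.

```python
from typing import Dict, Set

# Placeholder data: which field clusters each schema supports and which
# clusters each application domain needs. None of this is measured data.
SCHEMA_CLUSTERS: Dict[str, Set[str]] = {
    "schema_a": {"cluster_1", "cluster_2"},
    "schema_b": {"cluster_1", "cluster_2", "cluster_3"},
}
DOMAIN_CLUSTERS: Dict[str, Set[str]] = {
    "domain_x": {"cluster_1", "cluster_2"},
    "domain_y": {"cluster_2", "cluster_3"},
}


def coverage(schema_clusters: Set[str], domain_clusters: Set[str]) -> float:
    """Fraction of the clusters a domain needs that the schema supports."""
    if not domain_clusters:
        return 1.0  # a domain with no needs is trivially covered
    return len(schema_clusters & domain_clusters) / len(domain_clusters)


def decision_table(schemas: Dict[str, Set[str]],
                   domains: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """For every domain, score every schema by its cluster coverage."""
    return {
        domain: {schema: coverage(s_clusters, d_clusters)
                 for schema, s_clusters in schemas.items()}
        for domain, d_clusters in domains.items()
    }


table = decision_table(SCHEMA_CLUSTERS, DOMAIN_CLUSTERS)
```

Reading a row of `table` then shows which placeholder schema covers a given domain best. Working at cluster level rather than field level keeps schemas with different field granularity comparable.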
2.1. Application domains
Music metadata can arise anywhere in the production cycle
of music, whether it be copyright information about a
drum loop during composition or the number of searches for
a musical piece at an online store. With regard to the use
of metadata schemas, we limit our application domains to
software.
Clustering software applications into application
domains allows easier comparison between different
applications. The actual clustering is based on the actions
people perform when handling music [4] [14]. The eight
clusters cover the production cycle from conception
(composition) over consumption to transactions.
• Music library/encyclopedia: software systems for
physical libraries, encyclopediae or companies that
license music, describing factual knowledge for
large collections of music, e.g. AllMusic Guide.
• Personal collection management: software to
organize your music collection, e.g. iTunes, Winamp
media library, Collectorz.com Music Collector.
• Commerce and transactions: applications involved
in the act of shopping for music, this includes
presenting songs, searching and trading, e.g. iTunes
Music Store, Amazon, Magnatune.
• Music editing/production: the tools deployed in the
creation and adaptation of music, e.g. Logic Pro.
• Music playback: applications that render music files
to their audible form, e.g. your favorite music player.
• Music recommendation: services for discovery of
new and similar music, e.g. Pandora, Last.fm.
• Music retrieval: search and identification tools with
different query interfaces in all their forms, e.g.
query by humming.
• Musical notation: creation and manipulation tools
for musical scores, e.g. Sibelius, WAV2Midi.
A real-life application will most likely span several
application domains: playlist generation, for example, can
be classified as both music recommendation and library
functionality, and some music players also offer management
of the personal music collection.
2.2. Metadata standards
Metadata standards originate in different ways: a standard
can evolve out of the design of an application and become
a de facto standard through wide adoption, it can be
focused on interoperability, or it can be designed as a
standard from the start. At the moment, no single music
metadata standard covers all possible requirements.
Based on industry standards and the use in
ongoing research we selected eight music metadata
standards. The methodology presented in this paper is
applicable to the many other available standards (e.g.
MARC [20], MODS [21]). Future work includes
extending the comparison with other relevant schemas.
• ID3: An ID3-tag is a data container of a prescribed
format embedded within an audio file. The stored
data can contain the artist name, song title and genre
of the audio file. ID3 is in widespread use in music
players and devices like iTunes, iPod and Winamp.
• FreeDB: is an online database to look up CD
information by calculating a unique ID for a CD to
query the database. FreeDB is a community version
of the commercial Gracenote service; both are used
in a variety of playback and MP3-ripping software.
• MusicBrainz: the scope of MusicBrainz [6] is the
same as FreeDB, but it has a moderated database and
uses identification on track level.
MusicBrainz is an RDF-based [7] web service.
• Dublin Core: is a standard for cross-domain
information resource description, which provides a
simple and standardized set of conventions for