
Using Metadata for Information Retrieval in Document Management Systems

by Mirjana Andric, Wendy Hall
EUROCON 2005: The International Conference on "Computer as a Tool" (2005)

Abstract

Locating digital documents in modern organisations with the aid of metadata is a challenging area of research in document management systems. To investigate this, we built a prototype system, AWOCADO (Adaptive WOrkflow Controller And Document Organiser). AWOCADO provides a framework for defining and managing document attributes, and for looking up documents by those attributes. An evaluation study was conducted to observe information retrieval using metadata in our system.


Available from eprints.soton.ac.uk
Ontologies are expected to form the basis of the next phase
in the evolution of the Web, the Semantic Web [6]. It is
envisaged that the Semantic Web will be powered by
metadata and described by ontologies that give
machine-understandable meaning to the data. Ontologies
could then be used to facilitate the process of information
retrieval. One of the essential components of the
Semantic Web technology is a common data model,
Resource Description Framework (RDF), described at
Information Retrieval (IR) deals with how people find
information and how tools can be constructed to help in
that process. Since the advent of the Web, these tools have
become known as search engines. DM systems employ
some IR techniques for delivering documents to their end
users. A group of users working together and contributing
documents and metadata into a DM system often benefits
from getting recommendations from their peers.
Recommender systems are the most widespread systems
used to facilitate group work. They assist users by
giving appropriate suggestions, usually in the form of a
proposed piece of information such as a document or a
reference to it. In a typical recommender system people
provide recommendations as input, which the system then
aggregates and directs to appropriate recipients depending
on their preferences [7].
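The aggregate-and-direct behaviour described above can be illustrated with a minimal sketch. It is not taken from the paper or from any particular recommender system: the `Recommendation` record, the `topic` label used to match preferences, and the vote-counting rule are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    recommender: str  # user who makes the recommendation
    document: str     # document identifier (or a reference to it)
    topic: str        # hypothetical label used to match preferences


def route_recommendations(recommendations, preferences):
    """Aggregate recommendations and direct them to interested recipients.

    `preferences` maps each user to the set of topics they care about.
    Returns {user: [(document, votes), ...]}, most-recommended first.
    """
    # Aggregation: collect the distinct peers who recommended each document.
    voters = defaultdict(set)
    topic_of = {}
    for rec in recommendations:
        voters[rec.document].add(rec.recommender)
        topic_of[rec.document] = rec.topic

    # Routing: each user receives documents on preferred topics,
    # excluding documents that the user recommended themselves.
    inbox = {}
    for user, topics in preferences.items():
        matches = [
            (doc, len(who))
            for doc, who in voters.items()
            if topic_of[doc] in topics and user not in who
        ]
        inbox[user] = sorted(matches, key=lambda m: (-m[1], m[0]))
    return inbox


recs = [
    Recommendation("ann", "budget-2005.doc", "finance"),
    Recommendation("bob", "budget-2005.doc", "finance"),
    Recommendation("ann", "style-guide.pdf", "editing"),
]
prefs = {"carol": {"finance"}, "bob": {"finance", "editing"}}
inbox = route_recommendations(recs, prefs)
```

Here `carol` would be offered the budget document, which two peers recommended, while `bob` would be offered only the style guide because he recommended the budget document himself.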
III. AWOCADO OVERVIEW
The acronym AWOCADO stands for the Adaptive
WOrkflow Controller And Document Organiser.
AWOCADO handles documents and meta information
about documents. It also facilitates the exchange of
documents and messages among participants. The
AWOCADO system can be best described as an enriched
internal mailing system combined with a searchable
"source control"-like repository, benefiting end-users,
creators, editors and consumers of the documents in the
repository.
An experimental AWOCADO system prototype was
developed with the following aims in mind:
* To serve as a Web-based (Intranet) document and
metadata repository,
* To operate in a multi-user environment,
* To allow flexible attribute definitions based on a
defined document class,
* To allow flexible searching by adapting a set of search
attributes based on a document class context,
* To incorporate a simple workflow system, and, most
importantly,
* To provide a test bed for observing how users manage
and search documents in a DM system environment.
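The two "flexible" aims above, attribute definitions per document class and a search whose attributes adapt to the class context, can be sketched as follows. This is a hypothetical illustration rather than AWOCADO's implementation; the names `DocumentClass`, `Document`, and `search`, and the example classes, are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DocumentClass:
    name: str
    attributes: tuple  # attribute names defined for this class


@dataclass
class Document:
    doc_class: DocumentClass
    path: str
    header: dict = field(default_factory=dict)

    def __post_init__(self):
        # Only attributes defined for the document's class are accepted.
        unknown = set(self.header) - set(self.doc_class.attributes)
        if unknown:
            raise ValueError(f"undefined attributes: {sorted(unknown)}")


def search(documents, doc_class, **criteria):
    """Find documents of one class; the legal search attributes are
    exactly those defined for that class."""
    unknown = set(criteria) - set(doc_class.attributes)
    if unknown:
        raise ValueError(f"not searchable for {doc_class.name}: {sorted(unknown)}")
    return [
        d for d in documents
        if d.doc_class == doc_class
        and all(d.header.get(k) == v for k, v in criteria.items())
    ]


invoice = DocumentClass("invoice", ("supplier", "year"))
report = DocumentClass("report", ("author", "project"))
docs = [
    Document(invoice, "inv-001.pdf", {"supplier": "Acme", "year": "2005"}),
    Document(invoice, "inv-002.pdf", {"supplier": "Globex", "year": "2005"}),
    Document(report, "rep-001.doc", {"author": "ann", "project": "alpha"}),
]
hits = [d.path for d in search(docs, invoice, supplier="Acme")]
```

Searching invoices by `author` would raise an error in this sketch, because `author` is defined only for the `report` class.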
The architecture of AWOCADO is given in Fig. 1. The
AWOCADO system aims to support three high-level
groups of functions:
* Document archive management (storing, locating, etc.)
and workflow control (Fig. 1, part 1, Manage Files)
* System administration (such as definition of document
classes, their attributes, and access control) (Fig. 1,
part 2, System Setup)
* Providing an interface for the end-users and enabling
them to access and manipulate the document collection
(Fig. 1, part 3, User Interaction)
Fig. 1. The AWOCADO block system architecture. (Figure not reproduced; surviving labels: Administrator, Locate Files, File Upload/Download, Document Files, End-User.)
The main components of the system include:
* A file manipulation component called Manage Files,
* A database (storage) component called Metadata Store,
* A system initialization component called System Setup,
* A User Interaction module responsible for interfacing with the
system user.
The fundamental, first-class object in the AWOCADO
system is a document. A document conceptually consists
of a document's physical content and a document's
header. A document's physical content is kept in files,
located on a local file system or Internet/Intranet Web
server(s). A document header comprises metadata about
the document, i.e. its attributes. This information, together
with the other required data, is kept in the AWOCADO
database or repository (Metadata Store in Fig. 1).
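As a rough illustration of this document model, the sketch below keeps a document's file location in one table and its header as attribute rows in another, then looks documents up by attribute. The schema, table and column names, and helper functions are assumptions for illustration only; the paper does not give AWOCADO's actual schema.

```python
import sqlite3

# In-memory database standing in for the Metadata Store (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (
        id       INTEGER PRIMARY KEY,
        location TEXT NOT NULL          -- file path or Web server URL
    );
    CREATE TABLE attribute (
        document_id INTEGER NOT NULL REFERENCES document(id),
        name        TEXT NOT NULL,
        value       TEXT NOT NULL,
        PRIMARY KEY (document_id, name)
    );
""")


def add_document(conn, location, header):
    """Register a document's location and store its header attributes."""
    cur = conn.execute("INSERT INTO document (location) VALUES (?)", (location,))
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO attribute (document_id, name, value) VALUES (?, ?, ?)",
        [(doc_id, name, value) for name, value in header.items()],
    )
    return doc_id


def find_locations(conn, name, value):
    """Return the locations of documents whose header has name == value."""
    rows = conn.execute(
        "SELECT d.location FROM document d "
        "JOIN attribute a ON a.document_id = d.id "
        "WHERE a.name = ? AND a.value = ? ORDER BY d.id",
        (name, value),
    )
    return [location for (location,) in rows]


add_document(conn, "/files/minutes-01.txt", {"author": "ann", "type": "minutes"})
add_document(conn, "http://intranet.example/report.pdf", {"author": "bob", "type": "report"})
found = find_locations(conn, "author", "ann")
```

The physical content never enters the database in this sketch; only the location and the header do, which mirrors the split between content and header described above.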
Currently, the system keeps attribute-value pairs in a
relational database structure, which keeps the
implementation simple and consistent with the rest of the
repository. However, future work includes making use of more
standard and powerful Semantic Web formats such as
RDF. The system uses a Web-based architecture where the
Internet or Intranet clients interact with the database and
document files using a standard browser interface and the
