From Information Retrieval to Information Interaction
Lecture Notes in Computer Science (2004)
Available from www.springerlink.com
Gary Marchionini
University of North Carolina at Chapel Hill, School of Information and Library Science
100 Manning Hall, Chapel Hill, NC 27599, USA
march@ils.unc.edu

Abstract. This paper argues that a new paradigm for information retrieval has evolved that incorporates human attention and mental effort and takes advantage of new types of information objects and relationships that have emerged in the WWW environment. One aspect of this new model is attention to highly interactive user interfaces that engage people directly and actively in information seeking. Two examples of these kinds of interfaces are described.

1 Introduction

Information retrieval (IR) is hot. After 40 years of systematic research and development, often ignored by the public, technology and a global information economy have conspired to make IR a crucial element of the emerging cyberinfrastructure and a field of interest for the best and brightest students. The new exciting employers are Google, Amazon, and eBay, and extant giants like IBM and Microsoft have active IR research and development groups. In many ways, research in IR had plateaued until the WWW breathed new life into it by supporting a global marketplace of electronic information exchange. In fact, I argue that the IR problem itself has fundamentally changed and a new paradigm of information interaction has emerged. This argument is made in two parts: first, the evolution of IR will be considered through a broad look at today's information environment and trends in IR research and development; second, examples of attempts to address IR as an interactive process that engages human attention and mental effort will be given.

2 Information Objects and People

As a scientific area, IR uses analysis to break down the whole problem into components and to focus first on the components that promise to yield to our techniques.
IR has always been fundamentally concerned with information objects and with the people who create, find, and use those objects; however, because people are less predictable and more difficult and expensive to manipulate experimentally, IR research logically focused on the information objects first. Traditionally, information objects have been taken to be documents and queries, and research has centered on
two basic issues: representation of those objects and definition of the relationships among them. Representation is a classical issue in philosophy, information science (e.g., Heilprin argued that compression was the central representation problem [9]), and artificial intelligence. The IR community has demonstrated a variety of effective representations for documents and queries, including linguistic (e.g., controlled vocabulary) assignments and a large variety of mathematical assignments (e.g., vectors) based on term occurrence, relevance probability estimates, and, more recently, hyperlink graphs. IR research has mainly focused on equality (e.g., of index terms) and similarity relationships (similarity between or among objects) and has developed a large variety of matching algorithms that are exploited in today's retrieval systems. A schematic for the traditional IR problem is depicted in Figure 1.

[Figure 1 not reproduced. Recoverable labels: Document Sample Space, Query Sample Space, Representations (Terms, Vectors, etc.; Query Form A, Query Form B, etc.), Match Algorithm.]

Fig. 1. Content-Centered Retrieval as Matching Document Representations to Query Representations

The figure shows that samples of document and query objects from the respective universe of all objects are each represented in some fashion, most often using the same representation form. For example, a simple approach used in early commercial retrieval systems was to represent documents and queries with terms assigned from a controlled vocabulary and simply match overlaps. A more contemporary example represents documents and queries as vectors of inverse document frequency values for a specific set of terms in the sample and returns results ranked by cosine similarity. In cases where the document and query representations are in different forms (e.g., different metadata schemes or human languages), crosswalks, translations, or interlingua must also be added to the process.
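The two matching approaches described above can be illustrated with a minimal Python sketch. It is not taken from the paper or from any of the systems it names: the function names, the toy documents, and the particular weighting (term frequency multiplied by a logarithmic inverse document frequency) are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def overlap_match(query_terms, doc_terms):
    """Controlled-vocabulary matching: count the terms shared by query and document."""
    return len(set(query_terms) & set(doc_terms))


def idf_weights(docs):
    """Inverse document frequency of every term in the document sample.

    Assumption: idf = log(N / df), one common variant among several.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def to_vector(terms, idf):
    """Sparse vector of tf * idf weights; terms unseen in the sample are dropped."""
    tf = Counter(t for t in terms if t in idf)
    return {t: freq * idf[t] for t, freq in tf.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, document index) pairs ordered by descending cosine similarity."""
    idf = idf_weights(docs)
    q = to_vector(query, idf)
    scored = [(cosine(q, to_vector(d, idf)), i) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


# Hypothetical toy sample; each document is a list of index terms.
docs = [
    ["information", "retrieval", "systems"],
    ["information", "interaction", "interfaces"],
    ["cooking", "recipes"],
]
query = ["information", "retrieval"]

print(overlap_match(query, docs[0]))
for score, index in rank(query, docs):
    print(index, round(score, 3))
```

Both functions compare a query representation with a document representation of the same form, which is the situation Figure 1 depicts. When the two representations differ in form, a translation or crosswalk step would have to precede the match.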
This content-centered paradigm has driven creative work and led to mainly effective retrieval systems (e.g., SMART, Okapi, Inquery); however, progress toward improving both recall and precision seems to have reached a state of diminishing returns.

Two important changes have been taking place in the electronic information environment that expand this schema and stimulate new kinds of IR research and development. These changes are due to new types and properties of information objects and to increasing attention to human participation in the IR process. The IR community has begun to recognize these changes, as illustrated by the two grand research and development challenges identified for IR research at a recent strategic workshop [1]: global information access ("Satisfy human information needs through natural, effi-



