
Usability evaluation considered harmful (some of the time)

by Saul Greenberg, Bill Buxton
Proceedings of the twenty-sixth annual CHI conference on Human factors in computing systems, CHI '08 (2008)

Abstract

Current practice in Human Computer Interaction as encouraged by educational institutes, academic review processes, and institutions with usability groups advocates usability evaluation as a critical part of every design process. This is for good reason: usability evaluation has a significant role to play when conditions warrant it. Yet evaluation can be ineffective and even harmful if naively done 'by rule' rather than 'by thought'. If done during early stage design, it can mute creative ideas that do not conform to current interface norms. If done to test radical innovations, the many interface issues that would likely arise from an immature technology can quash what could have been an inspired vision. If done to validate an academic prototype, it may incorrectly suggest a design's scientific worthiness rather than offer a meaningful critique of how it would be adopted and used in everyday practice. If done without regard to how cultures adopt technology over time, then today's reluctant reactions by users will forestall tomorrow's eager acceptance. The choice of evaluation methodology - if any - must arise from and be appropriate for the actual problem or research question under consideration.

Saul Greenberg
Department of Computer Science
University of Calgary
Calgary, Alberta, T2N 1N4, Canada
saul.greenberg@ucalgary.ca
Bill Buxton
Principal Researcher
Microsoft Research
Redmond, WA, USA
bibuxton@microsoft.com

Author Keywords
Usability testing, interface critiques, teaching usability.
ACM Classification Keywords
H5.2. Information interfaces and presentation (e.g., HCI):
User Interfaces (Evaluation/Methodology).
In 1968, Dijkstra wrote ‘Go To Statement Considered
Harmful’, a critique of existing programming practices that
eventually led the programming community to adopt
structured programming [8]. Since then, titles that include
the phrase ‘considered harmful’ signal a critical essay that
advocates change. This article is written in that vein.
INTRODUCTION
Usability evaluation is one of the major cornerstones of
user interface design. This is for good reason. As Dix et al.
remind us, such evaluation helps us “assess our designs and
test our systems to ensure that they actually behave as we
expect and meet the requirements of the user” [7]. This is
typically done by using an evaluation method to measure or
predict how effective, efficient and/or satisfied people
would be when using the interface to perform one or more
tasks. As commonly practiced, these usability evaluation
methods include laboratory-based user observations,
controlled user studies, and/or inspection techniques
[7,22,1]. The scope of this paper concerns these methods.
The purpose behind usability evaluation, regardless of the
actual method, can vary considerably in different contexts.
Within product groups, practitioners typically evaluate
products under development for ‘usability bugs’, where
developers are expected to correct the significant problems
found (i.e., iterative development). Usability evaluation can
also form part of an acceptance test, where human
performance while using the system is measured
quantitatively to see if it falls within acceptable criteria
(e.g., time to complete a task, error rate, relative
satisfaction). Or if the team is considering purchasing one
of two competing products, usability evaluation can
determine which is better at certain things.
Within HCI research and academia, researchers employ
usability evaluation to validate novel design ideas and
systems, usually by showing that human performance or
work practices are somehow improved when compared to
some baseline set of metrics (e.g., other competing ideas),
or that people can achieve a stated goal when using this
system (e.g., performance measures, task completions), or
that their processes and outcomes improve.
Clearly, usability evaluation is valuable for many situations,
as it often helps validate both research ideas and products at
varying stages in their lifecycle. Indeed, we (the authors) have
advocated and practiced usability evaluation in both
research and academia for many decades. We believe that
the community should continue to evaluate usability for
many – but not all – interface development situations. What
we will argue is that there are some situations where
usability evaluation can be considered harmful: we have to
recognize these situations, and we should consider
alternative methods instead of blindly following the
usability evaluation doctrine. Usability evaluation, if
wrongfully applied, can quash potentially valuable ideas
early in the design process, incorrectly promote poor ideas,
misdirect developers into solving minor vs. major
problems, or ignore (or incorrectly suggest) how a design
would be adopted and used in everyday practice.

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise,
or republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
CHI 2008, April 5–10, 2008, Florence, Italy.
Copyright 2008 ACM 978-1-60558-011-1/08/04…$5.00
CHI 2008 Proceedings · Usability Evaluation Considered Harmful? April 5-10, 2008 · Florence, Italy

This essay is written to help counterbalance what we too
often perceive as an unquestioning adoption of the doctrine
of usability evaluation by interface researchers and
practitioners. Usability evaluation is not a universal
panacea. It does not guarantee user-centered design. It will
not always validate a research interface. It does not always
lead to a scientific outcome. We will argue that:
the choice of evaluation methodology – if any – must arise from
and be appropriate for the actual problem or research question
under consideration.
We illustrate this problem in three ways. First, we describe
one of the key problems: how the push for usability
evaluation in education, academia, and industry has led to
the incorrect belief that designs – no matter what stage of
development they are in – must undergo some type of
usability evaluation if they are to be considered part of a
successful user-centered process. Second, we illustrate how
problems can arise by describing a variety of situations
where usability evaluation is considered harmful: (a) we
argue that scientific evaluation methods do not necessarily
imply science; (b) we argue that premature usability
evaluation of early designs can eliminate promising ideas or
the pursuit of multiple competing ideas; (c) we argue that
traditional usability evaluation of inventions and
innovations does not provide meaningful information about
their cultural adoption over time. Third, we give general
suggestions of what we can do about this. We close by
pointing to others who have debated the merits of usability
evaluation within the CHI context.
THE HEAVY PUSH FOR USABILITY EVALUATION
Usability evaluation is central to today’s practice of HCI. In
HCI education, it is a core component of what students are
taught. In academia, validating designs through usability
evaluation is considered the de facto standard for submitted
papers to our top conferences. In industry, interface
specialists regard usability evaluation as a major component
of their work practice.
HCI Education
The ACM SIGCHI Curriculum formally defines HCI as
“a discipline concerned with the design, evaluation and
implementation of interactive computing systems for human
use…” [17, emphasis added].
The curriculum stresses the teaching of evaluation
methodologies as one of its major modules. This has
certainly been taken up in practice, although in a somewhat
limited manner. While there are many evaluation methods,
the typical undergraduate HCI course stresses usability
evaluation – laboratory-based user observations, controlled
studies, and/or inspection – as a key course component in
both lectures and student projects [7,13]. Following the
ACM Curriculum, the canonical development process
drummed into students’ heads is the iterative process of
design, implement, evaluate, redesign, re-implement,
re-evaluate, and so on [7,13,17]. Because usability evaluation
methodologies are easy to teach, learn, and examine (as
compared to other ‘harder’ methods such as design, field
studies, etc.), they have become perhaps the most concrete
learning objective in a standard HCI course.
CHI Academic Output
Our key academic conferences such as ACM CHI, CSCW
and even UIST strongly suggest that authors validate new
designs of an interactive technology. For example, the
ACM CHI 2008 Guide to Successful Submissions states:
“does your contribution take the form of a design for a new
interface, interaction technique or design tool? If so, you will
probably want to demonstrate ‘evaluation’ validity, by
subjecting your design to tests that demonstrate its
effectiveness.” [21]
The consequence is that the CHI academic culture generally
accepts the doctrine that submitted papers on system design
must include a usability evaluation – usually controlled
experimentation or empirical usability testing – if they are to
have a chance of success. Not only do authors believe this,
but so do reviewers:
“Reviewers often cite problems with validity, rather than with
the contribution per se, as the reason to reject a paper” [21].
Our own combined five decades of experience
intermittently serving as Program Committee member,
Associate Chair, Program Chair or even Conference Chair
of these and other HCI conferences confirm that this ethic –
while sometimes challenged – is fundamental to how many
papers are written and judged. Indeed, Barkhuus and
Rode’s analysis of ACM CHI papers published over the last
24 years found that the proportion of papers that include
evaluation – particularly empirical evaluation – has
increased substantially, to the point where almost all
accepted papers have some evaluation component [1].
Industry
Over the last decade, industries have been incorporating interface
methodologies as part of their day-to-day development
practice. This often includes the formation of an internal
group of people dedicated to considering interface design as
a first-class citizen. These groups tend to specialize in
usability evaluation. They may evaluate different design
