
Predicting user tasks: I know what you're doing

by Simone Stumpf, Xinlong Bao, Anton Dragunov, T G Dietterich, Jon Herlocker, K Johnsrude, Lida Li, Jianqiang Shen
20th National Conference on Artificial Intelligence AAAI05 Workshop on Human Comprehensible Machine Learning (2005)


Predicting User Tasks: I Know What You’re Doing!
Simone Stumpf, Xinlong Bao, Anton Dragunov, Thomas G. Dietterich, Jon Herlocker, Kevin
Johnsrude, Lida Li, JianQiang Shen
School of Electrical Engineering
Oregon State University
Corvallis, OR
stumpf@eecs.oregonstate.edu

Abstract
Knowledge workers spend the majority of their working
hours processing and manipulating information. These users
face continual costs as they switch between tasks to retrieve
and create information. The TaskTracer project at Oregon
State University is investigating the possibilities of a
desktop software system that will record in detail how
knowledge workers complete tasks, and intelligently
leverage that information to increase efficiency and
productivity. Our approach combines human-computer
interaction and machine learning to assign each observed
action (opening a file, saving a file, sending an email,
cutting and pasting information, etc.) to a task for which it is
likely being performed. In this paper we report on ways we
have applied machine learning in this environment and
lessons learned so far.
Introduction
Knowledge workers spend the majority of their working
hours processing and manipulating information. These
users face continual costs as they switch between tasks to
retrieve and create information. The information may be
encoded in many different formats: documents, software
code, web pages, email messages, phone conversations.
The cost to the user of finding information may be
cognitive: workers may have to remember exactly where
they were in a chain of logic, or why they decided to take
their most recent action on a task. The cost may also lie in
the manual interaction needed to access the necessary
resources (e.g., documents and/or software tools).
Knowledge workers organize their work into discrete
and describable units, such as projects, tasks or to-do
items. The TaskTracer project at Oregon State University
is investigating the possibilities of a desktop software
system that will record in detail how knowledge workers
complete tasks, and intelligently leverage that information
to increase efficiency and productivity. Our goal is to
develop five capabilities: more task-aware user interfaces
in the applications we use daily, more efficient task-
interruption recovery, better personal information
management, workgroup information management and
within-workgroup workflow detection and analysis. Our
system operates in the Microsoft Windows environment,
tracking most interactions with desktop applications as
well as tracking phone calls. Our approach combines
human-computer interaction and machine learning to
assign each observed action (opening a file, saving a file,
sending an email, cutting and pasting information, etc.) to a
task for which it is likely being performed. Once we have
the past actions structured by task, we can provide
substantial value to the knowledge worker in assisting in
their daily task routines.
There is a substantial set of research challenges that
must be faced in order to successfully develop the
TaskTracer system with these capabilities. These
challenges include user interface design, machine learning,
privacy and workplace culture, data collection, systems
architecture, and data modeling. In this paper we report on
ways we have applied machine learning in this
environment and lessons learned so far.
Task-tracking and task-related systems
There have been previous efforts to build environments
that enable knowledge workers to manage multiple
concurrent activities, which we call tasks, and use
knowledge of those activities to improve productivity.
Workspaces (Bannon et al. 1983) can define tasks that
comprise information resources (usually documents and
tools for their processing) that are necessary to accomplish
the goal associated with the task. Some systems work on
the idea of physically separating tasks by requiring users to
create project-specific folders, or set up a virtual desktop
for each particular task (Card and Henderson 1987,
Robertson et al. 2000). Other systems work at a more
abstract level by organizing task-specific workspaces using
“filters” applied to communication threads (Bellotti et al.
2003), streams or networks of documents (Freeman and
Gelernter 1996, Dourish et al. 1999).
To be of assistance to a user, an agent (whether it is a
computer system or a human associate) must “know” what
the user is currently doing. In addition to the resources
used in a task, it also seems reasonable to record users’
actions performed on those resources. The rationale is
that, to understand the task context of a resource
correctly, we must consider how and why it was
accessed. For instance, the
same document (say, a text file) may be opened for two
completely different purposes: 1) for reading and 2) for
authoring. Various systems (Fenstermacher and Ginsburg
2002, Kaptelinin 2003, Canny 2004) address this issue by
aiming at recording as much information as possible about
users’ activities when they interact with computers. These
activity records are obtained via monitoring the computer
file system, input devices, and applications.
Our software, TaskTracer, employs an extensive data-
collection framework to obtain detailed observations of
user interactions in the common productivity applications
used in knowledge work (Dragunov et al. 2004). Currently,
events are collected from Microsoft Office 2003, Microsoft
Visual Studio .NET, Windows XP operating system and
phone calls. In this framework, TaskTracer collects file
pathnames for file create, change, open, print and save, text
selection, copy-paste, window focus, web navigation,
phone call, clipboard and email events. Phone call data
collection uses Caller ID to collect names and phone
numbers of callers. In addition, speech-to-text software
collects the user’s — but not the caller’s — phone speech.
All events are captured as individual EventMessages which
contain:
• Type: Event type. For example, TaskTracer captures
window focus, file open, file save, web page navigation,
text selection, and many other events on both the
applications and the operating system levels.
• Window ID: Window handle for windows, zero
otherwise.
• Listener Version: Changes every time we change or add
to the EventMessages the Listener can send and process.
This allows backward compatibility as we change our
data capture.
• Listener ID: Source of the EventMessage (MS Office
programs, file system hooks, user, clipboard, phone,
etc.).
• Body Type, Body: Event or document data in XML
format.
• Time: Time the event fired.
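
As an illustration only, the fields listed above could be modelled as a small record type. The sketch below is not TaskTracer code: the class layout, field types, and the sample event values (including the "FileOpen" and "FileSystem" labels and the XML body) are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class EventMessage:
    """Hypothetical mirror of the listed fields; names and types are assumed."""
    type: str              # event type, e.g. "FileOpen" (illustrative label)
    window_id: int         # window handle for window events, zero otherwise
    listener_version: int  # changes whenever the set of messages changes
    listener_id: str       # source of the event, e.g. a file system hook
    body_type: str         # describes what the XML body holds
    body: str              # event or document data as an XML string
    time: datetime         # time the event fired

    def body_xml(self) -> ET.Element:
        """Parse the XML body and return its root element."""
        return ET.fromstring(self.body)


# Sample event with made-up values, for illustration only.
event = EventMessage(
    type="FileOpen",
    window_id=0,
    listener_version=1,
    listener_id="FileSystem",
    body_type="FilePath",
    body="<file><path>C:/docs/report.doc</path></file>",
    time=datetime(2005, 7, 9, 10, 30),
)
print(event.body_xml().findtext("path"))
```

Keeping the body as an XML string and parsing it on demand matches the description of the Body field, and the explicit listener version gives a consumer something to branch on when the message format changes.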

Rather than discovering tasks by unsupervised clustering
(Canny 2004), TaskTracer has its users manually specify
what tasks they are doing in the initial stage of data
collection, so that each action of the user (a User Interface
event) is tagged with a particular task identifier to
train predictors. We believe that we can learn to reliably
predict the users’ current task and task switches, and thus
we can create complete and detailed records of what has
been done on every task (past and present). All
EventMessages are stored in a database in raw form so that
researchers can analyze the history of user events. A
variety of learning models can be tested on identical data
sets. We are currently researching learning models based
on the event data for predicting the current task of the user,
for detecting when the user has changed tasks and for
reducing the cost of accessing resources whilst carrying out
a task.
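
The tagging step described above can be sketched as follows, assuming (hypothetically) that the user's manual task declarations are stored as a time-ordered list and that each event carries a timestamp. The function name, the tuple layout, and the sample data are all invented for the example and do not reflect the actual TaskTracer schema.

```python
from bisect import bisect_right


def tag_events(task_switches, events):
    """Attach to each event the task the user had declared at that time.

    task_switches: list of (time, task_id), sorted by time (assumed layout).
    events: list of (time, event_type).
    Returns a list of (time, event_type, task_id); task_id is None for
    events that occurred before the first declared task.
    """
    switch_times = [t for t, _ in task_switches]
    tagged = []
    for time, event_type in events:
        # Index of the most recent declaration at or before this event.
        idx = bisect_right(switch_times, time) - 1
        task_id = task_switches[idx][1] if idx >= 0 else None
        tagged.append((time, event_type, task_id))
    return tagged


# Made-up data: times are plain integers for brevity.
switches = [(0, "write paper"), (50, "plan study")]
events = [(5, "FileOpen"), (49, "FileSave"), (50, "WindowFocus"), (70, "Copy")]
tagged = tag_events(switches, events)
for row in tagged:
    print(row)
```

The resulting (event, task) pairs are the kind of labelled data on which different learning models could then be compared.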

Plan Recognition and Task Prediction
There is a range of plan recognition tasks that people have
addressed (Ourston and Mooney 1990, Davison and Hirsh
1998, Bauer 1999). For example, some work has been
carried out to recognize that someone is executing an
instance of a particular plan and suggest the next action. A
plan often has flexibility but the user is executing a specific
structured activity (e.g., taking money out of an ATM,
calibrating a glucose meter).
What we are addressing is supporting an unstructured
activity (e.g., writing the AAAI submission, putting
together a research study). These activities typically have
no or only a loosely fixed structure and are highly
distinctive to the individual knowledge worker. Hence, we
use the term “task” to mean a user-defined concept name
rather than a sequence of user actions (as an aside, we
would call such a sequence an “event stream”).
These two approaches differ mostly in the degree of
sequential/hierarchical structure they assume in the
activity. We are not suggesting that they are mutually
exclusive; indeed, much can be learned from plan
recognition.
It could be argued that no effective user support is
possible without a deep structure that can be used to
explain the observed user behavior. However, we are not
trying to explain the user behavior itself since the user is
very competent already in deciding what to do (and what to
do next). What we are trying to achieve instead is the
reduction of the costs that knowledge workers face when
they carry out their tasks by keeping task-related
information organized. Costs may be physical/mechanical,
such as the number of user interface interactions (mouse,
keyboard, etc.) needed to achieve a goal. Costs may
sometimes be in time. There are also cognitive costs, such
as remembering where a piece of information was filed
or learning any new features. In addition to the “actual”
costs that workers encounter while pursuing their tasks, we
must also be particularly aware of the perceived costs of
using any features.
TaskPredictor and FolderPredictor
Automatic translation of interaction histories into project
contexts is very challenging to implement (Kaptelinin
2003). If users must indicate task switching manually (as
currently implemented in TaskTracer), this will create
additional cognitive and physical costs for users, since they
will have to 1) mentally structure their activities and 2)
perform additional actions not directly related to the
current goals — select tasks from lists, type in task titles
and descriptions, etc. We believe that we can reduce these
costs by combining probabilistic machine learning
approaches with appropriate user interfaces that maximize
online learning whilst reducing the cost on the user.
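
To make the idea of an online probabilistic predictor concrete, here is a minimal sketch. It is not the TaskPredictor implementation: naive Bayes with add-one smoothing is used only as one simple probabilistic learner, the features (file and application names) are invented, and the confidence threshold below which the sketch stays silent is an assumption intended to illustrate keeping wrong suggestions, and hence user cost, low.

```python
import math
from collections import Counter, defaultdict


class TaskPredictorSketch:
    """Illustrative online naive Bayes over event features (assumed design)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold            # minimum posterior to predict
        self.task_counts = Counter()          # labelled examples per task
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()

    def update(self, features, task):
        """Learn online from one labelled example."""
        self.task_counts[task] += 1
        for f in features:
            self.feature_counts[task][f] += 1
            self.vocabulary.add(f)

    def posteriors(self, features):
        """Return P(task | features) for every known task."""
        total = sum(self.task_counts.values())
        log_scores = {}
        for task, n in self.task_counts.items():
            counts = self.feature_counts[task]
            # Add-one smoothing; the extra +1 reserves mass for unseen features.
            denom = sum(counts.values()) + len(self.vocabulary) + 1
            score = math.log(n / total)
            for f in features:
                score += math.log((counts[f] + 1) / denom)
            log_scores[task] = score
        top = max(log_scores.values())
        exp = {t: math.exp(s - top) for t, s in log_scores.items()}
        norm = sum(exp.values())
        return {t: v / norm for t, v in exp.items()}

    def predict(self, features):
        """Return the most likely task, or None when not confident enough."""
        if not self.task_counts:
            return None
        post = self.posteriors(features)
        task, p = max(post.items(), key=lambda kv: kv[1])
        return task if p >= self.threshold else None


predictor = TaskPredictorSketch(threshold=0.8)
predictor.update(["paper.doc", "winword"], "write paper")
predictor.update(["paper.doc", "refs.bib"], "write paper")
predictor.update(["study.xls", "excel"], "plan study")
print(predictor.predict(["paper.doc", "winword"]))
print(predictor.predict(["unknown.txt"]))
```

With this toy data, the first query is attributed to "write paper", while the second, which involves only an unseen resource, falls below the threshold and yields no prediction.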
There are three main challenges to the machine learning
approach. Firstly, accuracy must be exceptionally high to
be acceptable to the user. Secondly, manual task switches
