Predicting Sequences of User Actions

by Brian D Davison, Haym Hirsh
Artificial Intelligence (1998)

Available from www.cse.lehigh.edu

To be presented at the AAAI/ICML 1998 Workshop on Predicting the Future: AI Approaches to Time-Series Analysis
Predicting Sequences of User Actions
Brian D. Davison and Haym Hirsh
Department of Computer Science
Rutgers, The State University of New Jersey
New Brunswick, NJ 08903 USA
{davison,hirsh}@cs.rutgers.edu
Abstract
People display regularities in almost everything they do. This
paper proposes characteristics of an idealized algorithm that,
when applied to sequences of user actions, would allow a user
interface to adapt over time to an individual’s pattern of use.
We describe a simple predictive method with these character-
istics and show its predictive accuracy on a large dataset of
UNIX commands to be at least as good as others that have
been considered, while using fewer computational and mem-
ory resources.
Motivation
How predictable are you? Each of us displays patterns of ac-
tions throughout whatever we do. Most occur without con-
scious thought. Some patterns are widespread among large
communities, and are taught, as rules, such as reading from
left to right, or driving on the correct side of the road. Other
patterns are a function of our lifestyle, such as picking up
pizza on the way home from work every Friday, or program-
ming the VCR to record our favorite comedy each week.
Many are a result of the way interfaces are designed, like the
pattern of movement of your finger on a phone dialing a num-
ber you call often, or how you might log into your computer,
check mail, read news, and visit your favorite website for the
latest sports scores. As computers pervade more and more
aspects of our lives, the need for a system to be able to adapt
to the user, perhaps in ways not programmed explicitly by
the system’s designer, becomes ever more apparent.
A car that can offer advice on driving routes is useful; one
that can also guess your destination (such as a pizza parlor
because it is Friday and you are leaving work) is likely to
be found even more useful, particularly if you didn’t have to
program it explicitly with that knowledge. The ability to pre-
dict the user’s next action allows the system to anticipate the
user’s needs (perhaps through speculative execution or intel-
ligent defaults) and to adapt to and improve upon the user’s
work habits (such as automating repetitive tasks). Adaptive
interfaces have also been shown to help those
with disabilities (Greenberg et al. 1995; Demasco & McCoy
1992).
This paper considers the more mundane, but present-day,
activity of issuing commands within a command-line shell. We
have concentrated initially on UNIX command prediction (footnote 1)
because of its continued widespread use; the UNIX shell
provides an excellent testbed for experimentation and auto-
matic data collection. However, our interest is in more gen-
eral action prediction, and so we hypothesize that successful
methodologies will also be applicable in other interfaces, in-
cluding futuristic ones anticipated above as well as present-
day menu selection in GUIs and voice-mail, or URL selec-
tion in web browsers. This paper, therefore, reflects our fo-
cus on the underlying technology for action prediction, rather
than on how prediction can be effectively used within an in-
terface.
In this paper, we use the data from two user studies to
suggest that relatively naive methods can predict a particu-
lar user’s next command surprisingly well. With the generic
task in mind, we will describe the characteristics of an ideal
algorithm for action prediction. Finally, we will present and
analyze a novel algorithm that satisfies these characteris-
tics and additionally performs better than the previous best-
performing system.
Background
This paper addresses the task of predicting the next element
in a sequence, where the sequence is made up of nominal (un-
ordered as well as non-numeric) elements. This type of prob-
lem (series prediction) is not studied often by machine learn-
ing researchers; concept recognition (i.e., a boolean classifi-
cation task such as sequence recognition) is more common,
as is the use of independent samples from a distribution of
examples. UNIX commands, and user actions in general,
however, are not independent, and being nominal, don’t fall
into the domain of traditional statistical time-series analysis
techniques.
Footnote 1: We are currently ignoring command arguments and switches.

...
96102513:34:49 cd
96102513:34:49 ls
96102513:34:49 emacs
96102513:34:49 exit
96102513:35:32 BLANK
96102513:35:32 cd
96102513:35:32 cd
96102513:35:32 rlogin
96102513:35:32 exit
96102514:25:46 BLANK
96102514:25:46 cd
96102514:25:46 telnet
96102514:25:46 ps
96102514:25:46 kill
96102514:25:46 emasc
96102514:25:46 emacs
96102514:25:46 cp
96102514:25:46 emacs
...
Figure 1: A portion of one user’s history, showing the timestamp
of the start of the session and the command typed. (The
token BLANK marks the start of a new session.)

Evaluation Criteria
In most machine learning experiments that have a single
dataset of independent examples, cross-validation is the standard
method of evaluating the performance of an algorithm.
When cross-validation is inappropriate, partitioning the data
into separate training and test sets is common. For sequential
datasets, then, the obvious split would have the training
set contain the first portion of the sequence, and the test set
contain the latter portion (so that the algorithm is not trained
on data occurring after the test data). However, since we are
proposing an adaptive method, we evaluate performance
online: each algorithm is tested on the current command
using the preceding commands for training. This maximizes
the number of evaluations of the algorithms on unseen
data and reflects the expected application of such an algorithm.
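
The online protocol described above can be sketched in a few lines of Python. This is only an illustration: the function name online_accuracy is hypothetical, and the "most frequent command so far" predictor is a stand-in chosen for brevity, not one of the algorithms evaluated in this paper.

```python
from collections import Counter


def online_accuracy(commands):
    """Test on each command using only the commands that preceded it."""
    counts = Counter()  # stand-in model (an assumption): command frequencies so far
    correct = 0
    for actual in commands:
        # Predict before the model has seen the current command.
        predicted = counts.most_common(1)[0][0] if counts else None
        if predicted == actual:
            correct += 1
        # Only after being tested does the model learn from this command.
        counts[actual] += 1
    return correct / len(commands) if commands else 0.0


history = ["cd", "cd", "ls", "cd"]
print(online_accuracy(history))
```

Every command in the history serves once as a test case and afterwards as training data, which is why no separate held-out portion is needed.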
When considering performance across multiple users with
differing amounts of data, we use two methods to compute
averages. Macroaveraged results compute statistics separately
for each user and then average these statistics over
all users. Alternatively, microaveraged results compute an average
over all data: the number of correct predictions
made across all users divided by the total number
of commands for all users combined. The former provides
equal weight to all users, since it averages across the average
performance of each user; the latter emphasizes users with
large amounts of data.
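
The difference between the two averages is easiest to see on a small example. The sketch below assumes a hypothetical input format, a mapping from each user to a pair (correct predictions, total commands), and the two users shown are made-up numbers, not data from the study.

```python
def macro_micro(per_user):
    """per_user maps a user id to (correct predictions, total commands)."""
    # Macroaverage: compute each user's accuracy, then average over users.
    rates = [correct / total for correct, total in per_user.values() if total > 0]
    macro = sum(rates) / len(rates)
    # Microaverage: pool all predictions from all users, then divide once.
    all_correct = sum(correct for correct, _ in per_user.values())
    all_total = sum(total for _, total in per_user.values())
    micro = all_correct / all_total
    return macro, micro


# Illustrative (made-up) counts: one light user, one heavy user.
macro, micro = macro_micro({"light": (1, 10), "heavy": (90, 100)})
print(round(macro, 3), round(micro, 3))
```

Here the heavy user dominates the microaverage, while the macroaverage weights both users equally.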
People Tend To Repeat Themselves
In order to determine how much repetition and other recog-
nizable regularities were present in the average user’s com-
mand line work habits, we collected command histories of
77 users, totaling over 168,000 commands executed during a
period of 2-6 months (Davison & Hirsh 1997a; 1997b) (see
Figure 1 for an example of the data that was collected). The
bulk of these users (70) were undergraduate computer sci-
ence students in an Internet programming course and the rest
were graduate students or faculty. All users had the option to
disable logging and had access to systems on which logging
was not being performed.
The average user had over 2000 command instances in his
or her history, using 77 distinct commands during that time.
On average over all users (macroaverage), 8.4% of the com-
mands were new and had not been logged previously. The
microaverage of new commands, however, was only 3.6%,
reflecting the fact that users with smaller histories had larger
proportions of new commands. Approximately one out of five
commands was the same as the previous command executed
(that is, the user repeated the last command 20% of the time).
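
Both statistics can be computed in a single pass over one user's history. The following is a minimal sketch; the function name history_stats is hypothetical, and how session markers such as BLANK were counted in the study is not stated here, so the sketch simply treats every token alike.

```python
def history_stats(commands):
    """Return (fraction of new commands, fraction repeating the previous one)."""
    seen = set()
    new = 0
    repeats = 0
    previous = None
    for cmd in commands:
        if cmd not in seen:  # first time this command appears in the history
            new += 1
            seen.add(cmd)
        if cmd == previous:  # identical to the command executed just before
            repeats += 1
        previous = cmd
    n = len(commands)
    return (new / n, repeats / n) if n else (0.0, 0.0)


print(history_stats(["cd", "ls", "ls", "cd"]))
```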
Earlier Results
In previous work (Davison & Hirsh 1997a; 1997b), we con-
sidered a number of simple and well-studied algorithms. In
each of these, the learning problem was to examine the com-
mands executed previously, and to predict the command to
be executed next. We found that, without explicit domain
knowledge, a naive method based on C4.5 (Quinlan 1993)
was able to predict each command with a macroaverage ac-
curacy of 38.5% (microaverage was 37.2%). For each pre-
diction, C4.5 was trained on the series of examples of the
form $(\mathit{Command}_{i-2}, \mathit{Command}_{i-1}) \Rightarrow \mathit{Command}_i$, for $1 \le i \le k$, where $k$ is the number of examples seen so far.
$\mathit{Command}_0$ and $\mathit{Command}_{-1}$ are both defined to have the
value BLANK to allow prediction of the first and second
commands using the same form.
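
As a rough illustration of this example format, the sketch below builds the training pairs from a command history by padding it with two BLANK tokens. The function name make_examples and the list-of-tuples representation are assumptions made for this sketch; the actual input format given to C4.5 is not described here.

```python
BLANK = "BLANK"


def make_examples(history):
    """Build ((command i-2, command i-1), command i) pairs for i = 1..k."""
    # Two leading BLANKs stand in for the commands at positions 0 and -1.
    padded = [BLANK, BLANK] + list(history)
    return [
        ((padded[i], padded[i + 1]), padded[i + 2])
        for i in range(len(history))
    ]


for features, label in make_examples(["cd", "ls", "emacs"]):
    print(features, "=>", label)
```

With the padding in place, the first and second commands yield examples of the same shape as every later command.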
While the prediction method was a relatively straightfor-
ward application of a standard machine learning algorithm,
it has a number of drawbacks, including that it returned only
the single most likely command. C4.5 also has significant
computational overhead. It can only generate new decision-
trees; it does not incrementally update or improve the deci-
sion tree upon receiving new information. (While there are
other decision-tree systems that can perform incremental up-
dates (Utgoff 1989), they have not achieved the same levels
of performance as C4.5.) Therefore, C4.5 decision tree gen-
eration must be performed outside of the command predic-
tion loop.
Additionally, since C4.5 (like many other machine learn-
ing algorithms) is not incremental, it must revisit each past
command situation, causing the decision-tree generation to
require more time and computational resources as the num-
ber of commands in the history grows. Finally, it treats each
command instance equally; commands at the beginning of
the history are just as important as commands that were re-
cently executed. Note that C4.5 was selected as a com-
mon, well-studied decision-tree learner with excellent per-
formance over a variety of problems, but not with any claim
of superiority over other algorithms applicable to this do-
main.
These initial experiments dealt with some of these is-
sues by only allowing the learning algorithm to consider
the command history within some fixed window. This pre-
vented the model generation time from growing without
bound and from exceeding all available system memory.
This workaround, however, caused the learning algorithms
to forget relatively rare but consistently predictable situa-
tions (such as typographical errors) and restricted consider-
ation only to recent commands.
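
The forgetting effect of a fixed window can be shown with a bounded queue. This is a sketch only: the function name windowed_history and the window size of 3 are chosen for illustration, since the window size actually used is not given here.

```python
from collections import deque


def windowed_history(commands, window):
    """Keep only the most recent `window` commands as training data."""
    recent = deque(maxlen=window)  # older entries are dropped automatically
    for cmd in commands:
        recent.append(cmd)
    return list(recent)


history = ["emasc", "emacs", "cd", "ls", "cd"]
print(windowed_history(history, 3))
```

Once the mistyped emasc and its correction fall outside the window, a learner trained only on the window has no evidence of that pattern left.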
