Exploiting N-gram Analysis to Predict Operator Sequences
Christian Muise Sheila McIlraith Jorge A. Baier Michael Reimer
Department of Computer Science, University of Toronto, Toronto, Canada. M5S 3G4.
{cjmuise, sheila, jabaier, mreimer} @cs.toronto.edu
Abstract
N-gram analysis provides a means of probabilistically pre-
dicting the next item in a sequence. Due originally to Shan-
non, it has proven an effective technique for word prediction
in natural language processing and for gene sequence analy-
sis. In this paper, we investigate the utility of n-gram anal-
ysis in predicting operator sequences in plans. Given a set
of sample plans, we perform n-gram analysis to predict the
likelihood of subsequent operators, relative to a partial plan.
We identify several ways in which this information might be
integrated into a planner. In this paper, we investigate one of
these directions in further detail. Preliminary results demon-
strate the promise of n-gram analysis as a tool for improving
planning performance.
Introduction
The augmentation of deterministic planners with a learn-
ing component has been a topic of much research in recent
years. A learning component generally takes as input a set
of high-quality problem- or domain-specific bootstrap plans
from which plan properties are learned. These properties are
then used to augment an existing planner or planning domain
so that subsequent plan generation over related planning in-
stances is improved. Improvement is generally measured by
a reduction in plan generation time and/or by an increase in
plan quality relative to the performance of the planner with-
out the learning component input. A notable result of the
interest in planning and learning was the creation of a new
Learning Track at the 2008 International Planning Compe-
tition (IPC-2008). This competition attracted 13 entrants
from 10 groups. Entries differed with respect to the types of
knowledge learned and with respect to how this knowledge
was incorporated into subsequent planning. Approaches in-
cluded learning policies, macro operators, sub-goal decom-
positions, and improving value functions; the most success-
ful all-round planner was simply a portfolio of existing plan-
ners (Fern, Tadepalli, and Khardon 2008).
In this paper we take a different approach to learning that
is inspired by research in computational linguistics, and in
particular by the task of word prediction. In word prediction,
given a partial sentence, the system predicts the next likely
word to extend the sentence. Most of us have seen word
prediction technology, which is common on mobile devices
and often used by keyboard users with physical or cognitive
disabilities. This predictive technology is often based on
n-gram analysis (Garay-Vitoria and Abascal 2006).
Copyright © 2009, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
N-gram analysis provides a means of probabilistically
predicting the next item in a sequence. Due originally to
Shannon, it has proven an effective technique for word pre-
diction in natural language processing and for gene sequence
analysis. Here, we investigate the utility of n-gram analy-
sis in predicting operator sequences in plans. Given a set
of sample plans, we perform n-gram analysis to predict the
likelihood of subsequent operators, relative to a partial plan.
An n-gram is a subsequence of n items (e.g., letters,
words, base pairs, plan operators) from a sequence. An n-
gram of size 1 is a unigram, of size 2 is a bigram, and of
size 3 is a trigram. When performing n-gram analysis, one
defines both the size of the subsequence (the window size)
and the size of n. If the window size is greater than n, wild-
card items may be interspersed among grounded ones. The
analysis results in statistics regarding the frequency of oc-
currence of n-grams within subsequences of a corpus.
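To make the roles of n and the window size concrete, the following is a minimal Python sketch of windowed n-gram counting over operator-name sequences. It is an illustration under stated assumptions, not the statistics package used in this work: the names `windowed_ngrams` and `count_ngrams`, the `*` wildcard symbol, and the choice to anchor each n-gram at its first grounded item are all hypothetical.

```python
from collections import Counter
from itertools import combinations

WILDCARD = "*"  # assumed symbol for a skipped (ungrounded) position


def windowed_ngrams(seq, n, window):
    """Yield every n-gram whose n grounded items fit inside `window` positions.

    Each n-gram is anchored at its first grounded item; positions skipped
    between grounded items are rendered as wildcards.
    """
    for start in range(len(seq)):
        # Offsets (relative to `start`) of the remaining n - 1 grounded items.
        for rest in combinations(range(1, window), n - 1):
            last = rest[-1] if rest else 0
            if start + last >= len(seq):
                continue  # the n-gram would run past the end of the plan
            chosen = {0, *rest}
            yield tuple(
                seq[start + off] if off in chosen else WILDCARD
                for off in range(last + 1)
            )


def count_ngrams(plans, n, window):
    """Count windowed n-grams over a corpus of operator-name sequences."""
    counts = Counter()
    for plan in plans:
        counts.update(windowed_ngrams(plan, n, window))
    return counts


if __name__ == "__main__":
    plan = ["move", "pickup-laser", "move", "fire-laser"]
    for ngram, freq in count_ngrams([plan], n=2, window=3).items():
        print(ngram, freq)
```

With n = 2 and a window of 3, the sketch counts both contiguous bigrams and bigrams separated by one wildcard position.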
Certainly the idea of using n-gram analysis seems intu-
itive and compelling, but there are a number of challenges
to its use. What is the optimal window and n-gram size?
What is the analogue of a word? Is it a lifted operator
name or a propositional representation of a grounded oper-
ator? If lifted, how are arguments encoded and how are the
relationships between arguments in sequences of operators
exploited? And how do we integrate this information into
subsequent planning? We might, for example, enhance the
quality of a search heuristic with our predictive information;
we might use these statistics to define a preferred operator
ordering for operator selection in enforced hill-climbing
(EHC) or A* search; or we might use n-gram statistics to
suggest macro operators that could be used to augment the
set of operators used for planning.
In this paper, we elected to exploit lifted representa-
tions of operators for our original n-gram analysis, resulting
in the determination of high frequency operator sequences
and/or orderings. We then postprocessed our n-grams to
discover argument relationships. The result is a set of high-
occurrence patterns: a notion which refers to a sequence of
operators plus zero or more equality constraints between the
arguments of the operators. We use these patterns to match
with the plan history during the EHC phase of FF (Hoff-
mann and Nebel 2001). By matching the patterns with the
history and possible next operators, we construct a set of fa-
miliar operators to investigate that take precedence over the
helpful actions generated by FF.
In the sections to follow, we describe our approach and
present preliminary results, which demonstrate the promise
of n-gram analysis as a tool for improving planning perfor-
mance. We conclude with some discussion and future work.
Approach
We divide the presentation of our approach into two parts. The
first corresponds to the learning phase, in which we extract
statistical information from a set of plans and then compile
this information into patterns that can be exploited by the
planner. The second is the planning phase, in which our planner,
a modification of FF, exploits the learned patterns.
Next we describe the two components.
Learning Phase
The learning phase of our approach involves a number of
individual steps, which are described below.
1. Solve the bootstrap problem set. Problems are solved
using the Lama planner (Richter, Helmert, and Westphal
2008), run until it produces the best possible solution.1
2. Convert each of the discovered plans into sequences of
operator names.
3. Run n-gram analysis on the operator-name sequences.
Here we run a standard n-gram statistics package (Baner-
jee and Pedersen 2003) on the sequences of operator
names with various n-gram lengths and window sizes. In
our experiments, the lengths we chose to investigate range
from 2 to 5 and the window sizes range from 2 to 10.
4. Construct patterns and pattern scores for each of the top
n-grams. Patterns are the central piece of information that
is exploited by the planner. Intuitively, they describe fam-
ilies of sequences of operators and corresponding argu-
ment constraints that seem to appear very frequently in
solutions. Further elaboration is provided below.
5. Determine the best settings for our planner. A setting is a
collection of patterns together with a parameter that esti-
mates the number of familiar operators the planner should
use. These settings are produced by solving the target
problems using our planner, and determining which set-
tings are most effective in solving as many problems as
possible. (Further details follow.)
6. Generate an experimental strategy. To maximize the num-
ber of problems solved, our planner will attempt to use not
one, but possibly several of the best settings computed in
the previous step. During experimental evaluation of our
planner, each setting is tried for 120 seconds and then the
planner switches to the next best setting. The order in
which the settings are tried is computed with a greedy al-
gorithm that attempts to maximize coverage.
1 Quite often Lama is able to prove optimality with problems of
this size.
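The greedy ordering mentioned in Step 6 can be sketched as follows. This is a minimal illustration under our own assumptions: the text states only that a greedy algorithm attempts to maximize coverage, so the function name `greedy_setting_order`, the input format (a mapping from each setting to the set of training problems it solved), and the stopping rule are hypothetical.

```python
def greedy_setting_order(solved_by):
    """Order settings greedily by how many new problems each one covers.

    `solved_by` maps a setting name to the set of problems it solved.
    Settings that add no new coverage are left out of the ordering.
    """
    remaining = dict(solved_by)
    covered = set()
    order = []
    while remaining:
        # Pick the setting that solves the most not-yet-covered problems.
        best = max(remaining, key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:
            break  # no remaining setting improves coverage
        order.append(best)
        covered |= remaining.pop(best)
    return order


if __name__ == "__main__":
    solved = {"A": {1, 2, 3}, "B": {3, 4}, "C": {1, 2}}
    print(greedy_setting_order(solved))
```

In this toy input, setting C is dropped because the problems it solves are already covered by A.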
Constructing Patterns After retrieving the n-gram statistics,
we must compile the information into a form that can
be used within a planner. Using only the operator sequences,
however, is not enough. To illustrate this point, consider the
N-Puzzle domain, where the only operator available is move.
The n-grams generated from a corpus in which only a single
operator appears (i.e., ‘move’) provide little value. To extract
more information from the generated plans, we consider the
relationships among the arguments of the operators in the n-gram.
For each of the most frequent n-grams, we find all of
their occurrences in the generated plans and generalize the
grounded arguments. For example, consider the n-gram
⟨move, move⟩ in a planning domain where movement can
occur between locations connected to one another. When
examining the previous plans, we may find the following
occurrence, which matches the n-gram:
move(loc_a, loc_b), move(loc_b, loc_c)
By forgetting the operator names and listing the arguments
from left to right, we can see that the second and third
arguments are the same (‘loc_b’). Each set of argument indices
that refer to the same object is referred to as a group, and the
set of argument groups, [1][2,3][4], along with the associated
n-gram, constitutes a pattern. Each n-gram yields a set of patterns.
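The grouping step described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `argument_groups` and `format_groups` and the representation of an occurrence as a list of (operator name, arguments) pairs, with `None` standing for a wildcard position, are assumptions made for the example.

```python
def argument_groups(occurrence):
    """Return the argument groups of one grounded occurrence of an n-gram.

    `occurrence` is a list of (operator_name, args) pairs; a wildcard
    position is given as None and contributes no argument indices.
    Arguments are numbered from 1, left to right, across all operators.
    """
    positions = {}  # object name -> list of argument indices
    index = 0
    for step in occurrence:
        if step is None:
            continue
        _, args = step
        for arg in args:
            index += 1
            positions.setdefault(arg, []).append(index)
    return sorted(positions.values())


def format_groups(groups):
    """Render groups in the bracket notation used in the text."""
    return "".join("[" + ",".join(map(str, g)) + "]" for g in groups)


if __name__ == "__main__":
    occ = [("move", ["loca", "locb"]), ("move", ["locb", "locc"])]
    print(format_groups(argument_groups(occ)))
```

Counting how often each (n-gram, groups) pair occurs across the plans would then give the frequency of each pattern.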
For use within the FF planner, we associate a score with each
pattern. This score is then used to choose which pattern
to apply first when generating familiar operators. The
scoring procedure sorts all patterns, regardless of their
associated n-grams, using a generic comparison function.
The score of a pattern is its rank in this sorted list of all
patterns. The comparison function we used applies a series
of tie-breaks to two patterns:
1. Prefer the pattern with higher frequency.
2. Tie-break: Prefer the pattern with more operators.
3. Tie-break: Prefer the pattern with fewer groups.
4. Tie-break: Prefer the pattern with more arguments.
5. Tie-break: Randomly pick one pattern.
Intuitively, we want to give a higher score to more frequent
patterns. If two patterns have equal frequencies, then
we prefer the one with more grounded operators2 since having
more operators indicates that more of the pattern has been
matched. If these counts are equal as well, we choose the
pattern with the smaller number of groups, as this captures
more information about how the arguments relate to one
another. Finally, we select the pattern with the greater number
of arguments: if both patterns have the same number of groups,
then more arguments indicate more information about argument
relations.
This is not the only possible scoring function, and we
hope to investigate further options in future work.
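The tie-breaking comparison can be expressed as a sort key. The sketch below is illustrative and rests on assumptions not fixed by the text: the `Pattern` class and its fields are hypothetical, `*` is assumed to denote a wildcard, and the rank is assigned so that the most preferred pattern receives the largest score.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    ngram: tuple    # operator names, "*" for a wildcard
    groups: tuple   # tuple of tuples of argument indices
    frequency: int


def preference_key(p, rng):
    """Larger keys are preferred; fields follow the tie-break order."""
    grounded = sum(1 for op in p.ngram if op != "*")  # wildcards not counted
    n_args = sum(len(g) for g in p.groups)
    return (
        p.frequency,      # 1. higher frequency
        grounded,         # 2. more operators
        -len(p.groups),   # 3. fewer groups
        n_args,           # 4. more arguments
        rng.random(),     # 5. random choice
    )


def score_patterns(patterns, seed=0):
    """Map each pattern to its rank; the best pattern gets the highest rank."""
    rng = random.Random(seed)
    ranked = sorted(patterns, key=lambda p: preference_key(p, rng))
    return {p: rank for rank, p in enumerate(ranked, start=1)}


if __name__ == "__main__":
    p1 = Pattern(("pickup-laser", "*", "fire-laser"), ((1, 2),), 10)
    p2 = Pattern(("move", "move"), ((1,), (2, 3), (4,)), 10)
    for pattern, score in score_patterns([p1, p2]).items():
        print(score, pattern.ngram)
```

Because all patterns are sorted together, scores are comparable across patterns that come from different n-grams.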
Planning Phase: Using Patterns in FF
The automated planner FF (Hoffmann and Nebel 2001) operates
in two distinct phases: EHC and best-first search.
While there are a number of ways we could incorporate the
use of patterns in FF, we decided to augment the set of
successor states considered in the EHC stage. When a state is
evaluated by the relaxed planning graph heuristic,3 a number
of helpful actions are generated for expanding the EHC
search frontier. In many situations, however, this frontier is
not sufficient to discover valid plans.
2 Note that wildcard operators are not counted here.
When expanding a state S, our planner takes the history
of operators that led to S and suggests familiar operators
based on the highest-scoring patterns that match the history.
Familiar operators play a similar role to helpful actions.
The number of familiar operators added to the frontier is a
run-time setting (as described in Step 5 above), and this
value is tuned on a per-domain basis to find the best bound.
Having too many may cause the search frontier to become too
large, yielding poor performance. In general, we tested
settings in the range of 1 to 15.
Note that a setting of 0 causes FF to run without any change.
The familiar operators are a subset of all the applicable
operators from the current state S. For each applicable
operator A, we define seq(A,S) as history(S) ◦ A, i.e., the
sequence of operators leading to S, followed by the operator A.
We then define the score of A to be:
$$\mathrm{score}(A) = \max_{p \in P_{seq(A,S)}} \mathrm{score}(p),$$
where $P_{seq(A,S)}$ is the set of patterns that match the
sequence seq(A,S). If |seq(A,S)| is smaller than the length of a
pattern, the pattern does not count as matching. Additionally, if
|seq(A,S)| is larger than the length k of a pattern, only the
last k operators of seq(A,S) are used for comparison.
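The matching rule and the score of an operator can be sketched as follows. This is a minimal Python illustration, not the modified FF code: the names `matches`, `score_operator`, and `familiar_operators`, the encoding of a pattern as an (operators, groups) pair, and the reading of a group as "all indexed arguments name the same object" are assumptions made for the example.

```python
WILDCARD = "*"  # assumed wildcard symbol


def matches(pattern_ops, groups, seq):
    """Check whether a pattern matches the last k operators of `seq`.

    `seq` is a list of (operator_name, args) pairs; `groups` lists
    1-based argument indices over the grounded (non-wildcard) operators.
    """
    k = len(pattern_ops)
    if len(seq) < k:
        return False  # a sequence shorter than the pattern never matches
    args = []
    for op, (name, op_args) in zip(pattern_ops, seq[-k:]):
        if op == WILDCARD:
            continue  # any single operator matches a wildcard
        if op != name:
            return False
        args.extend(op_args)
    if sum(len(g) for g in groups) != len(args):
        return False
    # Every index in a group must refer to the same object.
    return all(len({args[i - 1] for i in g}) == 1 for g in groups)


def score_operator(candidate, history, scored_patterns):
    """score(A): the maximum score over patterns matching history + [A]."""
    seq = history + [candidate]
    scores = [
        s for (ops, groups), s in scored_patterns.items()
        if matches(ops, groups, seq)
    ]
    return max(scores, default=None)


def familiar_operators(applicable, history, scored_patterns, limit):
    """Return up to `limit` applicable operators, best score first."""
    scored = [(score_operator(a, history, scored_patterns), a) for a in applicable]
    scored = [(s, a) for s, a in scored if s is not None]
    scored.sort(key=lambda sa: sa[0], reverse=True)
    return [a for _, a in scored[:limit]]
```

Here `limit` plays the role of the run-time setting that bounds the number of familiar operators; with `limit` set to 0 the function returns an empty list, so nothing is added to the frontier.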
Example To illustrate this procedure, consider the scenario
in Figure 1 from the Gold Miner domain. Here, we are
performing EHC and have arrived at state S after going through
states 1, 2, and 3 via the operators ‘move’, ‘pickup-laser’,
and ‘move’. For the sake of this example, assume that neither
‘move’ nor ‘fire-laser’ appears in the set of helpful actions,
but that both are applicable in state S. Assume further that the
following patterns exist:
p1 ≡ ⟨pickup-laser, ∗, fire-laser⟩ : [1,2]
p2 ≡ ⟨move, move⟩ : [1][2,3][4]
p3 ≡ ⟨move, ∗, ∗, move⟩ : [1][2][3][4]
Pattern p1 is of length three and has a single wildcard (any
single operator matches it), p2 is of length two, and
p3 is of length four with two wildcards. The arguments of
pickup-laser and fire-laser refer to the same object (since
there is only one group), and the argument grouping for p2
ensures that movement is through some specific location (the
second group). For p3, there is no relation between the
arguments of the two move operators. In the framework described
above, we have the following realization:4
• history(S) = move, pickup-laser, move
• seq(fire-laser, S) = move, pickup-laser, move, fire-laser
• seq(move, S) = move, pickup-laser, move, move
• $P_{seq(\text{fire-laser},S)}$ = {p1}
• $P_{seq(\text{move},S)}$ = {p2, p3}
• score(fire-laser) = score(p1)
• score(move) = max{score(p2), score(p3)}
3 It is assumed that the reader is familiar with the relaxed
planning graph heuristic, EHC, and FF in general.
4 For ease of readability, we suppress listing arguments and
assume the patterns match.
Figure 1: EHC Search in the Gold Miner domain. Solid
lines after state S represent helpful actions and dashed lines
represent familiar operators.
Once the frontier has been constructed, the EHC will in-
vestigate the heuristic value of each state on the frontier and
commit to one if it has a better value than S. The order in
which these states should be investigated can have an impact
on the efficiency of the planner, so we chose to evaluate the
familiar operators prior to the helpful actions.
If no state in the frontier is found to have a better heuristic
value, the search continues to a deeper level. It is interesting
to note that states in a level deeper than two may have been
reached from applications of both familiar operators and
helpful actions. Because of the ordering we place on frontier
evaluation, at level k in the breadth-first search we evaluate
states from the most familiar (having been reached by
applying familiar operators only) to the most helpful (having
been reached by applying helpful actions only). Figure 1
shows the spectrum of states that are investigated, with
lighter shades representing familiar operators and
darker shades representing helpful actions.
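One way to realize this ordering is sketched below in Python. The text fixes only the two ends of the spectrum (familiar-only paths first, helpful-only paths last), so ranking the mixed paths by the number of helpful-action steps is an assumption of this sketch, as are the name `order_frontier` and the state representation.

```python
def order_frontier(states):
    """Order frontier states from most familiar to most helpful.

    Each state is a (state_id, path_kinds) pair, where path_kinds records,
    for every step on the path from S, whether it was a "familiar" operator
    or a "helpful" action. States with fewer helpful steps come first;
    Python's stable sort keeps the generation order among ties.
    """
    return sorted(states, key=lambda s: sum(kind == "helpful" for kind in s[1]))


if __name__ == "__main__":
    level_two = [
        ("s1", ("helpful", "helpful")),
        ("s2", ("familiar", "helpful")),
        ("s3", ("familiar", "familiar")),
        ("s4", ("helpful", "familiar")),
    ]
    print([state_id for state_id, _ in order_frontier(level_two)])
```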
Since we only modify the EHC phase of FF, if it fails to
find a solution in this phase our approach behaves exactly as
FF normally would.
Experimental Results
To evaluate our approach, we reproduced the experimental
framework used in the IPC-2008 Learning Track, with the
same problem sets used during the contest: 30 bootstrap and
30 target problems for training, and 30 target problems for
testing. FF was run with and without the learned patterns
(FF with pattern usage enabled is referred to as FF∗).
Three primary metrics are associated with each planner in
the contest: time, quality (plan length), and success rate.
For each problem p, the best time and quality found by any
planner is referred to as T ∗p and N∗p respectively. The per-
formance of any particular planner on p is denoted as Tp and
Np, and its score for the problem is calculated to be T ∗p /Tp
and N∗p /Np. The overall time and quality metric for a plan-
ner is the sum over all of the individual problem scores.5 Fi-
5Note that the maximum score a planner can have for a partic-
ular problem is 1.
ber of helpful actions are generated for expanding the EHC
search frontier. In many situations, however, this frontier is
not sufficient to discover valid plans.
When expanding a state S, our planner takes the history
of operators that lead to S and suggests familiar operators
based on patterns with the highest score that match the his-
tory. Familiar operators play a similar role to helpful ac-
tions, and the number of familiar operators that should be
added to the frontier is a run-time setting (as described in
Step 5 above), and this value is considered on a per-domain
basis to find the optimal bound. Having too many may cause
the search frontier to become too large, yielding poor perfor-
mance. In general, we tested settings in the range of 1 to 15.
Note that a setting of 0 causes FF to run without any change.
The familiar operators are a subset of all the applicable
operators from the current state S. For each applicable op-
erator A, we define seq(A,S) as history(S) ◦ A, i.e. the
sequence of operators that correspond to the operators lead-
ing to S, plus the operator A. We then associate a score with
A to be:
score(A) = max
p∈Pseq(A,S)
score(p),
where Pseq(A,S) are the patterns that match the sequence
seq(A,S). If |seq(A,S)| is smaller than the length of the
pattern, it does not count as matching. Additionally, if
|seq(A,S)| is larger than the length k of a pattern, only the
last k operators of seq(A,S) are used for comparison.
Example To illustrate this procedure, consider the scenario
in Figure 1 from the Gold Miner domain. Here, we are per-
forming EHC and have arrived at state S after going through
states 1, 2, and 3 via the operators ‘move’, ‘pickup laser’,
and ‘move’. For the sake of this example, assume that nei-
ther ‘move’ nor ‘fire-laser’ appear in the set of helpful ac-
tions, but are applicable in state S. Assume further that the
following patterns exist:
p1 ≡ 〈pickup laser , ∗,fire laser〉 : [1, 2]
p2 ≡ 〈move,move〉 : [1][2, 3][4]
p3 ≡ 〈move, ∗, ∗,move〉 : [1][2][3][4]
Pattern p1 is of length three and has a single wildcard (any
single operator matches this), p2 is only of length two, and
p3 is of length four with two wildcards. The arguments for
pickup laser and fire laser refer to the same object (since
there is only one group), and the argument grouping for p2
ensures that movement is through some specific location (the
second group). For p3, there is no relation between the two
move operators. In the framework described above, we have
the following realization:4
• history(S) = move, pickup laser ,move
• seq(fire laser , S) = move, pickup laser ,move,fire laser
• seq(move, S) = move, pickup laser ,move,move
• Pseq(fire laser ,S) = {p1}
3It is assumed that the reader is familiar with the relaxed plan-
ning graph heuristic, EHC, and FF in general.
4For ease of readability, we suppress listing arguments and as-
sume the patterns match.
Figure 1: EHC Search in the Gold Miner domain. Solid
lines after state S represent helpful actions and dashed lines
represent familiar operators.
• Pseq(move,S) = {p2, p3}
• score(fire laser) = score(p1)
• score(move) = max {score(p2), score(p3)}
Once the frontier has been constructed, the EHC will in-
vestigate the heuristic value of each state on the frontier and
commit to one if it has a better value than S. The order in
which these states should be investigated can have an impact
on the efficiency of the planner, so we chose to evaluate the
familiar operators prior to the helpful actions.
If no state in the frontier is found to have a better heuristic
value, the search continues to a deeper level. It is interesting
to note that states in a level deeper than two may have been
reached from applications of both familiar operators and
helpful actions. Because of the ordering we place on fron-
tier evaluation, at level k in the breadth first search we eval-
uate states from the most familiar (having been reached by
applying familiar operators only) to the most helpful (hav-
ing been reached by applying helpful actions only). Fig-
ure 1 demonstrates the spectrum of states that are investi-
gated with lighter shades representing familiar operators and
darker shades representing helpful actions.
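The spectrum can be illustrated by labelling each edge on a path as familiar (`F`) or helpful (`H`) and ordering the paths at depth k. Ordering mixed paths by their number of helpful edges is an assumption of this sketch; the text only fixes the two extremes.

```python
from itertools import product

def level_order(k):
    """Edge-type paths of depth k, from most familiar to most helpful."""
    paths = product("FH", repeat=k)
    # sorted() is stable, so paths with equal counts keep enumeration order.
    return sorted(paths, key=lambda path: path.count("H"))
```

For k = 2 this yields the all-familiar path first, then the two mixed paths, then the all-helpful path.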
Since we only modify the EHC phase of FF, if it fails to
find a solution in this phase our approach behaves exactly as
FF normally would.
Experimental Results
To evaluate our approach, we reproduced the experimental
framework used in the IPC-2008 Learning Track, with the
same problem sets used during the contest: 30 bootstrap and
30 target problems for training, and 30 target problems for
testing. FF was run with and without the learned patterns
(FF with the pattern usage enabled is referred to as FF∗).
Three primary metrics are associated with each planner in
the contest: time, quality (plan length), and success rate.
For each problem p, the best time and quality found by any
planner is referred to as T*_p and N*_p respectively. The
performance of any particular planner on p is denoted as T_p and
N_p, and its score for the problem is calculated to be T*_p / T_p
and N*_p / N_p. The overall time and quality metric for a planner
is the sum over all of the individual problem scores.5 Finally,
the success rate is the percentage of problems solved
in the given time limit of 15 minutes per problem. All
experiments were conducted on a Linux desktop with a Dual
Core 2.13GHz processor and 2GB of memory.
5 Note that the maximum score a planner can have for a
particular problem is 1.
              Time Metric      Quality Metric    Success Rate
Domain        FF      FF∗      FF      FF∗       FF     FF∗
Gold Miner    0.001   19.19    6.86    30        0.26   1.0
Matching      0.13    0.11     7.42    7.41      0.33   0.33
N-Puzzle      0.22    0.03     10.16   8.29      0.53   0.43
Parking       0.03    0.03     4.04    4.04      0.23   0.23
Sokoban       0.14    7.41     10.22   11.83     0.43   0.43
Thoughtful    0.84    1.20     10.26   16.69     0.36   0.6
Overall       1.37    27.97    48.97   78.27     0.36   0.51
Delta         +26.60           +29.30            +0.15
(O. Wedge)    (+36.05)         (+29.02)          (+0.17)
Table 1: Time, Quality, and Success Rate Metrics of FF∗ and
FF. For all metrics, larger numbers are better. Values for Obtuse
Wedge were obtained using a slightly faster machine.
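The time and quality metrics described above can be computed as in the following sketch. Treating an unsolved problem as contributing 0 to the sum is an assumption of this example, and the function names and data layout are hypothetical.

```python
def metric(best, own):
    """Sum over problems of best/own (e.g. T*_p / T_p or N*_p / N_p).

    `best` maps each problem to the best value found by any planner;
    `own` maps each problem to this planner's value, or None if unsolved.
    """
    total = 0.0
    for problem, best_value in best.items():
        value = own.get(problem)
        if value is not None:
            total += best_value / value  # at most 1 per problem
    return total

def success_rate(own):
    """Fraction of problems for which the planner produced a value."""
    solved = sum(1 for value in own.values() if value is not None)
    return solved / len(own)
```

For instance, a planner that takes 4.0s on a problem whose best known time is 2.0s earns 0.5 for that problem, and a planner that matches the best time earns 1.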
Unfortunately, most of the planners that were entered in
this track are not publicly available. To compute our time
and quality metrics, we compared to the raw data generated
by the planners in the competition, giving us a lower bound
on the score since the hardware configuration we used was
slower than the contest machines.
Time, quality, and success rate metrics for the six domains
used in the Learning Track are given in Table 1. For one
domain, Parking, we were unable to learn any patterns since
every configuration tested by the automated process caused
the planner to fail in the EHC phase – thus results shown are
for vanilla FF (the fall-back solution to our approach).
FF∗ displayed impressive performance along some di-
mensions and only modest performance along others. Most
impressive are our Delta values – values that measure the
effect of learning on the performance of our planner. Ta-
ble 1 lists our Delta values for time, quality and success
rate, as well as those of Obtuse Wedge (Yoon, Fern, and
Givan 2008), recipient of the Best Learner Award. The Delta
values indicate that our planner's results were competitive
with those of Obtuse Wedge, even in the limited setting of slightly
slower hardware (the competition machine ran at 2.4GHz).
Our approach to learning is significantly different from that
of Obtuse Wedge. We conjecture that combining both
techniques may provide even better results.
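The Delta row of Table 1 is consistent with subtracting the Overall FF score from the Overall FF∗ score for each metric. That reading is inferred from the reported numbers rather than stated explicitly, and the check below simply redoes the arithmetic.

```python
# Overall scores copied from Table 1 as (FF, FF*) pairs.
overall = {
    "time": (1.37, 27.97),
    "quality": (48.97, 78.27),
    "success": (0.36, 0.51),
}

# Delta is assumed to be the FF* score minus the FF score.
delta = {name: round(ff_star - ff, 2)
         for name, (ff, ff_star) in overall.items()}
```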
Further analysis of the IPC-2008 statistics indicates that
while most planners’ learning components negatively im-
pacted quality, ours had a positive impact, showing improve-
ment in both quality and success rate metrics. These obser-
vations lead us to conjecture that perhaps the other planners’
learning components generated too much overhead, result-
ing in an inability to solve problems within the allotted time.
FF∗ did not suffer this fate.
Our planner did have its shortcomings. In general FF∗
scored in the middle of the pack, or slightly better relative to
the IPC-2008 planners for time, quality, and success rate. It
excelled in a few domains, but was unexceptional in others.
Nevertheless, given the preliminary nature of this work, the
fact that our Delta values characterize us as one of the top
learners relative to all other IPC-2008 entrants does much to
demonstrate the promise of this line of research.
Discussion and Future Work
In this paper we presented a new perspective on planning
strategies through the lens of computational linguistics. By
treating partial plans as incomplete sentences, we showed
how word prediction techniques can help discover effective
operators during forward-search planning. We introduced
the concept of a pattern which extends the notion of an n-
gram to include argument relations. With the n-gram of op-
erator names, and argument restrictions of a pattern, we sug-
gest a set of familiar operators for the EHC search to choose
from. Preliminary results are encouraging, and indicate that
this approach is a powerful learning technique.
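One way to picture a pattern that combines an n-gram of operator names with argument relations is sketched below. The representation (one group label per argument position, `None` for an unconstrained argument, `*` for a wildcard operator) and the object names are assumptions made for illustration, not the encoding used in our implementation.

```python
WILDCARD = "*"

def pattern_matches(pattern, steps):
    """Check operator names and argument groups against a plan fragment.

    pattern: sequence of (operator name, group label per argument)
    steps:   sequence of (operator name, argument tuple)
    Arguments that share a group label must be the same object.
    """
    if len(pattern) != len(steps):
        return False
    bound = {}  # group label -> object it was first bound to
    for (p_name, groups), (s_name, args) in zip(pattern, steps):
        if p_name == WILDCARD:
            continue  # any single operator is accepted here
        if p_name != s_name or len(groups) != len(args):
            return False
        for group, arg in zip(groups, args):
            if group is not None and bound.setdefault(group, arg) != arg:
                return False
    return True

# Hypothetical pattern: the laser that is picked up is the one fired.
laser_pattern = [("pickup laser", ("g1",)), (WILDCARD, ()),
                 ("fire laser", ("g1",))]
```

A fragment that picks up object `x` and later fires `x` matches this pattern, while one that fires a different object does not.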
The interpretation of serialized plans as sentences in a lan-
guage reveals a number of interesting avenues to investigate.
We made the assumption that the ‘sentences’ are formed
only from the operator names. This can be generalized to
include other aspects of a planning problem such as state,
goals, and landmarks. Using a richer source of information
as input to the n-gram analysis may lead to better results.
Another intriguing idea we hope to investigate is the
relationship between our approach and macro operators. In
contrast to macro operator approaches, our method matches
patterns against the plan history and applies only a single
operator to obtain a successor. An investigation of the use of
patterns in a way more similar to a macro operator, in which we
would apply more than one operator to generate a successor, is an
interesting avenue of future research.
The applicability of our approach need not be confined
to IPC planning domains. Further extensions may be realized
that allow a lifelong robot companion (e.g. Thrun and
Mitchell 1995) to learn habitual patterns in everyday routines,
even allowing those patterns to change over time.
Given sufficient background behaviour, plans being executed
can be monitored for discrepancies, observed as statistical
anomalies relative to the known n-gram statistics.
Many aspects of plan monitoring could also benefit from the
use of n-gram analysis on operator sequences.
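As a rough illustration of the monitoring idea, the sketch below learns bigram statistics from sample operator sequences and flags transitions in an executing plan whose estimated probability falls below a threshold. The use of bigrams, the threshold value, and the operator names are all assumptions of this example.

```python
from collections import Counter

def bigram_model(plans):
    """Count operator pairs and how often each operator has a successor."""
    pair_counts = Counter()
    first_counts = Counter()
    for plan in plans:
        for prev, nxt in zip(plan, plan[1:]):
            pair_counts[(prev, nxt)] += 1
            first_counts[prev] += 1
    return pair_counts, first_counts

def anomalies(plan, model, threshold=0.1):
    """Return (position, prev, next) for transitions rarer than `threshold`."""
    pair_counts, first_counts = model
    flagged = []
    for i, (prev, nxt) in enumerate(zip(plan, plan[1:]), start=1):
        seen = first_counts[prev]
        prob = pair_counts[(prev, nxt)] / seen if seen else 0.0
        if prob < threshold:
            flagged.append((i, prev, nxt))
    return flagged
```

A routine that matches the learned statistics produces no flags, while an unseen transition is reported together with its position in the plan.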
Acknowledgements The authors gratefully acknowledge
funding from NSERC and Ontario’s ERA program.
References
Banerjee, S., and Pedersen, T. 2003. The Design, Implementation
and Use of the Ngram Statistics Package. In Proceedings of
the Fourth International Conference on Intelligent Text Processing
and Computational Linguistics, 370–381.
Fern, A.; Tadepalli, P.; and Khardon, R. 2008. Results of IPC
2008: Learning track.
Garay-Vitoria, N., and Abascal, J. 2006. Text prediction systems:
a survey. Univers. Access Inf. Soc. 4(3):188–203.
Hoffmann, J., and Nebel, B. 2001. The FF planning system: Fast
plan generation through heuristic search. JAIR 14:253–302.
Richter, S.; Helmert, M.; and Westphal, M. 2008. Landmarks
revisited. In AAAI, 975–982.
Thrun, S., and Mitchell, T. 1995. Lifelong robot learning.
Robotics and Autonomous Systems 15(1):25–46.
Yoon, S.; Fern, A.; and Givan, R. 2008. Learning Control Knowledge
for Forward Search Planning. JMLR 9:683–718.