A Neural Network Model for Inter-problem Adaptive Online Time Allocation

by Matteo Gagliolo, Jürgen Schmidhuber
Artificial Neural Networks: Formal Models and Their Applications, ICANN 2005, 15th International Conference, Warsaw, Poland, September 11-15, 2005, Proceedings, Part 2 (2005)

Abstract

One aim of Meta-learning techniques is to minimize the time needed for problem solving, and the effort of parameter hand-tuning, by automating algorithm selection. The predictive model of algorithm performance needed for this task often requires long training times. We address the problem in an online fashion, running multiple algorithms in parallel on a sequence of tasks, continually updating their relative priorities according to a neural model that maps their current state to the expected time to the solution. The model itself is updated at the end of each task, based on the actual performance of each algorithm. Censored sampling allows us to train the model effectively, without need of additional exploration after each task's solution. We present a preliminary experiment in which this new inter-problem technique learns to outperform a previously proposed intra-problem heuristic.

Available from www.idsia.ch
Matteo Gagliolo (1) and Jürgen Schmidhuber (1,2)
(1) IDSIA, Galleria 2, 6928 Manno-Lugano, Switzerland
(2) TU Munich, Boltzmannstr. 3, 85748 Garching, München, Germany
{matteo,juergen}@idsia.ch
1 Problem statement
A typical machine learning scenario involves a (possibly inexperienced) practitioner
trying to cope with a set of problems, that could be solved, in principle, using one
element of a set of available algorithms. While most users still solve such dilemmas
by trial and error, or by blindly applying some unquestioned rule-of-thumb, the steadily
growing area of Meta-Learning [1] research is devoted to automating this process. Apart
from a few notable exceptions (e.g. [2,3,4,5], see [6], of which we adopt the notation
and terminology, for a commented bibliography), most existing techniques amount to
the selection of a single candidate solver (e.g. Algorithm recommendation [7]), or a
small subset of the available algorithms to be run in parallel with the same priority (e.g.
Algorithm portfolio selection [8]). This approach usually requires a long training phase,
which can be prohibitive if the algorithms at hand are computationally expensive; it also
assumes that the algorithm runtimes can be predicted offline, based on problem features,
and do not exhibit large fluctuations. In more complex cases, where the difficulty of the
problems cannot be precisely predicted a priori, a more robust approach would be to run
the candidate solvers in parallel, adapting their priorities online according to their actual
performance. We termed this Adaptive Online Time Allocation (AOTA) in [6], in which
we further distinguish between intra-problem AOTA, where the prediction of algorithm
performance is made according to some heuristic based on a-priori knowledge about
the algorithm’s behavior; and inter-problem AOTA, in which a time allocation strategy
is learned by collecting experience on a sequence of tasks.
In this work we present an inter-problem approach for training a parametric model
of algorithm runtimes, and give an example of how this model can be used to allocate
time online, comparing its performance with the simple intra-problem heuristic from
[6].
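To make the idea of adapting priorities online concrete, the following is a minimal Python sketch and not the allocation procedure of this paper or of [6]. It assumes toy solvers that finish after a fixed, hidden amount of time, a hypothetical predictor `predict_remaining(i, t_i)` of the residual time, and an assumed priority rule that gives each solver a share of every time slice proportional to the inverse of its predicted residual time. All names are illustrative.

```python
from typing import Callable, List, Optional, Tuple


def allocate_online(
    time_to_solve: List[float],
    predict_remaining: Callable[[int, float], float],
    slice_size: float = 1.0,
    max_total: float = 1e4,
) -> Tuple[Optional[int], float]:
    """Time-slice toy solvers; return (index of first finisher, total time used).

    time_to_solve[i] is the hidden total time solver i needs (toy stand-in).
    predict_remaining(i, t_i) is a hypothetical residual-time predictor.
    """
    n = len(time_to_solve)
    spent = [0.0] * n
    total = 0.0
    while total < max_total:
        # Assumed rule: priority is the inverse of the predicted residual time.
        inv = [1.0 / max(predict_remaining(i, spent[i]), 1e-9) for i in range(n)]
        norm = sum(inv)
        for i in range(n):
            share = slice_size * inv[i] / norm
            spent[i] += share
            total += share
            if spent[i] >= time_to_solve[i]:
                return i, total
    return None, total  # budget exhausted, no solver finished


needed = [10.0, 3.0]
oracle = lambda i, t: needed[i] - t   # perfect predictor (unrealistic)
uniform = lambda i, t: 1.0            # no information: equal shares
winner_oracle, total_oracle = allocate_online(needed, oracle)
winner_uniform, total_uniform = allocate_online(needed, uniform)
```

In this toy setting the faster solver wins under both predictors, but the total time spent is lower with the informative predictor than with equal shares, which is the effect a learned model is meant to approximate.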
2 A parametric model for inter-problem AOTA
Consider a finite algorithm set $A$ containing $n$ algorithms $a_i$, $i \in I = \{1, \ldots, n\}$,
applied to the solution of the same problem and running according to some time
allocation procedure. Let $t_i$ be the time spent on $a_i$; $x_i$ a feature vector, possibly including
information about the current problem, the algorithm $a_i$ itself (e.g. its kind, the values
of its parameters), and its current state $d_i$; $H_i = \{(x_i^{(r)}, t_i^{(r)}),\ r = 0, \ldots, h_i\}$ a set of
collected samples of these pairs; $H = \bigcup_{i \in I} H_i$ the historic experience set relative to the
entire $A$.
In order to allocate machine time efficiently, we would like to map each pair in each
$H_i$ to the time $\tau_i$ still left before $a_i$ reaches the solution. If we are allowed to learn such a
mapping by solving a sequence of related tasks, we can, for a successful algorithm $a_i$
that solved the problem at time $t_i^{(h_i)}$, a posteriori evaluate the correct
$\tau_i^{(r)} = t_i^{(h_i)} - t_i^{(r)}$ for each pair $(x_i^{(r)}, t_i^{(r)})$ in $H_i$.
In a first tentative experiment, which led to poor results,
these values were used as targets to learn a regression from pairs $(x, t)$ to residual time
values $\tau$. The main problem with this approach is which $\tau$ values to choose as targets for
the unsuccessful algorithms. Assigning them heuristically would penalize with high $\tau$
values algorithms that were stopped on the point of solving the task, or give incorrectly
low values to algorithms that cannot solve it; obtaining more exact targets $\tau$ by running
more algorithms until the end would increase the overhead.
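The a posteriori evaluation of the residual times for a successful algorithm can be sketched in a few lines of Python. This is only an illustration: the `Sample` layout and the function name are assumptions, and the history is taken to be ordered by time, with its last entry recorded at the moment of solution.

```python
from typing import List, Tuple

# One history entry: (feature vector x, time spent t). Hypothetical layout.
Sample = Tuple[List[float], float]


def residual_time_targets(history: List[Sample]) -> List[float]:
    """Return tau^(r) = t^(h) - t^(r) for every sample of a successful algorithm.

    Assumes the last sample was taken when the algorithm solved the task.
    """
    t_final = history[-1][1]
    return [t_final - t for _, t in history]


# Toy history: samples taken at t = 0, 2 and 5, with the solution found at t = 5.
history = [([0.0], 0.0), ([0.5], 2.0), ([1.0], 5.0)]
targets = residual_time_targets(history)
```

For the toy history above the targets are 5, 3 and 0. The same subtraction is not meaningful for an algorithm that was stopped before solving the task, since its final recorded time is not a solution time.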
The alternative we present here is inspired by censored sampling for lifetime
distribution estimation [9], and consists in learning a parametric model $g(\tau \mid x_i, t_i; w)$ of
the conditional probability density function (pdf) of the residual time $\tau$. To see how the
model can be trained, imagine we continue the time allocation for a while after the first
algorithm solves the current task, such that we end up having one or more successful
algorithms $a_i$, with indices $i \in I_s \subseteq I$, for whose $H_i$ the correct targets $\tau_i^{(r)}$ can be
evaluated as above. Assuming each $\tau_i^{(r)}$ to be the outcome of an independent
experiment, including $t$ in $x$ to ease notation, if $p(x)$ is the (unknown) pdf of the $x_i^{(r)}$ we can
write the likelihood of $H_i$ as
$$L_{i \in I_s}(H_i) = \prod_{r=0}^{h_i - 1} g(\tau_i^{(r)} \mid x_i^{(r)}; w)\, p(x_i^{(r)}) \tag{1}$$
For the unsuccessful algorithms, the final time value $t_i^{(h_i)}$ recorded in $H_i$ is a lower
bound on the unknown, and possibly infinite, time to solve the problem, and so are the
$\tau_i^{(r)}$, so to obtain the likelihood we have to integrate (1)
$$L_{i \notin I_s}(H_i) = \prod_{r=0}^{h_i - 1} \left[1 - G(\tau_i^{(r)} \mid x_i^{(r)}; w)\right] p(x_i^{(r)}) \tag{2}$$
