Sign up & Download
Sign in

Non-Bayesian particle filters

by Alexandre J Chorin, Xuemin Tu
Arxiv preprint arXiv09052181 (2009)

Abstract

Particle filters for data assimilation in nonlinear problems use "particles" (replicas of the underlying system) to generate a sequence of probability density functions (pdfs) through a Bayesian process. This can be expensive because a significant number of particles has to be used to maintain accuracy. We offer here an alternative, in which the relevant pdfs are sampled directly by an iteration. An example is discussed in detail.

Cite this document (BETA)

Available from arxiv.org
Page 1
hidden

Non-Bayesian particle filters

ar
X
iv
:0
90
5.
21
81
v1
[
ma
th.
NA
]
13
M
ay
20
09
Non-Bayesian particle filters
Alexandre J. Chorin and Xuemin Tu
Department of Mathematics, University of California at Berkeley
and
Lawrence Berkeley National Laboratory
Berkeley, CA, 94720
Abstract
Particle filters for data assimilation in nonlinear problems use “par-
ticles” (replicas of the underlying system) to generate a sequence of
probability density functions (pdfs) through a Bayesian process. This
can be expensive because a significant number of particles has to be
used to maintain accuracy. We offer here an alternative, in which
the relevant pdfs are sampled directly by an iteration. An example is
discussed in detail.
Keywords particle filter, chainless sampling, normalization factor, iter-
ation, non-Bayesian
1 Introduction.
There are many problems in science in which the state of a system must
be identified from an uncertain equation supplemented by a stream of noisy
data (see e.g. [1]). A natural model of this situation consists of a stochastic
differential equation (SDE):
dx = f(x, t) dt+ g(x, t) dw, (1)
where x = (x1, x2, . . . , xm) is an m-dimensional vector, dw is m-dimensional
Brownian motion, f is an m-dimensional vector function, and g is a scalar
(i.e., an m by m diagonal matrix of the form gI, where g is a scalar and I is
the identity matrix). The Brownian motion encapsulates all the uncertainty
in this equation. The initial state x(0) is assumed given and may be random
as well.
As the experiment unfolds, it is observed, and the values bn of a mea-
surement process are recorded at times tn; for simplicity assume tn = nδ,
1
Page 2
hidden
where δ is a fixed time interval and n is an integer. The measurements are
related to the evolving state x(t) by
bn = h(xn) + GWn, (2)
where h is a k-dimensional, generally nonlinear, vector function with k ≤ m,
G is a diagonal matrix, xn = x(nδ), and Wn is a vector whose components
are independent Gaussian variables of mean 0 and variance 1, independent
also of the Brownian motion in equation (1). The task is to estimate x on
the basis of equation (1) and the observations (2).
If the system (1) is linear and the data are Gaussian, the solution can be
found via the Kalman-Bucy filter. In the general case, it is natural to try to
estimate x as the mean of its evolving probability density. The initial state
x is known and so is its probability density; all one has to do is evaluate
sequentially the density Pn+1 of xn+1 given the probability density Pn of xn
and the data bn+1. This can be done by following “particles” (replicas of
the system) whose empirical distribution approximates Pn. In a Bayesian
filter (see e.g [2, 3, 4, 5, 6, 7, 8, 9], one uses the pdf Pn and equation (1) to
generate a prior density, and then one uses the new data bn+1 to generate
a posterior density Pn+1. In addition, one may have to sample backward to
take into account the information each measurement provides about the past
and avoid having too many identical particles. Evolving particles is typically
expensive, and the backward sampling, usually done by Markov chain Monte
Carlo (MCMC), can be expensive as well, because the number of particles
needed can grow catastrophically (see e.g. [10]).
In this paper we offer an alternative to the standard approach, in which
Pn+1 is sampled directly without recourse to Bayes’ theorem and backward
sampling, if needed, is done by chainless Monte Carlo [11]. Our direct sam-
pling is based on a representation of a variable with density Pn+1 by a col-
lection of functions of Gaussian variables parametrized by the support of Pn,
with parameters found by iteration. The construction is related to chainless
sampling as described in [11]. The idea in chainless sampling is to produce
a sample of a large set of variables by sequentially sampling a growing se-
quence of nested conditionally independent subsets. As observed in [12, 13],
chainless sampling for a SDE reduces to interpolatory sampling, as explained
below. Our construction will be explained in the following sections through
an example where the position of a ship is deduced from the measurements
of an azimuth, already used as a test bed in [6, 14, 15].
2
Page 3
hidden
2 Sampling by interpolation and iteration.
First we explain how to sample via interpolation and iteration in a simple
example, related to the example and the construction in [12]. Consider the
scalar SDE
dx = f(x, t)dt+

σ dw; (3)
we want to find sample paths x = x(t), 0 ≤ t ≤ 1, subject to the conditions
x(0) = 0, x(1) = X.
Let N(a, v) denote a Gaussian variable with mean a and variance v. We
first discretize equation (3) on a regular mesh t0, t1, . . . , tN , where tn = nδ,
δ = 1/N , 0 ≤ n ≤ N , with xn = x(tn), and, following [12], use a balanced
implicit discretization [16, 17]:
xn+1 = xn + f(xn, tn)δ + (xn+1 − xn)f ′(xn)δ + W n+1,
where f ′(xn, tn) = ∂f∂xn (xn, tn) and W n+1 is N(0, σ/N). The joint probability
density of the variables x1, . . . , xN−1 is Z−1 exp(−
∑N
0 V i), where Z is the
normalization constant and
Vi =
((1− δf ′)(xn+1 − xn)− δf)2
2σδ
=
(xn+1 − xn − δf/(1− δf ′))2
2σn
,
where f, f ′ are functions of the xj , and σn = σδ/(1 − δf ′)2 (see [18]). One
can obtain sample solutions by sampling this density, e.g. by MCMC, or one
can obtain them by interpolation (chainless sampling), as follows.
Consider first the special case f(x, t) = f(t), so that in particular f ′ = 0.
Each increment xn+1−xn is now a N(an, σ/N) variable, with the an = f(tn)δ
known explicitly. Let N be a power of 2. Consider the variable xN/2. On
one hand,
xN/2 =
N/2

1
(xn − xn−1) = N(A1, V1),
where A1 =
∑N/2
1 an, V1 = σ/2. On the other hand,
X = xN/2 +
N

N/2+1
(xn − xn−1),
3
Page 4
hidden
so that
xN/2 = N(A2, V2),
with
A2 = X −
N−1

N/2+1
an, V2 = V1.
The pdf of xN/2 is the product of the two pdfs; one can check that
exp
(
−(x−A1)
2
2V1
)
exp
(
−(x− A2)
2
2V2
)
=exp
(
−(x− a¯)
2
2v¯
)
exp(−φ),
where v¯ = V1V2V1+V2 , a¯ =
V1A1+V2A2
V1+V2
, and φ = (A2−A1)22(V1+V2) ; e
−φ is the probability of
getting from the origin to X, up to a normalization constant.
Pick a sample ξ1 from the N(0, 1) density; one obtains a sample of xN/2
by setting xN/2 = a¯+

v¯ξ1. Given a sample of xN/2 one can similarly sample
xN/4, x3N/4, then xN/8, x3N/8, etc., until all the xj have been sampled. If
we define ξ = (ξ1, ξ2, . . . , ξN−1), then for each choice of ξ we find a sample
(x1, . . . , xN−1) such that
exp
(
−ξ
2
1 + · · ·+ ξ2N−1
2
)
exp
(
−(X −

n an)2

)
=exp
(
−(x
1 − x0 − a0)2
2σ/N −
(x2 − x1 − a1)2
2σ/N
− · · · − (x
N − xN−1 − aN−1)2
2σ/N
)
, (4)
where the factor exp
(
− (X−
P
n an)
2

)
on the left is the probability of the fixed
end value X up to a normalization constant. In this linear problem, this
factor is the same for all the samples and therefore harmless. One can repeat
this sampling process for multiple choices of the variables ξj ; each sample of
the corresponding set of xn is independent of any previous samples of this
set.
Now return to the general case. The functions f , f ′ are now functions of
the xj . We obtain a sample of the probability density we want by iteration.
First pick Ξ = (ξ1, ξ2, . . . , ξN−1), where each ξj is drawn independently from
4
Page 5
hidden
the N(0, 1) density (this vector remains fixed during the iteration). Make
a first guess x0 = (x10, x20, . . . , xN−10 ) (for example, if X 6= 0, pick x = 0).
Evaluate the functions f, f ′ at xj (note that now f ′ 6= 0, and therefore the
variances of the various increments are no longer constants). We are back in
previous case, and can find values of the increments xn+1j+1−xnj+1 corresponding
to the values of f, f ′ we have. Repeat the process starting with the new
iterate. If the vectors xj converge to a vector x = (x1, . . . , xN−1), we obtain,
in the limit, equation (4), where now on the right side σ depends on n so
that σ = σn, and both an, σn are functions of the final x. The left hand side
of (4) becomes:
exp
(
−ξ
2
1 + · · ·+ ξ2N−1
2
)
exp
(
−(X −

n an)2
2

n σn
)
.
Note that now the factor exp
(
− (X−
P
n an)
2
2
P
n σn
)
is different from sample to sam-
ple, and changes the relative weights of the different samples. In averaging,
one should take this factor as weight, or resample as described at the end of
the following section. In order to obtain more uniform weights, one also can
use the strategies in [11, 12].
One can readily see that the iteration converges if KTM < 1, where K
is the Lipshitz constant of f , T is the length of the interval on which one
works (here T = 1), and M is the maximum norm of the vectors xj+1 −
xj. If this inequality is not satisfied for the iteration above, it can be re-
established by a suitable underrelaxation. One should course choose N large
enough so that the results are converged in N . We do not provide more
details here because they are extraneous to our purpose, which is to explain
chainless/interpolatory sampling and the use of reference variables in a simple
context.
3 The ship azimuth problem.
The problem we focus on is discussed in [6, 14, 15], where it is used to
demonstrate the capabilities of particular Bayesian filters. A ship sets out
from a point (x0, y0) in the plane and undergoes a random walk,
xn+1 = xn + dxn+1,
yn+1 = yn + dyn+1, (5)
5
Page 6
hidden
for n ≥ 0, and with x0 = y0 given, and dxn+1 = N(dxn, σ), dyn+1 =
N(dyn, σ), i.e., each displacement is a sample of a Gaussian random variable
whose variance σ does not change from step to step and whose mean is the
value of the previous displacement. An observer makes noisy measurements
of the azimuth arctan(yn/xn), recording
bn = arctan y
n
xn + N(0, s). (6)
where the variance s is also fixed; here the observed quantity b is scalar and
is not be denoted by a boldfaced letter. The problem is to reconstruct the
positions xn = (xn, yn) from equations (5,6). We take the same parameters as
[6]: x0 = 0.01, y0 = 20, dx1 = 0.002, dy1 = −0.06, σ = 1 · 10−6, s = 25 · 10−6.
We follow numerically M particles, all starting from X0i = x0, Y 0i = y0,
as described in the following sections, and we estimate the ship’s position
at time nδ as the mean of the locations Xni = (Xni , Y ni ), i = 1, . . . ,M of the
particles at that time. The authors of [6] also show numerical results for runs
with varying data and constants; we discuss those refinements in section 6
below.
4 Forward step.
Assume we have a collection of M particles Xn at time tn = nδ whose
empirical density approximates Pn; now we find increments dXn+1 such that
the empirical density of Xn+1 = Xn + dXn+1 approximates Pn+1. Pn+1 is
known implicitly: it is the product of the density that can be deduced from
the SDE and the one that comes from the observations, with the appropriate
normalization. If the increments were known, their probability p (the density
Pn+1 evaluated at the resulting positions Xn+1) would be known, so p is a
function of dXn+1, p = p(dXn+1). For each particle i, we are going to sample
a Gaussian reference density, obtain a sample of probability ρ, then solve (by
iteration) the equation
ρ = p(dXn+1i ) (7)
to obtain dXn+1i .
Define f(x, y) = arctan(y/x) and fn = f(Xn, Y n). We are working on
one particle at a time, so the index i can be temporarily suppressed. Pick
two independent samples ξx, ξy from a N(0, 1) density (the reference density
6
Page 7
hidden
in the present calculation), and set ρ = 12pi exp
(
− ξ2x2 −
ξ2y
2
)
; the variables
ξx, ξy remain unchanged until the end of the iteration. We are looking for
displacements dXn+1, dY n+1, and parameters ax, ay, vx, vy, φ, such that:
2πρ =exp
(
−(dX
n+1 − dXn)2
2σ −
(dY n+1 − dY n)2

−(f
n+1 − bn+1)2
2s
)
exp(φ)
= exp
(
−(dX
n+1 − ax)2
2vx
− (dY
n+1 − ay)2
2vy
)
(8)
The first equality states what we wish to accomplish: find increments dXn+1,
dY n+1, functions respectively of ξx, ξy, whose probability with respect to
Pn+1 is ρ. The factor eφ is needed to normalize this term (φ is called below a
“phase”). The second equality says how the goal is reached: we are looking
for parameters ax, ay, vx, vy, (all functions of Xn) such that the increments
are samples of Gaussian variables with these parameters, with the assumed
probability. One should remember that in our example the mean of dXn+1
is dXn, and similarly for dY n+1. We are not representing Pn+1 as a function
of a single Gaussian- there is a different Gaussian for every value of Xn.
To satisfy the second equality we set up an iteration for vectors dXn+1,j(=
dXj for brevity) that converges to dXn+1. Start with dX0 = 0. We now
explain how to compute dXj+1 given dXj.
Approximate the observation equation (6) by
f(Xj) + fx · (dXj+1 − dXj) + fy · (dY j+1 − dY j) = bn+1 + N(0, s), (9)
where the derivatives fx, fy are, like f , evaluated at Xj = Xn+dXj, i.e., ap-
proximate the observation equation by its Taylor series expansion around the
previous iterate. Define a variable ηj+1 = (fx ·dXj+1+fy ·dY j+1)/

f 2x + f 2y .
The approximate observation equation says that ηj+1 is a N(a1, v1) variable,
with
a1 = −
f − fx · dXj − fy · dY j − bn+1

f 2x + f 2y
,
v1 =
s
f 2x + f 2y
. (10)
7
Page 8
hidden
On the other hand, from the equations of motion one finds that ηj+1 is
N(a2, v2), with a2 = (fx · dXn + fy · dY n)/

f 2x + f 2y and v2 = σ. Hence the
pdf of ηj+1 is, up to normalization factors,
exp
(
−(x− a1)
2
2v1
− (x− a2)
2
2v2
)
= exp
(
−(x− a¯)
2
2v¯
)
exp(−φ),
where v¯ = v1v2v1+v2 , a¯ =
a1v1+a2v2
v1+v2
, φ = (a1−a2)22(v1+v2) = φ
j+1.
We can also define a variable ηj+1+ that is a linear combination of dXj+1,
dY j+1 and is uncorrelated with ηj+1:
ηj+1+ =
−fy · dY j+1 + fx · dXj+1

f 2x + f 2y
.
The observations do not affect ηj+1+ , so its mean and variance are known.
Given the means and variances of ηj+1, ηj+1+ one can easily invert the orthog-
onal matrix that connects them to dXj+1, dY j+1 and find the means and
variances ax, vx of dXj+1 and ay, vy of dY j+1 after their modification by the
observation (the subscripts on a, v are labels, not differentiations). Now one
can produce values for dXj+1, dY j+1:
dXj+1 = ax +
√vxξx, dY j+1 = ay +
√vyξy,
where ξx, ξy are the samples from N(0, 1) chosen at the beginning of the
iteration. This completes the iteration.
This iteration converges to Xn+1 such that f(Xn+1) = bn+1 + N(0, s),
and the phases φj converge to a limit φ = φi, where the particle index i
has been restored. The time interval over which the solution is updated
in each step is short, and we do not expect any problem with convergence,
either here or in the next section, and indeed there is none; in all cases the
iteration converges in a small number of steps. Note that after the iteration
the variables Xn+1i , Y n+1i are no longer independent- the observation creates
a relation between them.
Do this for all the particles. The particles are now samples of Pn+1, but
they have been obtained by sampling different densities (remember that the
parameters in the Gaussians in equation (8) vary). One can get rid of this het-
erogeneity by viewing the factors exp(−φ) as weights and resampling, i.e., for
each of M random numbers θk, k = 1, . . . ,M drawn from the uniform distri-
bution on [0, 1], choose a new Xˆn+1k = Xn+1i such that Z−1
∑i−1
j=1 exp(−φj) <
8
Page 9
hidden
θk ≤ Z−1
∑i
j=1 exp(−φj) (where Z =
∑M
j=1 exp(−φj)), and then suppress
the hat. We have traded the resampling of Bayesian filters for a resampling
based on the normalizing factors of the several Gaussian densities; this is a
worthwhile trade because in a Bayesian filter one gets a set of samples many
of which may have low probability with respect to Pn+1, and here we have a
set of samples each one of which has high probability with respect to a pdf
close to Pn+1.
Note also that the resampling does not have to be done at every step- for
example, one can add up the phases for a given particle and resample only
when the ratio of the largest cumulative weight exp(−
∑φi) to the smallest
such weight exceeds some limit L (the summation is over the weights accrued
to a particular particle i since the last resampling). If one is worried by
too many particles being close to each other (”depletion” in the Bayesian
terminology), one can divide the set of particles into subsets of small size
and resample only inside those subsets, creating a greater diversity. As will
be seen in section 6, none of these strategies will be used here and we will
resample fully at every step.
5 Backward sampling.
The algorithm of the previous section is sufficient to create a filter, but
accuracy may require an additional refinement. Every observation provides
information not only about the future but also about the past- it may, for
example, tag as improbable earlier states that had seemed probable before
the observation was made; one may have to go back and correct the past after
every observation (this backward sampling is often misleadingly motivated
solely by the need to create greater diversity among the particles in a Bayesian
filter). As will be seen below, this backward sampling does not provide a
significant boost to accuracy in the present problem, but it is described here
for the sake a completeness.
Given a set of particles at time (n+ 1)δ, after a forward step and maybe
a subsequent resampling, one can figure out where each particle i was in the
previous two steps, and have a partial history for each particle i: Xn−1i ,Xni ,Xn+1i
(if resamples had occurred, some parts of that history may be shared among
several current particles). Knowing the first and the last member of this
sequence, one can interpolate for the middle term as in section 2, thus pro-
jecting information backward. This requires that one recompute dXn.
9
Page 10
hidden
Let dXtot = dXn+dXn+1; in the present section this quantity is assumed
known and remains fixed. In the azimuth problem discussed here, one has
to deal with the slight complication due to the fact that the mean of each
increment is the value of the previous one, so that two successive increments
are related in a slightly more complicated way than usual. The displacement
dXn is a N(dXn−1, σ) variable, and dXn+1 is a N(dXn, σ) variable, so that
one goes from Xn−1 to Xn+1 by sampling first a (2dXn−1, 4σ) variable that
takes us from Xn−1 to an intermediate point P , with a correction by the
observation half way up this first leg, and then one samples a N(dXtot, σ)
variable to reach Xn+1, and similarly for Y . Let the variable that connects
Xn−1 to P be dXnew, so that what replaces dXn is dXnew/2. Accordingly,
we are looking for a new displacement dXnew = (dXnew, dY new), and for
parameters anewx , anewy , vnewx , vnewy such that
exp
(

ξ2x + ξ2y
2
)
=exp
(
−(dX
new − 2dXn−1)2
8σ −
(dY new − 2dY n−1)2

)
× exp
(
−(f
n − bn)2
2s
)
× exp
(
−(dX
new − dXtot)2
2σ −
(dY new − dXtot)2

)
exp(φ)
= exp
(
−(dX
new − a¯x)2
2vnewx
− (dY
new − a¯y)2
2vnewy
)
,
where fn = f(Xn−1 + dXnew/2, Y n−1 + dY new/2) and ξx, ξy are independent
N(0, 1) Gaussian variables. As in equation (8), the first equality embodies
what we wish to accomplish- find increments, functions of the reference vari-
ables, that sample the new pdf at time nδ defined by the forward motion,
the constraint imposed by the observation, and by knowledge of the position
at time (n + 1)δt. The second equality states that this is done by finding
particle-dependent parameters for a Gaussian density.
We again find these parameters as well as the increments by iteration.
Much of the work is separate for the X and Y components of the equations of
motion, so we write some of the equations for the X component only. Again
set up an iteration for variables dXnew,j = dXj which converge to dXnew.
Start with dX0 = 0. To find dXj+1 given dXj, approximate the observation
10
Page 11
hidden
equation (6), as before, by equation (9); define again variables ηj+1, ηj+1+ , one
in the direction of the approximate constraint and one orthogonal to it; in
the direction of the constraint multiply the pdfs as in the previous section;
construct new means a1x, a1y and new variances v1x, v1y for dX, dY at time n,
taking into account the observation at time n, again as before. This also
produces a phase φ = φ0.
Now take into account that the location of the boat at time n+1 is known;
this creates a new mean a¯x, a new variance v¯x, and a new phase φx, by v¯ =
v1v2
v1+v2
, a¯x = a1v1+a2v2v1+v2 , φx =
(a1−a2)2
v1+v2
, where a1 = 2a1, v1 = 4v1x, a2 = Xtot, v2 =
σ. Finally, find a new interpolated position dXj+1 = anewx /2 +
√vnewx ξx (the
calculation for dY j+1 is similar, with a phase φy), and we are done. The total
phase for in this iteration is φ = φ0 + φx + φy. As the iterates dXj converge
to dXnew, the phases converge to a limit φ = φi. The probability of a particle
arriving at the given position at time (n + 1)δt having been determined in
the forward step, there is no need to resample before comparing samples.
Once one has the values of Xnew, a forward step gives corrected values of
Xn+1; one can use this interpolation process to correct estimates of Xk by
subsequent observations for k = n− 1, k = n− 2, . . . , as many as are useful.
6 Numerical results.
Before presenting examples of numerical results for the azimuth problem, we
discuss the accuracy one can expect. A single set of observations for our
problem relies on 160 samples of a N(0, σ) variable. The maximum likeli-
hood estimate of σ given these samples is a random variable with mean σ
and standard deviation .11σ. We estimate the uncertainty in the position
of the boat by picking a set of observations, then making multiple runs of
the boat where the random components of the motion in the direction of the
constraint are frozen while the ones orthogonal to it are sampled over and
over from the suitable Gaussian density, then computing the distances to the
fixed observations, estimating the standard deviation of these differences,
and accepting the trajectory if the estimated standard deviation is within
one standard deviation of the nominal value of s. This process generates a
family of boat trajectories compatible with the given observations. In Table
I we display the standard deviations of the differences between the resulting
paths and the original path that produced the observations after the num-
ber of steps indicated there (the means of these differences are statistically
11
Page 12
hidden
indistinguishable from zero). This Table provides an estimate of the accu-
racy we can expect. It is fair to assume that these standard deviations are
underestimates of the uncertainty- a variation of a single standard deviation
in s is a strict constraint, and we allowed no variation in σ.
Table I
Intrinsic uncertainty in the azimuth problem
step x component y component
40 .0005 .21
80 .004 .58
120 .010 .88
160 .017 .95
If one wants reliable information about the performance of the filter, it
is not sufficient to run the boat once, record observations, and then use the
filter to reconstruct the boat’s path, because the difference between the true
path and the reconstruction is a random variable which may be accidentally
atypically small or atypically large. We have therefore run a large number
of such reconstructions and computed the means and standard deviations
of the discrepancies between path and reconstruction as a function of the
number of steps and of other parameters. In Tables II and III we display the
means and standard deviations of these discrepancies (not of their mean!)
in the the x and y components of the paths with 2000 runs, at the steps
and numbers of particles indicated, with no backward sampling. (Ref. [6]
used 100 particles). On the average the error is zero, and the error that can
be expected in any one run is of the order of magnitude of the unavoidable
error. The standard deviation of the discrepancy is not significantly smaller
with 2 particles that with 100- the main source of the discrepancy is the
uncertainty in the data. Most of time one single particle (no resampling) is
enough; however, a single particle may temporarily stray into low-probability
areas and creates large arguments and numerical difficulties in the various
functions used in the program. Two particles with resampling keep each
other within bounds, because if one of them strays it gets replaced by a
replica of the other. The various more sophisticated resampling strategies
at the end of section 4 make no discernible difference here, and backward
sampling does not help much either, because they too are unable to remedy
the limitation of the data set.
12
Page 13
hidden
Table IIa
Mean and standard variation of the discrepancy between synthetic data and
their reconstruction, 2000 runs, no back step, 100 particles
n. of steps x component y component
mean s.d. mean s.d.
40 .0004 .04 .0001 .17
80 -.001 .04 -.01 .54
120 -.0008 .07 -.03 1.02
160 -.002 .18 -.05 1.56
Table IIb
Mean and standard variation of the discrepancy between synthetic data and
their reconstruction, 2000 runs, no back step, 2 particles
n. of steps x component y component
mean s.d. mean s.d.
40 .002 .17 -.0004 .20
80 .01 .43 -.0006 .58
120 .01 .57 .009 1.08
160 .006 .54 .01 1.67
In Figure 1 we plot a sample boat path, its reconstruction, and the re-
constructions obtained (i) when the initial data for the reconstruction are
strongly perturbed (here, the initial data for x, y were perturbed initially by,
respectively, .1 and .4), and (ii) when the value of σ assumed in the recon-
struction is random: σ = N(σ0, ǫσ0), where σ0 is the constant value used until
now and ǫ = 0.4 but the calculation is otherwise identical. This produces
variations in σ of the order of 40%; any larger variance in the perturbations
produced negative value of σ. The differences between the reconstructions
and the true path remain within the acceptable range of errors. These graphs
show that the filter has little sensitivity to perturbations (we did not calculate
statistics here because the insensitivity holds for each individual run).
We now estimate the parameter σ from data. The filter needs an esti-
mate of σ to function, call this estimate σassumed. If σassumed 6= σ, the other
assumptions used to produce the data set (e.g. independence of the displace-
ments and of the observations) are also false, and all one has to do is detect
13
Page 14
hidden
0 0.5 1 1.5
10
12
14
16
18
20
22
X
Y


(1) boat path
(2) its reconstruction
(3) its reconstruction (i)
(4) its reconstruction (ii)
(4)
(1)
(3)
(2)
Figure 1: Some boat trajectories (explained in the text)
the fallacy. We do it by picking a trajectory of a particle and computing the
quantity
D = (
∑J
2 (dXj+1 − dXj))2 + (
∑J
2 (dY j+1 − dY j))2
∑J
2 (dXj+1 − dXj)2 +
∑J
2 (dY j+1 − dY j)2
.
If the increments are independent then on the average D = 1; we will try
to find the real σ by finding a value of σassumed for which this happens. We
chose J = 40 (the early part of a trajectory is less noisy than the later parts).
As we already know, a single run cannot provide an accurate estimate of
σ, and accuracy in the reconstruction depends on how many runs are used.
In Table III we display some values of D averaged over 200 and over 5000
runs as a function of the ratio of σassumed to the value of σ used to generate
the data. From the longer computation one can find the correct value of σ
with an error of about 3%, while with 200 runs the uncertainty is about 10%.
Table III
The mean of the discriminant D as a function of σassumed/σ, 30 particles
14
Page 15
hidden
σassumed/σ 5000 runs 200 runs
.5 1.14 ± .01 1.21 ± .08
.6 1.08 ± .01 1.14 ± .07
.7 1.05 ± .01 1.10 ± .07
.8 1.04 ± .01 1.14 ± .07
.9 1.00 ± .01 1.01 ± .07
1.0 1.00 ± .01 .96 ± .07
1.1 .97 ± .01 1.01 ± .07
1.2 .94 ± .01 .99 ± .07
1.3 .93 ± .01 1.02 ± .07
1.4 .90 ± .01 .85 ± .06
1.5 .89 ± .01 .93 ± .07
2.0 .86 ± .01 .78 ± .05
7 Conclusions.
We have exhibited a non-Bayesian filtering method, related to recent work on
chainless sampling, designed to focus particle paths more sharply and thus
require fewer of them, at the cost of an added complexity in the evaluation of
each path. The main features of the algorithm are a representation of a new
pdf by means of a set of functions of Gaussian variables and a resampling
based on normalization factors. The construction was demonstrated on a
standard ill-conditioned test problem. Further applications will be published
elsewhere.
8 Acknowledgments.
We would like to thank Prof. R. Kupferman, Prof. R. Miller, and Dr. J.
Weare for asking searching questions and providing good advice. This work
was supported in part by the Director, Office of Science, Computational
and Technology Research, U.S. Department of Energy under Contract No.
DE-AC02-05CH11231, and by the National Science Foundation under grant
DMS-0705910.
15
Page 16
hidden
References
[1] Doucet, A., de Freitas, N., and Gordon, N. (eds) (2001), Sequential
Monte Carlo Methods in Practice, Springer, New York.
[2] Maceachern, S., Clyde, M., and Liu, J. (1999), Can. J. Stat. 27:251–267.
[3] Liu, J. and Sabati, C. (2000), Biometrika 87:353–369.
[4] Doucet, A., Godsill, S., and Andrieu, C. (2000), Stat. Comp. 10:197–208.
[5] Arulampalam, M., Maskell, S., Gordon, N., and Clapp, T. (2002), IEEE
Trans. Sig. Proc. 50:174–188.
[6] Gilks, W. and Berzuini, C. (2001), J. Roy. Statist. Soc. B 63:127–146.
[7] Chorin, A.J. and Krause, P. (2004), Proc. Nat. Acad. Sci. USA
101:15013–15017.
[8] Dowd, M. (2006), Environmetrics 17:435–455.
[9] Doucet, A. and Johansen, A., Particle Filtering and Smoothing: Fif-
teen years Later, Handbook of Nonlinear Filtering (eds. D. Crisan et B.
Rozovsky), Oxford University Press, to appear.
[10] Snyder, C., Bengtsson, T., Bickel, P, and Anderson, J. (2008), Mon.
Wea. Rev. 136:4629–4640.
[11] Chorin, A.J. (2008), Comm. Appl. Math. Comp. Sc. 3:77–93.
[12] Weare, J. (2007), Proc. Nat. Acad. Sc. USA 104:12657–12662.
[13] Weare, J., Particle filtering with path sampling and an application to a
bimodal ocean current model, in press, J. Comput. Phys.
[14] Gordon, N., Salmond, D., and Smith A. (1993), IEEE Proceedings-F
140: 107–113.
[15] Carpenter, J., Clifford, P., and Fearnhead, P. (1999) IEEE Proceedings-
Radar Sonar and Navigation 146:2–7.
[16] Milstein, G., Platen, E., and Schurz H. (1998), SIAM J. Num. Anal.
35:1010–1019.
16
Page 17
hidden
[17] Kloeden, P. and Platen, E. (1992), Numerical Solution of Stochastic
Differential Equations, Springer, Berlin.
[18] Stuart, A., Voss, J., and Wilberg, P. (2004), Comm. Math. Sc. 4:685–
697.
17

Sign up today - FREE

Mendeley saves you time finding and organizing research. Learn more

  • All your research in one place
  • Add and import papers easily
  • Access it anywhere, anytime

Start using Mendeley in seconds!

Already have an account? Sign in

Readership Statistics

5 Readers on Mendeley
by Discipline
 
 
 
by Academic Status
 
20% Student (Bachelor)
 
20% Student (Master)
 
20% Ph.D. Student
by Country
 
60% United States
 
20% France