An improved saddlepoint approximation.
- PubMed: 17306841
Abstract
Given a set of third- or higher-order moments, not only is the saddlepoint approximation the only realistic 'family-free' technique available for constructing an associated probability distribution, but it is 'optimal' in the sense that it is based on the highly efficient numerical method of steepest descents. However, it suffers from the problem of not always yielding full support, and whilst S. Wang, General saddlepoint approximations in the bootstrap, Prob. Stat. Lett. 27 (1992) 61. neat scaling approach provides a solution to this hurdle, it leads to potentially inaccurate and aberrant results. We therefore propose several new ways of surmounting such difficulties, including: extending the inversion of the cumulant generating function to second-order; selecting an appropriate probability structure for higher-order cumulants (the standard moment closure procedure takes them to be zero); and, making subtle changes to the target cumulants and then optimising via the simplex algorithm.
Author-supplied keywords
An improved saddlepoint approximation.
of surmounting such difficulties, including: extending the inversion of the cumulant generating function to
Although stochastic population models [19] have proved to be a powerful tool in the study in
process generating mechanisms across a wide range of disciplines, all too often the associated
* Corresponding author. Fax: +141 552 2079.
E-mail address: eric@stams.strath.ac.uk (E. Renshaw).
www.elsevier.com/locate/mbs
Mathematical Biosciences 208 (2007) 359–3740025-5564/$ - see front matter 2006 Elsevier Inc. All rights reserved.second-order; selecting an appropriate probability structure for higher-order cumulants (the standard
moment closure procedure takes them to be zero); and, making subtle changes to the target cumulants
and then optimising via the simplex algorithm.
2006 Elsevier Inc. All rights reserved.
Keywords: Cumulants; Generating functions; Moment closure; Population dynamics; Simplex; Truncation
1. IntroductionColin S. Gillespie a, Eric Renshaw b,*
a School of Mathematics and Statistics, University of Newcastle, Newcastle upon Tyne, NE1 7RU, UK
b Department of Statistics and Modelling Science, Livingstone Tower, University of Strathclyde,
26 Richmond Street, Glasgow G1 1XH, UK
Received 19 May 2005; received in revised form 15 February 2006; accepted 29 August 2006
Available online 9 September 2006
Abstract
Given a set of third- or higher-order moments, not only is the saddlepoint approximation the only real-
istic ‘family-free’ technique available for constructing an associated probability distribution, but it is ‘op-
timal’ in the sense that it is based on the highly efficient numerical method of steepest descents. However, it
suffers from the problem of not always yielding full support, and whilst [S. Wang, General saddlepoint
approximations in the bootstrap, Prob. Stat. Lett. 27 (1992) 61.] neat scaling approach provides a solutionAn improved saddlepoint approximationdoi:10.1016/j.mbs.2006.08.026
will
highl
techniques is certainly a possibility, as illustrated by Marion et al. [13] implementation of the Tan
Th
mogo
are polynomial in form, rather than being, for example, fractional, then limited progress in the
deve
(i > 0
1 i
(Kill
the p
proc
360 C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374kN ¼ a1N b1Nsþ1; lN ¼ a2N þ b2Nsþ1 ð1:3Þ
for ai, biP 0, integer sP 1 and population size N = 0,1,2, . . . , int{(a1/b1)1/s}. For the basic logis-
tic (i.e., s = 1) case the forward differential equation for the c.g.f. is given by
oK
ot
¼ ½a1ðeh 1Þ þ a2ðeh 1Þ
oK
oh
þ ½b1ð1 ehÞ þ b2ðeh 1Þ
o2K
oh2
þ oK
oh
2" #
: ð1:4Þ
Whence on successively differentiating (1.4) with respect to h, and then placing h = 0, we obtain
the ser) bees of North and South America [16], Matis et al. [15] provide an excellent example of
ractical implementation of this procedure. They base their analysis on the power-law logistic
ess with population birth and death ratesKðh; tÞ ln½Mðh; tÞ
X
i¼1
jiðtÞh
i!
: ð1:2Þ
This representation carries a considerable advantage over the alternative probability, moment and
factorial moment generating functions, since for small i > 3 the ji(t) require little algebraic manip-
ulation in order to turn them into central moments, whilst j1(t), j2(t) and j3(t) denote the mean,
variance and skewness, directly.
Motivated by a desire to model both the annual catch of an invasion of muskrats in eleven
Dutch provinces between 1968 and 1991 [10], and the rapid colonisation of the Africanised honeylopment of moment expressions can be made by working in terms of the cumulants ji(t)
) of the process through the cumulant generating function (c.g.f.)Mðh; tÞ
X1
n¼0
pnðtÞenh ð1:1Þ
for the population size probabilities pn(t) at time t. In some situations, such as queueing [1] and
birth–death [22] type applications, or multiple immigration-death paradigms in quantum optics
[6,14,11] the underlying process is sufficiently tractable to enable the construction of exact expres-
sions for the pn(t). However, this is the exception rather than the norm, and in many biological
and ecological situations is rarely possible. However, provided the probability transition ratese essence of the problem lies in the general intractability of the forward (and backward) Kol-
rov partial differential equations for the moment generating function (m.g.f.)and Hsu [24] procedure, but as this may in itself inject spurious behaviour into the system such
approaches must usually be viewed with caution.hallenging analytic problems if useful progress is to be made. Whilst linearising such systems
render them tractable to direct solution, ignoring non-linear characteristics of the model is
y likely to have a serious impact on long term behaviour. Invoking nonlinear approximationmathematical development involves non-linear mathematics which immediately raises difficultystem of cumulant equations
truncated c.g.f.
jih
0
we have the approximation
Xn2 j hi Xn j hi
wher
C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374 361fnðxÞ ¼ 2p
i¼0
iþ2 0
i!
exp
i¼1
i 0
i!
h0x ; ð1:9Þ
e
x ¼
Xn1 jiþ1hi0 : ð1:10ÞhðxÞ ’ f ðxÞ ¼ ½2pK 00ðh0Þ1=2 expfKðh0Þ h0xg: ð1:8Þ
The power of this approach can be seen immediately on noting that it not only reproduces the
Normal p.d.f. exactly, but that the saddlepoint approximation for the gamma p.d.f. differs only
from the exact result in that C(a) is replaced by Stirling’s approximation in the normalising factor.
Furthermore, the relative error is of order O(n1). This error cannot be obtained by using tech-
niques such as Edgeworth expansions.
Easton and Ronchetti [3] propose a corresponding truncated approximation fn(x). This approx-
imation is obtained by replacing K(h) by Kn(h) in (1.7) and (1.8), thereby yielding
!1=2 ( )x ¼ K 0ðh0Þ ð1:7Þin tandem with the associated saddlepoint approximation [17]. Daniels [2] provides a superb ac-
count of its derivation; the approach essentially involves taking the dominant term in the contour-
integration formula for the inversion of the c.g.f. K(h) corresponding to the p.d.f. h(x). The key
result is that for h an appropriate root of the saddlepoint equationKnðhÞ
i¼1 i!
ð1:6Þ
Xn iture by taking all higher-order cumulants to be zero? This is addressed by considering thefor a = a1 a2, b = b1 + b2, c = a1 + a2 and d = b1 b2. Exact solutions of these equations are
not available, as the equation for any specific jth order cumulant function involves terms up to
the (j + 1)th order. For general s, terms up to the (j + s)th order are involved. Matis and Kiffe
[16] therefore adopt a moment closure approach by solving the system for the first j cumulant
equations with ji(t) 0 for all i > j + s.
Truncating the equations in this manner raises two non-trivial questions. The first [15] relates to
the accuracy of the resulting cumulant estimates. The second [20,21] is that assuming that only the
first (j +s) cumulants are known accurately, what error is induced in the underlying probabilitydj3ðtÞ=dt ¼ ½a bj1ðtÞj1ðtÞ þ ½3c b 6dj1ðtÞ 6bj2ðtÞj2ðtÞ
þ ½3a 3d 6bj1ðtÞj3ðtÞ 3bj4ðtÞ; etc:;dj2ðtÞ=dt ¼ ½c dj1ðtÞj1ðtÞ þ ½2a d 4bj1ðtÞj2ðtÞ 2bj3ðtÞ ð1:5Þdj1ðtÞ=dt ¼ ½a bj1ðtÞj1ðtÞ bj2ðtÞi¼0 i!
2. Se
On
we se
Thes
that,
x !
A similar approach can be applied to gP(x). For example, when a = 1 the values fP(0) = 0.32695,
and especially gP(0) = 0.37032 (i.e., S < 1 in both cases), compare well with the exact probability
362 C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–3740.36788. Whilst when a = 10, fP(0) = 0 (S > 1) and gP(0) = 0.0001069 (S < 1) lie equidistant from
the true probability 0.000045.
In the case of the negative binomial example, we note that the c.g.f.
KðhÞ ¼ r ln pe
h
h
; ð2:6ÞFor since S will, in general, not equal one, we may simply take:
f Pð0Þ ¼1 S for S 6 1;
f Pð0Þ ¼0; and replace f PðxÞ by f PðxÞ=S otherwise: ð2:5ÞS
X1
x¼1
f PðxÞ: ð2:4Þ(1.8) that f (x) ! 1.
A way of overcoming this problem is to definee approximations clearly fail at x = 0, though this is not surprising. For Daniels [2] proves
for a distribution with support on (a,b), where a < b and a, b or both may be ±1, when
a,b then K 0(h0) ! a,b and so K00(h0) ! 0. Thus as K(h0) h0x is bounded, it follows from2pxxx 12xe that all the cumulants jia, and so f(x) and g(x) take the form
f PðxÞ ¼ a
xexaffiffiffiffiffiffiffip and gPðxÞ ¼ f PðxÞ 1 1
for x ¼ 1; 2; . . . : ð2:3ÞTo investigate the merits of this representation, let us consider the Poisson (a) and negative bino-
mial (r,p) distributions. First, as the Poisson c.g.f.
KPðhÞ ¼ aðeh 1Þ ¼ a
X1
i¼1
hi
i!
ð2:2ÞgðxÞ ¼ f ðxÞ 1þ 1
8
Kivðh0Þ
K 00ðh0Þ2
5
24
K 000ðh0Þ2
K 00ðh0Þ3
: ð2:1Þsion of K(h). This yields [2] the second-order approximation
!cond-order improvements
e obvious improvement is to include the second term in the contour integration for the inver-structure of a p.d.f. corresponding to a given finite set of cumulants. However, it does suffer from
a possible lack of support in the tail regions [20]. This paper therefore explores possible methods
for overcoming this problem, and also investigates whether the basic saddlepoint approximation
can be improved still further.is is a completely general representation and is extremely useful when we wish to examine the1þ ðp 1Þe
fNBðxÞ ¼ p
r
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
2pðx rÞ
p x
r
r=2 x r
xð1 pÞ
rx
ð2:7Þ
and the second-order approximation
gNBðxÞ ¼ fNBðxÞ 1þ ðr xÞ
2 þ rx
12rxðr xÞ
!
: ð2:8Þ
These approximations again fail at the ends of the support region, since f NB(x) and gNB(x) !1
as x#r. Though this problem is easily avoided if we implement the same procedure (2.5) as used for
the Poisson example.
Although the Poisson probabilities can be accurately represented by the saddlepoint approxima-
tion [20,8], an equally relevant question concerns the accuracy to which the cumulants of the sad-
dlepoint approximation reflect those of the original distribution? To answer this, we begin by
defining the relative difference
sadd
Note that both f (x) and g (x) ‘explode’ once we reach the eighth cumulant, which highlights
fj1
fj3
fj4
fj5
fj
C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374 3636
fj7 2.32 4.01 79.64 126.39 0.137 0.0146 1.35 0.206
fj88.06 0.518 0.108 0.42 0.359 0.0649 3.89 0.268
2.41 0.006 1.960 2.82 0.519 0.0691 0.283 0.098
25.50 2.28 12.09 14.41 0.322 0.0409 1.45 0.263
63.50 5.04 38.61 39.45 0.201 0.0233 1.69 0.266fj2 2.52 0.204 0.103 0.049 2.24 0.152 8.21 0.712a = 1 a = 10 r = 1 and p = 0.4 r = 1 and p = 0.8
fP(x) gP(x) fP(x) gP(x) fNB(x) gNB(x) fNB(x) gNB(x)
5.33 0.274 0.104 0.004 5.81 0.453 2.34 0.214Table 1
Relative differences (2.9) between the cumulants of the Poisson and Negative Binomial distributions and their first-
order, f(x), and second-order, g(x), saddlepoint approximations
Poisson Negative Binomial (geometric)lepoint procedure does not necessary improve the accuracy of the estimated distribution.
P Peji ¼ 100ðj^i jiÞ=ji; ð2:9Þ
where ji denotes the ith cumulant of the target distribution and j^i the ith cumulant of the approx-
imation. Now we see from the Poisson entries in Table 1 that for a = 1, and hence ji 1, the j^i do
not match ji particularly well, with large errors of 25%, 63% and 502% for the fifth, sixth and
eighth cumulants, respectively. In contrast, when a = 10, as the Poisson distribution can now
be approximated quite well by the Normal distribution (i.e., the saddlepoint approximation based
on j1 and j2), the estimated cumulants j^1; . . . ; j^6 are more accurate, though the same cannot be
said for higher-order cumulants. Moreover, including the extra term, gP(x), yields an improve-
ment in the accuracy of only the first two cumulants. This demonstrates that using a higher-order502.10 62.04 1573 1888 0.099 0.010 0.99 0.14
we h
notic
0
impr
xP j1
j22
2j3
; ð3:4Þ
C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374 365and so for large |j3| the approximation will fail for x < j1.
To obtain full support when only three cumulants are known, Wang [25] proposes multiplying
j3 by
hðhÞ ¼ expðj2b2h2=2Þ ð3:5Þ
thereby yielding
K3WðhÞ ¼ j1hþ j2
h2
2
þ j3
h3
6
hðhÞ; ð3:6Þ
where b > 0 is an ‘appropriately chosen’ constant and the subscript 3W denotes third-order trun-
cation under Wang’s improvement. The idea is to retain full support by controlling the effect of j3Easton and Ronchetti [3] and Renshaw [20] note that this approximation can collapse in the lower
tails of a distribution, due to the necessity of having w(x) > 0 in (3.2). This condition is equivalent
toas h iovement over the standard Normal approximation if we know the skewness. However, both3 2
[20]. This is a completely general and algebraically tractable result which provides a significantf3ðxÞ ¼ ð4p2wðxÞÞ1=4 expfð1=6j2Þ½j3 3j2wðxÞ þ 2wðxÞ3=2g ð3:3Þ2
whence the third-order approximation (1.9) takes the simple form3
h0 ¼ ðj2 þ
ffiffiffi
w
p
Þ=j3 where wðxÞ ¼ j2 þ 2j3ðx j1Þ; ð3:2Þ3. Approximating a distribution using just two or three cumulants
Given the first three cumulants, the truncated c.g.f. (1.6) becomes
K3ðhÞ ¼ j1hþ j2h2=2þ j3h3=6: ð3:1Þ
Thus on solving (1.10), namely K0 (h ) = x, we obtainrecentred at the point of interest. This contrasts with other types of approximations such as Edge-
worth or Taylor expansions which are high-order approximations based around a fixed point for
all values of the distribution’s support. It is with these features in mind that we now proceed to
examine the situation where we possess only limited knowledge of the cumulant structure.eable effect. Finally, we note that saddlepoint approximations are low-order approximationsonly the first few cumulants of a distribution, it is still able to provide an accurate approximation
to the probability structure. Even the exceedingly large errors of j8 do not appear to have anyonly 12-digit accuracy produces widely different values for higher-order cumulants. Third,
ave illustrated that even though the saddlepoint approximation might accurately reproducethen the cumulant errors rise dramatically to fj5 ¼ 3100, fj6 ¼ 18007, fj7 ¼ 150; 000 and
fj8 ¼ 1; 900; 000. Second, the calculations were performed using MAPLE with 50-digit accuracy;ncreases. Here h(h) serves this purpose well: for h(0) = 1, and so the j3-term remains virtually
reverts to the Normal distribution, for which full support is guaranteed. It is therefore always
possible to find a b large enough to allow Wang’s approximation to exist. Wang suggests taking
bP1/2, stating that if b < 1/2 then the modification has little effect. However, this approach is not
easy to defend, since it can be argued that by controlling j3 we are simply reducing the skewness
to a smaller level. So our saddlepoint approximation will capture this altered j3 rather than the
target value. Also, as b !1 the Normal approximation will be obtained, so we are simply ignor-
ing the j3 value altogether!
The approximation (3.6) has been implemented on a number of occasions, for example by
Let f
then
366 C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374ing Table 2 we see that this results in a substantial loss of accuracy in comparison with Example
A. Moreover, incorporating the extra saddlepoint term, g3W(x), results in a further decrease of
accuracy. Fig. 2 exposes yet more difficulties; not only does f3W(x) exhibit unwelcome ‘cusps’
at x = 10 and x = 25, but these cusps become even more prominent when g3W(x) is considered.
Although we took b = 0.5 in the birth–death example A, to achieve full support we only require
b ’ 0.2. However, whilst this lower value results in improved cumulant estimates, the resulting
distribution exhibits strange characteristics, similar to those shown in Fig. 2. Likewise, in Example
Table 2
Relative differences (2.9) between the target cumulants of Examples A and B, and the cumulants obtained using Wang’s
approach, and the saddlepoint approximations based on a Poisson (with b = 0.5) and Negative binomial (with b = 0.6)
moment structure
Example A Example B
f 3W(x) g3W(x) f 3P(x) g3P(x) f 3W(x) g3W(x) f 3NB(x) g3NB(x)
fj1 3.45 1.42 1.70 0.43 7.60 0.106 3.35 1.17
fj2 2.42 3.63 1.77 1.7 1.62 21.08 8.34 2.66
fj3to achieve full support we have to increase b to 0.6 because j3 is relatively larger. On inspect-rection term, and g3W(x) the corresponding approximation based on the second-order expression
(2.1), with b = 0.5. Then we see from Table 2 that f3W(x) estimates the target mean and variance to
within 3.5%, but has an error rate of over 32% for j3. Whilst although g
3W(x) involves a small loss
of accuracy in the variance, this is more than compensated for by large improvements in j1 and
j3. If we consider a more extreme situation, with target cumulants
Example B : j1 ¼ 12; j2 ¼ 18 and j3 ¼ 100;3W(x) denote the saddlepoint approximation using the first three cumulants and Wang’s cor-Harvill and Newton [9], Gatto [5], Good [7] and Lieberman [12]. However, the values of j3 used
have generally been small in comparison with the mean and variance. Wang [25], for example,
uses j1 = 2/7, j2 = 0.005102 and j3 = 9.718 · 105. We will now demonstrate that when j3 is
large in comparison to the mean and variance, Wang’s approximation presents problems.
As an example of a distribution which has a large third cumulant, suppose we consider the dis-
tribution of the simple birth–death process with parameters k = 1.5 and l = 1.0 and initial pop-
ulation size n0 = 10, at time t = 0.65. This produces target cumulants of
Example A : j1 ¼ 13:84; j2 ¼ 26:57 and j3 ¼ 82:87:32.34 13.28 9.90 2.38 53.54 34.22 14.19 9.05
0.
00
x
0 10 20 30 40
0.
0
x
Fig. 2. (a) Distributions obtained using cumulants j1 = 12, j2 = 18 and j3 = 100 for Wang’s approximation f
3W(x),
with b = 0.6, together with the saddlepoint approximation under a negative binomial structure with p = 0.4 and r = 10;
(b) as (a), but including the second term of the saddlepoint approximation.B we
incre
the e
show
To
name
This
If we
on ta
So l
sitive
ment
using
‘sens0.
10.
05
Pr
ob
a
0.
2
Pr
ob
a0.
1
bi
lity 0
.3
bi
lity0
00.
15
f3W(x)
f3NB(x)
4
0.
5
g3W(x)
g3NB(x)
a bsee that b can be reduced, but this also exacerbates the situation. Conversely, although
asing b leads to a cusp-free distribution, it does so at the expense of substantially increasing
rror surrounding the third cumulant. This raises the question of whether the distributions
n in Fig. 2 are aberrant results, or whether they form part of a larger problem?
answer this, consider the corresponding moment generating equation of the c.g.f. (3.6),
ly
M3WðhÞ ¼ expfK3WðhÞg ¼ exp j1hþ j2
h2
2
þ j3
h3
3!
ej2b
2h2=2
: ð3:7Þ
easily yields the raw moments via
l0k ¼
Z 1
1
xkpðxÞdx ¼ o
kMðhÞ
ohk
h¼0
: ð3:8Þ
calculate the eighth raw moment for Expression (3.7), since this is where Table 1 ‘explodes’,
king j1 = 0 for simplicity we obtain
l08 ¼ 35j2½3j32 þ 8j23ð1 2b2Þ: ð3:9Þ
0
8 may become negative when b > 1. However, since even raw moments are by definition po-
over xP 0, it follows that for j1 = 0 and b >
pð1=2þ 3j32=16j23Þ we have generated mo-
s which cannot correspond to any proper distribution. So it is by no means clear that
Wang’s saddlepoint approximation based on only the first three cumulants will lead to a
ible’ distribution.
are n
shows that not only does f3P(x) produce significantly better cumulant approximations than
Wang’s approach, but that these estimates may be refined still further through g3P(x).
368 C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374It is satisfying to observe that by invoking a specific cumulant structure we not only obtain bet-
ter cumulant estimates, but that these also correspond to an improved approximation for the tar-
get distribution. Fig. 3 shows the absolute error for five different approximation schemes used to
estimate the birth–death process (Example A), where the absolute error
err ¼ 100 pðxÞ f ðxÞ
pðxÞ
;Note that target cumulants ji are unaffected by the choice of a. Moreover, on noting that the sad-
dlepoint approximation only really captures the first few moments (see Table 1), our choice of a
should not exert an overt influence on the approximation.
Denote f3P(x) and g3P(x) to be the first- and second-order saddlepoint approximation of the
c.g.f. (4.1). Then as the saddlepoint approximation only captures the first few cumulants success-
fully (see Table 1), to ensure that the choice of a does not exert an undue influence on the approx-
imation, we base it on three criteria. First, it must lend full support over the required range.
Second, it should match our target cumulants ji to a ‘reasonable’ degree of accuracy; and third,
the resulting approximation should not have ‘unnatural’ characteristics as demonstrated, for
example, by the cusps in Fig. 2. In practice, setting a = cj3, where c takes the trial values
c = 1,2, . . . , 10, produces excellent results. For example, in the birth–death scenario of Example
A, we found that taking a = 4 · j3 = 331.48 produced a good approximation. Indeed, Table 2K3PðhÞ ¼ ðj1 aÞ
h
1!
þ ðj2 aÞ
h2
2!
þ ðj3 aÞ
h3
3!
þ aðeh 1Þ: ð4:1Þdeed, when we include the second-order saddlepoint approximation, g(x), we are estimating higher-
order moments of an ‘improper’ distribution with a greater degree of accuracy, which most likely
explains the strange shapes exhibited in Fig. 2. Such difficulties, of course, arise because the approx-
imations take full support over (1,1), rather than operating over the restricted range [0,1).
4. An improved method
Given that we have shown that Wang’s correction term does not provide good estimates of the
target cumulants as j3 increases, it is clearly necessary to find a alternative approach for obtaining
full support when using only three cumulants. We begin by acknowledging that when we just use
the first three cumulants, it is not that the remainder are zero but that we take them to be zero
simply because we do not possess any information on them. Thus a possible solution to this prob-
lem is to assume some form of non-zero structure for the unknown cumulants. This is particularly
apposite, since the presence of skewness automatically implies a non-Normal distribution, yet
having third- and higher-order cumulants equal to zero characterises the Normal distribution!
For example, suppose we take advantage of the simple structure of the Poisson process, for which
K(h) = a(eh 1) since all cumulants equal the Poisson mean a, by consideringen for the mild case of Example A, we discover that some even rawmoments of order above 52
egative. Whilst for the more extreme Example B, complications arise for orders above 32. In-
perfo
having
C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374 369notably in the centre of the distribution. Over 8 < x < 32 the error estimates for f3P(x) are about
10%, compared to a maximum of more than 40% for f3W(x). As to whether f3P(x) is superior to
g3P(x), we could argue either way; for although g3P possesses a lower error rate for much of the
distribution, it has a significantly poorer error rate near x = 10. Whilst no approximation per-
forms well in the tails of the distribution, it should be recalled that we are only using three cumu-
lants for a particularly long-tailed distribution.
When we examine Example B, where the target cumulants are j1 = 12, j2 = 18 and j3 = 100,
we discover that an assumed Poisson structure for the remaining cumulants does not produce sat-
isfactory results. Instead, using a negative binomial cumulant structure provides a much better
approximation. Denote f3NB(x) and g3NB(x) to be the first- and second-order saddlepoint approx-
imations using the associated c.g.f.
KNBðhÞ ¼ j1
r
p
h
1!
þ j2
rð1 pÞ
p2
h2
2!
þ j3
rð1 pÞð2 pÞ
p3
h3
3!
þ rThen
param
sadd
‘p.d.f
corre
simil
W
Wan
f3NB(
highl
mialp(x) denotes the exact distribution and f(x) its approximation. Overall, f3P(x) and g3P(x) out-
rm both Wang’s f3W(x) and g3W(x) approximations, and the Normal approximation, most
parameters k = 1.5, l = 1.0 and n0 = 10.
Fig. 3. A plot of the absolute error values for five different approximations of a birth–death process at time t = 0.65, ln pe
h
1þ ehðp 1Þ
: ð4:2Þ
on parallelling the approach used when choosing a in the Poisson c.g.f. (4.1), we select the
eters p and r in (4.2) to ensure that the associated cumulants correspond closely to the target
lepoint cumulants, whilst at the same time allowing the approximation to generate a realistic
. shape’. Here, values of p = 0.4 and r = 10 achieve this goal; note that these parameters
spond to a negative binomial distribution with j1 = 25, j2 = 37.5 and j3 = 150, which are
ar to our target values.
e observe from Table 2 that f3NB(x) produces better estimates for our target cumulants than
g’s expansion, with g3NB(x) providing yet further improvement. Moreover, Fig. 2 shows that
x) and g3NB(x) do not exhibit the cusps produced via Wang’s approximation. Now it seems
y likely that yet further improvement canbe gained by extending thePoisson andnegative bino-
expressions (4.1) and (4.2) into a more general framework. For consider the representation
have knowledge of the first three cumulants, as a can always be chosen to ensure that the target
these
a clo
K(h0
woul
strain
A
woul
13.84
370 C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374, and then proceed to calculate overall relative error, namely
S ¼ a j1 j^1
þ b
j2 j^2
þ c
j3 j^3
ð5:1Þ1 k
resentation (4.6), but additionally to choose our initial cumulants more carefully. For example, we
observe from Table 2 that in the birth–death case A we have a relative error of 1.70% for f3P(x)
when estimating j1. So suppose we input a slightly lower target value of j1, say 13.5 instead offurther, albeit intuitive, variation for achieving an ‘optimal’ saddlepoint approximation
d be not only to alter say a in the Poisson-based c.g.f. (4.1), or jG; . . . ;jG in the general rep-some appropriately small A to avoid the cusp-like behaviour exhibited in Fig. 2.
5. A further refinementd entail the development of least squares, minimum entropy, etc., criteria subject to the con-
ts that f(x) is non-negative, has full support over the range of interest, and |f 0 0(x)| < A forvia KGen(h) will not be explicitly expressible purely in terms of x, but only in terms of h0. This,
however, is a small price to pay for acquiring a substantial improvement in the quality of the
approximation.
Further research clearly needs to be undertaken to develop a general methodology for con-
structing ‘optimal’ k-order c.g.f.’s
KGenðhÞ ¼
Xk
i¼1
ðji jGi Þ
hi
i!
þ KGðhÞ; ð4:6Þ
where jGi is the ith cumulant of the c.g.f. K
G(h). This, however, is a difficult problem, since itand further candidate c.g.f.’s would be highly desirable. Note that we can only write down
sed expression for the saddlepoint approximation, such as (1.8), if the saddlepoint equation
) = x is a polynomial of degree less than five. So in essence, this means that solutions obtainedcumulants are well-matched, and that the associated approximating p.d.f. (1.8), namely
f ðxÞ ’ f2p½KGenðh0Þ00g1=2 expfKGenðh0Þ h0xg ð4:5Þ
has full support and exhibits smooth behaviour. Even so, much more detailed examination of1! 2! 3!
where KG(h) is the c.g.f. of any distribution whose first three cumulants are jG1 , j
G
2 and j
G
3 . For
example, the uniform distribution p(x) = 1/(n + 1) for x = 0,1, . . . ,n takes the relatively simple
c.g.f.
KGðhÞ ¼ ln e
ðnþ1Þh 1
ðeh 1Þðnþ 1Þ
ð4:4Þ
which is highly amenable to manipulation. Study of the negative binomial, Poisson and uniformKGenðhÞ ¼ ðj1 jG1 Þ
h þ ðj2 jG2 Þ
h2 þ ðj3 jG3 Þ
h3 þ KGðhÞ; ð4:3Þj1 j2 j3
the fu
dime
ners of a tetrahedron. In the simplex approach, originally developed by Spendley et al. [23] and
C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374 371then extended by Nelder and Mead [18], the function is first minimised at the corners of a regular
simplex (which have sides of equal length), and then a new simplex is formed by deleting the cor-
ner at which the function value is largest. A replacement vertex is then formed by reflecting the
rejected vertex through the centre of all of the remaining vertices, except for the rejected one.
We then proceed by reducing the size of the simplex (known as contraction). Nelder and Mead
extended the model to allow expansion of the simplex as well as contraction, in order to reduce
the chance of the procedure becoming trapped in a local minimum. In general, the following pro-
cedure works well.
Simplex Algorithm:
1. set the initial saddlepoint cumulants ji to the target values ji and choose an appropriate val-
ue for a;
2. evaluate the saddlepoint probabilities over the required region of support X, and re-scale to
ensure that
P
x2Xf
3PðxÞ ¼ 1;
3. evaluate the cumulants, j^i, corresponding to the saddlepoint approximation f
3P(x);
4. evaluate S, i.e., Expression (5.1), and use the simplex NAG routine E04CCF to construct j1,
j2;j3 and a;
5. print j1, j2; j3 and a;
6. if S reaches say 106, then stop; else return to 2.
Whilst in general the target cumulants appear to be good starting values, it is recommended
that different initial values of a be used and the results compared [21]. Furthermore, experimen-
tation with the coefficients a, b and c in S is helpful in finding an optimal solution (especially if the
simplex routine takes a long time to converge). In practise we used the values a = 3/2, b = 1 and
c = 1/2. This means that more emphasis is placed on lower-order cumulants. Note that S may not
always be the most efficient function to implement, especially for small target cumulants. An alter-
nate measure could, for example, involve the sum-of-squares
~S ¼
X3
i¼1
ðji j^iÞ2; ð5:2Þ
which is less susceptible to small ji’s; whilst other possibilities include minimising v
2 or entropy.
In Example A, minimising ~S (i.e., Expression (5.2) where j^i are the cumulants from f
3P(x)), in-
volves five-dimensional space, since we are changing the parameters j1;j2;j3 and a. Now imple-
menting the above algorithm yields the minimising values j1 ¼ 13:603;j2 ¼ 27:378; j3 ¼ 90:557
3~SPand anction S. We define a simplex to be a set of n + 1 points in n-dimensional space. So in two-
nsions the corners of a triangle form a simplex, while in three dimensions it would be the cor-for positive constants a, b and c, and cumulants j^i corresponding to the chosen saddlepoint
approximation. Then if S is reduced we accept the proposed value, otherwise we do not. As (here)
three cumulants are involved, the full approach is to alter j1, j2, j3 and a, to j1, j2, j3 and a, in
order that we achieve the smallest possible value of S.= 324.987. Whence on denoting f ðxÞ to be the saddlepoint approximation based on these
compared with the other approximations in Table 2. Furthermore, we observe from Fig. 4 that
372 C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374this procedure also yields an improved approximation of the target distribution. For x > 12 we
have a lower error rate than f3P(x) and, in general, a slightly lower rate than g3P(x). Additionally,
there is a reduced error rate in the right-hand tail.
However, extending this simplex approach to the far more extreme Example B does not give
rise to satisfactory results. For although we do obtain better cumulants in relation to other target
values, the resulting shape of the distribution is not realistic since the spike shown in Fig. 2 be-
comes even more pronounced. However, it should be stressed that when j3 is reduced to approx-
imately three times j2, this cusp no longer features and the method once again works extremely
well.
Finally, we note that when extending the simplex algorithm to allow even more cumulants to be
changed, i.e., fourth- and higher-order cumulants, caution needs to be exercised. For if more than
six parameters are used, then the simplex routine is more likely to find a local minimum than a
global one.
6. Summary and conclusionsvalues, we see that the three relative error rates between the cumulants of f 3~SPðxÞ and the target
Fig. 4. A plot contrasting the relative error values (5.1) of three approximations using the cumulants in Example A.The general intractability of nonlinear Kolmogorov equations poses a major problem in the
analysis of applied stochastic processes. For although many standard ‘statistical’ procedures em-
ploy the assumption of an underlying Normal distribution, with the Physics paradigm being the
adoption of ‘mean-field’ theory, it is becoming increasingly acknowledged that Gaussian processes
have little relevance in many real-life situations. Whilst in general, the standard saddlepoint
approximation arguably provides the ‘optimal’ approximating distribution, since it is based on
the method of fastest descents, the fact that it can easily give rise to limited regions of support
is clearly a major problem. Moreover, including the second term in the underlying contour inte-
gration (see Expression (2.1)) does not lead to a uniformly better improvement. Consideration of
the Poisson and negative binomial distributions highlights the accuracy to which both probabil-
ities and cumulants can be reproduced under both first- and second-order saddlepoint approxima-
tions. The key point is that at x = j1 the basic saddlepoint approximation can never be improved
provides yet further improvement. So not only do these new techniques for producing the ‘best’
C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374 373Acknowledgement
This work was supported by an EPSRC Research Studentship.
References
[1] A. Chen, E. Renshaw, The M/M/1 queue with mass exodus and mass arrivals when empty, J. Appl. Prob. 34 (1997)
192.
[2] H.E. Daniels, Saddlepoint approximations in statistics, Ann. Math. Stat. 25 (1954) 631.
[3] G.S. Easton, E. Ronchetti, General saddlepoint approximations with applications to L statistics, J. Am. Stat.
Assoc. 81 (1986) 420.
[4] C.A. Field, E. Ronchetti, Small Sample Asymptotics. Lecture Notes and Monograph Series, 13, Institute of
Mathematical Statistics, Hayward, 1990, p. 4.
[5] R. Gatto, Saddlepoint approximations of marginal densities and confidence intervals in the logistic regression
measurement error model, Biometrics 52 (1996) 1096.
[6] C.S. Gillespie, E. Renshaw, The evolution of a batch-immigration death process subject to counts, Proc. R. Soc.
Lond. A 461 (2005) 1563.
[7] I.J. Good, The multivariate saddlepoint method and chi-squared for the multinomial, Ann. Math. Stat. 32 (1961)
535.
[8] C. Goutis, G. Casella, Explaining the saddlepoint approximation, Am. Stat. 53 (1999) 216.
[9] J.L. Harvill, H.J. Newton, Saddlepoint approximations for the difference of order statistics, Biometrika 82 (1995)
226.cumulant structure in a given situation prove to be extremely useful and beneficial, but the exten-
sion of these ideas into the construction of a general kth-order methodology could well prove to
be a highly profitable field of future research development.on that given by the Normal approximation, which suggest that inclusion of higher-order terms in
the inversion of K(h) should result in an improved approximation since more information is
utilised.
If we are given just the first three cumulants, then although the resulting saddlepoint approx-
imation f3(x) has the closed form (3.3), it may not take full support. Wang’s scaling of j3 does
reclaim full support, but the price paid involves both reduced accuracy and the chance of produc-
ing aberrant (e.g. cusped) distributions. Development of an improved method which yields good
accuracy, full support and also ‘sensible’ approximating p.d.f.s therefore requires an alternative
approach. To construct this we dispense with the ‘usual’ moment closure technique based on
replacing all higher-order cumulants ji for i > k by zero, and adopt, instead, a specific non-zero
structure. Here we consider the Poisson and negative binomial cases, and through them show that
this procedure yields substantially better approximations than those delivered through Wang’s ap-
proach. Progress can be advanced still further by choosing our ‘start’ cumulants more carefully.
For example, we show that making a subtle change to j1, and then minimising overall relative
error (weighted with respect to cumulant order) via the simplex algorithm, has the potential for[10] R. Hengeveld, Dynamics of Biological Invasions, Chapman & Hall, London, 1984.
Soc. Lond. A 459 (2003) 623.
[12] O. Lieberman, The effect of nonnormality, Econ. Th. 13 (1997) 52.
[13] G. Marion, E. Renshaw, G. Gibson, Stochastic effects in a model of nematode infection in ruminants, IMA J.
Math. Appl. Med. Biol. 15 (1998) 97.
[14] J.O. Matthews, K.I. Hopcraft, E. Jakeman, Generation and monitoring of discrete stable random processes using
multiple immigration population models, J. Phys. A 36 (2003) 11585.
[15] J.H. Matis, T.R. Kiffe, P.R. Parthasarathy, Density-dependent birth–death-migration models, in: XVIII
International Biometric Conference, Amsterdam, I5, 1996, p. 79.
[16] J.H. Matis, T.R. Kiffe, On approximating the moments of the equilibrium distributions of a stochastic logistic
model, Biometrics 52 (1996) 980.
[17] J.H. Matis, T.R. Kiffe, E. Renshaw, J. Hassan, A simple saddlepoint approximation for the equilibrium
distribution of the stochastic logistic model of population growth, Ecol. Model. 161 (2003) 239.
[18] J.A. Nelder, R. Mead, A simplex method for function minimization, Comp. J. 7 (1965) 308.
[19] E. Renshaw, Modelling Biological Populations in Space and Time, Cambridge University Press, Cambridge, MA,
1991.
[20] E. Renshaw, Saddlepoint approximations for stochastic processes with truncated cumulant generating functions,
IMA J. Math. Appl. Med. Biol. 15 (1998) 41.
[21] E. Renshaw, Applying the saddlepoint approximation to bivariate stochastic processes, Math. Biosci. 168 (2000)
57.
[22] E. Renshaw, A. Chen, Birth–death processes with mass annihilation and state-dependent immigration, Stoch.
Models 13 (1997) 239.
[23] W. Spendley, G.R. Hext, F.R. Himsworth, Sequential applications of the simplex designs on optimization and
374 C.S. Gillespie, E. Renshaw / Mathematical Biosciences 208 (2007) 359–374evolutionary operation, Technometrics 4 (1962) 441.
[24] W.Y. Tan, H. Hsu, Some stochastic models of AIDS spread, Stat. Med. 8 (1989) 121.
[25] S. Wang, General saddlepoint approximations in the bootstrap, Prob. Stat. Lett. 27 (1992) 61.
Sign up today - FREE
Mendeley saves you time finding and organizing research. Learn more
- All your research in one place
- Add and import papers easily
- Access it anywhere, anytime


