Handbook on Constructing Composite Indicators: Methodology and User Guide
- ISBN: 9789264043459
- DOI: 10.1787/9789264043466-en
Abstract
This Handbook is a guide for constructing and using composite indicators for policy makers, academics, the media and other interested parties. While there are several types of composite indicators, this Handbook is concerned with those which compare and rank country performance in areas such as industrial competitiveness, sustainable development, globalisation and innovation. The Handbook aims to contribute to a better understanding of the complexity of composite indicators and to an improvement of the techniques currently used to build them. In particular, it contains a set of technical guidelines that can help constructors of composite indicators to improve the quality of their outputs.
Table 16. Eigenvalues of TAI data set

Component  Eigenvalue  Variance (%)  Cumulative variance (%)
1          3.3         41.9          41.9
2          1.7         21.8          63.7
3          1.0         12.3          76.0
4          0.9         11.1          87.2
5          0.5          6.0          93.2
6          0.3          3.7          96.9
7          0.2          2.2          99.1
8          0.1          0.9         100
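
As a rough illustration of how the two percentage columns of Table 16 relate to the eigenvalues, the sketch below divides each eigenvalue by their sum and accumulates the shares. The variable names are illustrative, and because the printed eigenvalues are rounded to one decimal place, the computed shares only approximate the published ones (to within about one percentage point).

```python
import numpy as np

# Eigenvalues as printed in Table 16 (rounded to one decimal place).
eigenvalues = np.array([3.3, 1.7, 1.0, 0.9, 0.5, 0.3, 0.2, 0.1])

# Share of the total variance attributed to each component, in percent.
variance_pct = 100 * eigenvalues / eigenvalues.sum()

# Running total of the shares; the last entry is 100 by construction.
cumulative_pct = np.cumsum(variance_pct)

for k, (ev, v, cv) in enumerate(zip(eigenvalues, variance_pct, cumulative_pct), start=1):
    print(f"{k}  {ev:.1f}  {v:5.1f}  {cv:5.1f}")
```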
The third step deals with the rotation of factors (Table 17). The rotation (usually the varimax
rotation) is used to minimise the number of individual indicators that have a high loading on the same
factor. The idea behind transforming the factorial axes is to obtain a “simpler structure” of the factors
(ideally a structure in which each indicator is loaded exclusively on one of the retained factors). Rotation
is a standard step in factor analysis: it changes the factor loadings, and hence the interpretation of the
factors, while leaving unchanged the analytical solutions obtained before and after the rotation.
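
A minimal sketch of an orthogonal varimax rotation may help make this step concrete. It uses a common SVD-based iteration, not necessarily the routine used to produce Table 17; the function name `varimax`, its parameters and the random input matrix are all assumptions made for illustration. The example checks one property of any orthogonal rotation: the row sums of squared loadings (the communalities) are the same before and after rotation.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Return (rotated_loadings, rotation_matrix) for an orthogonal varimax rotation."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation matrix.
        col_ss = (rotated**2).sum(axis=0)
        grad = loadings.T @ (rotated**3 - rotated @ np.diag(col_ss) / p)
        # Project the gradient back onto the set of orthogonal matrices.
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_criterion = s.sum()
        # Stop once the criterion no longer improves noticeably.
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings: 8 indicators on 4 factors.
rng = np.random.default_rng(0)
unrotated = rng.normal(size=(8, 4))
rotated, rotation = varimax(unrotated)

communalities_before = (unrotated**2).sum(axis=1)
communalities_after = (rotated**2).sum(axis=1)
```

The individual loadings in `rotated` differ from those in `unrotated`, but `communalities_before` and `communalities_after` agree up to floating-point error.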
Table 17. Factor loadings of TAI based on principal components

              Factor loading                          Squared factor loading (scaled to unity sum)
                Factor 1  Factor 2  Factor 3  Factor 4  Factor 1  Factor 2  Factor 3  Factor 4
Patents             0.07      0.97      0.06      0.06      0.00      0.67      0.00      0.00
Royalties           0.13      0.07     -0.07      0.93      0.01      0.00      0.00      0.49
Internet            0.79     -0.21      0.21      0.42      0.24      0.03      0.04      0.10
Tech exports       -0.64      0.56     -0.04      0.36      0.16      0.23      0.00      0.07
Telephones          0.37      0.17      0.38      0.68      0.05      0.02      0.12      0.26
Electricity         0.82     -0.04      0.25      0.35      0.25      0.00      0.05      0.07
Schooling           0.88      0.23     -0.09      0.09      0.29      0.04      0.01      0.00
University          0.08      0.04      0.96      0.04      0.00      0.00      0.77      0.00
Expl.Var            2.64      1.39      1.19      1.76
Expl./Tot           0.38      0.20      0.17      0.25

Note: Expl.Var is the variance explained by the factor and Expl./Tot is the explained variance divided by the total variance of the four factors.
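
The derived quantities in Table 17 can be approximately recomputed from the factor loadings alone. The sketch below squares the loadings, sums each column to obtain Expl.Var, divides by the total to obtain Expl./Tot, and rescales each column to unit sum. Variable names are illustrative, and since the printed loadings are rounded to two decimals, the results match the table only to within about 0.01.

```python
import numpy as np

indicators = ["Patents", "Royalties", "Internet", "Tech exports",
              "Telephones", "Electricity", "Schooling", "University"]

# Rotated factor loadings from the left half of Table 17
# (rows: indicators, columns: factors 1 to 4).
loadings = np.array([
    [ 0.07,  0.97,  0.06, 0.06],
    [ 0.13,  0.07, -0.07, 0.93],
    [ 0.79, -0.21,  0.21, 0.42],
    [-0.64,  0.56, -0.04, 0.36],
    [ 0.37,  0.17,  0.38, 0.68],
    [ 0.82, -0.04,  0.25, 0.35],
    [ 0.88,  0.23, -0.09, 0.09],
    [ 0.08,  0.04,  0.96, 0.04],
])

squared = loadings**2
expl_var = squared.sum(axis=0)         # variance explained by each factor
expl_tot = expl_var / expl_var.sum()   # share among the four retained factors
scaled = squared / expl_var            # each column now sums to one

print(np.round(expl_var, 2))
print(np.round(expl_tot, 2))
```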
The last step deals with the construction of the weights from the matrix of factor loadings after
rotation, given that the square of a factor loading represents the proportion of the total unit variance of the
indicator which is explained by the factor. The approach used by Nicoletti et al. (2000) is to group
the individual indicators with the highest factor loadings into intermediate composite
indicators. With the TAI data set there are four intermediate composites (Table 17). The first includes
Internet (with a weight of 0.24), electricity (weight 0.25) and schooling (weight 0.29). Likewise, the
second intermediate composite is formed by patents and technology exports (weights 0.67 and 0.23
respectively), the third only by university (0.77) and the fourth by royalties and telephones (weights 0.49 and 0.26).
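
The grouping described above can be sketched from the right half of Table 17. The assignment rule used here, where each indicator joins the factor on which its scaled squared loading is largest, is an assumption made for this illustration; it happens to reproduce the four intermediate composites and weights quoted in the text. Variable names are illustrative.

```python
import numpy as np

indicators = ["Patents", "Royalties", "Internet", "Tech exports",
              "Telephones", "Electricity", "Schooling", "University"]

# Squared factor loadings scaled to unity sum (right half of Table 17).
scaled_sq = np.array([
    [0.00, 0.67, 0.00, 0.00],
    [0.01, 0.00, 0.00, 0.49],
    [0.24, 0.03, 0.04, 0.10],
    [0.16, 0.23, 0.00, 0.07],
    [0.05, 0.02, 0.12, 0.26],
    [0.25, 0.00, 0.05, 0.07],
    [0.29, 0.04, 0.01, 0.00],
    [0.00, 0.00, 0.77, 0.00],
])

# Assumed rule: pick, for each indicator, the factor with the largest scaled squared loading.
best_factor = scaled_sq.argmax(axis=1)

# Map factor number (1 to 4) to {indicator: weight within that intermediate composite}.
groups = {}
for name, f, row in zip(indicators, best_factor, scaled_sq):
    groups.setdefault(int(f) + 1, {})[name] = float(row[f])

for factor in sorted(groups):
    print(factor, groups[factor])
```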
The four intermediate composites are aggregated by assigning to each of them a weight equal to
its share of the explained variance in the data set: 0.38 for the first (0.38 =
2.64/(2.64+1.39+1.19+1.76)), 0.20 for the second, 0.17 for the third and 0.25 for the fourth (Table 18).
Note that different methods for the extraction of principal components imply different weights, hence
different scores for the composite (and possibly different country rankings). For example, if Maximum
plus an error term, accounting, for example, for the error in the sampling of firms. Therefore, estimating
the unknown component sheds some light on the relationship between the composite and its components.
The weights obtained are set so as to minimise the error in the composite. This method resembles
well-known regression analysis. The main difference lies in the dependent variable, which is unknown under
UCM.
Let $ph(c)$ be the unknown phenomenon to be measured. The observed data consist in a cluster of
$q = 1, \ldots, Q(c)$ indicators, each measuring an aspect of $ph(c)$. Let $c = 1, \ldots, M(q)$ be the countries covered by
indicator $q$. The observed score of country $c$ on indicator $q$, $I(c,q)$, can be written as a linear function of
the unobserved phenomenon and an error term, $\varepsilon(c,q)$:

$$I(c,q) = \alpha(q) + \beta(q)\,[\,ph(c) + \varepsilon(c,q)\,] \qquad (23)$$

$\alpha(q)$ and $\beta(q)$ are unknown parameters mapping $ph(c)$ on $I(c,q)$.
The error term captures two sources of uncertainty. First, the phenomenon can be only imperfectly
measured or observed in each country (e.g. errors of measurement). Second, the relationship between
$ph(c)$ and $I(c,q)$ may be imperfect (e.g. $I(c,q)$ may be only a noisy indicator of the phenomenon if there
are differences between countries on the indicator). The error term $\varepsilon(c,q)$ is assumed to have a zero
mean, $E(\varepsilon(c,q)) = 0$, and the same variance across countries within a given indicator, but a different
variance across indicators, $E(\varepsilon(c,q)^2) = \sigma_q^2$; it also holds that $E(\varepsilon(c,q)\,\varepsilon(i,h)) = 0$ for $c \neq i$ or
$q \neq h$.
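
To make the model in equation (23) and the error assumptions concrete, the following sketch simulates indicator scores for a large hypothetical set of countries and checks that the simulated errors have roughly zero mean, indicator-specific variances and negligible correlation across indicators. All parameter values are invented, and drawing the phenomenon and the errors from normal distributions is an assumption of the sketch.

```python
import numpy as np

rng = np.random.default_rng(42)

n_countries = 20_000                   # hypothetical; large so sample moments are stable
alpha = np.array([0.2, -0.5, 1.0])     # hypothetical alpha(q)
beta = np.array([1.5, 0.8, 2.0])       # hypothetical beta(q)
sigma2 = np.array([0.25, 1.0, 4.0])    # hypothetical error variances sigma_q^2

# Unobserved phenomenon ph(c) and errors eps(c,q), independent across indicators.
ph = rng.normal(0.0, 1.0, size=n_countries)
eps = rng.normal(0.0, np.sqrt(sigma2), size=(n_countries, sigma2.size))

# Equation (23): I(c,q) = alpha(q) + beta(q) * [ph(c) + eps(c,q)]
indicator_scores = alpha + beta * (ph[:, None] + eps)

# Re-scaling the scores recovers ph(c) + eps(c,q); subtracting ph(c) isolates the error.
rescaled = (indicator_scores - alpha) / beta
recovered_eps = rescaled - ph[:, None]

print(np.round(recovered_eps.mean(axis=0), 3))
print(np.round(recovered_eps.var(axis=0), 3))
```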
The error term is assumed to be independent across indicators, given that each individual indicator
should ideally measure a particular aspect of the phenomenon independent of others. Furthermore, it is
usually assumed that $ph(c)$ is a random variable with zero mean and unit variance, and the indicators are
normalised using Min-Max to take values between zero and one. The assumption that $ph(c)$ and
$\varepsilon(c,q)$ are both normally distributed simplifies the estimation of the level of $ph(c)$ in country $c$. This is
done by using the mean of the conditional distribution of the unobserved component, once the observed
scores are appropriately re-scaled:
$$E[\,ph(c) \mid I(c,1), \ldots, I(c,Q(c))\,] = \sum_{q=1}^{Q(c)} w(c,q)\,\frac{I(c,q) - \alpha(q)}{\beta(q)} \qquad (24)$$
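
A minimal numerical sketch of the conditional mean in equation (24) follows. It assumes weights of the inverse-variance form of equation (25): the inverse of the error variance of indicator q, divided by one plus the sum of the inverse variances over the indicators available for the country. The function name `ucm_estimate` and all inputs are hypothetical, and the parameters are treated as known here, whereas in practice they would have to be estimated.

```python
import numpy as np

def ucm_estimate(scores, alpha, beta, sigma2):
    """Estimate ph(c) per country as a weighted sum of re-scaled indicator scores.

    scores: (countries, indicators) array; NaN marks a missing indicator.
    Returns (estimates, weights), where weights has the same shape as scores.
    """
    scores = np.asarray(scores, dtype=float)
    observed = ~np.isnan(scores)
    # Inverse variances, set to zero where the indicator is missing for a country.
    precision = np.where(observed, 1.0 / np.asarray(sigma2, dtype=float), 0.0)
    # The denominator differs by country when the sets of available indicators differ.
    weights = precision / (1.0 + precision.sum(axis=1, keepdims=True))
    # Re-scaled scores (I - alpha) / beta; missing entries contribute nothing.
    rescaled = np.where(observed, (scores - alpha) / beta, 0.0)
    return (weights * rescaled).sum(axis=1), weights

# Hypothetical example: two countries, three indicators, one missing value.
alpha = np.zeros(3)
beta = np.ones(3)
sigma2 = np.array([0.5, 1.0, 2.0])
scores = np.array([[1.0, 1.0, 1.0],
                   [1.0, 1.0, np.nan]])

estimates, weights = ucm_estimate(scores, alpha, beta, sigma2)
print(estimates)
print(weights)
```

In this example the second country lacks the third indicator, so its weights on the first two indicators are larger than those of the first country, which illustrates how missing data makes the weights country specific.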
The weights are equal to:

$$w(c,q) = \frac{\sigma_q^{-2}}{1 + \sum_{h=1}^{Q(c)} \sigma_h^{-2}} \qquad (25)$$
where $w(c,q)$ is a decreasing function of the variance of indicator $q$, and an increasing function of the
variance of the other indicators. The weight, $w(c,q)$, depends on the variance of indicator $q$ (numerator)
and on the variances of all the individual indicators, including $q$ (denominator).
However, since not all countries have data on all individual indicators, the denominator of $w(c,q)$ could
be country specific. This may produce non-comparability of country values for the composite, as in
BOD. Clearly, whenever the set of indicators is the same for all countries, weights will no longer be country
specific and comparability will be assured. The variance of the conditional distribution is given by:



