'Are we nearly there?' Alice managed to pant out at last.
'Nearly there!' the Queen repeated. 'Why, we passed it 
ten minutes ago!  Faster!'  Through the Looking Glass, 
Lewis Carroll.

12.  SUMMARY, DISCUSSION, MAIN CONCLUSIONS AND 
RECOMMENDATIONS
12.1  SUMMARY

In the early 1960s, Kalman made an important 
contribution to the constant struggle to extract signal 
from noise (or in other words, to extract information 
from data).  This paper shows that the Kalman filter is a 
generalization of recursive Bayesian estimation, which is 
itself the repeated application of Bayes' theorem.  It is 
also shown that recursive Bayesian estimation is a 
generalization of ordinary least squares regression, and 
so the Kalman filter has its roots very deep in the past. 
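
The chain from least squares to the Kalman filter can be 
illustrated with a minimal sketch in Python (this is not the 
FORTRAN software of this paper).  It assumes a single 
explanatory variable with no intercept, a scalar parameter 
whose mean and variance are written m and c after the M and C 
used here, and a transition H of one;  the function name and 
the data are invented for the illustration.  With W set to 
zero and a very vague prior, the recursive estimate should 
reproduce the OLS slope.

```python
def recursive_bayes(xs, ys, m0=0.0, c0=1e6, v=1.0, w=0.0):
    """Scalar recursive Bayesian estimate of b in y = x*b + noise.

    m0, c0: prior mean and variance of b (a large c0 is a vague prior)
    v: observation noise variance; w: random-walk variance of b
    """
    m, c = m0, c0
    for x, y in zip(xs, ys):
        r = c + w                # variance of b before seeing y (drift added)
        q = x * r * x + v        # variance of the one-step-ahead forecast
        k = r * x / q            # gain: how far the error moves the estimate
        m = m + k * (y - x * m)  # Bayes update of the mean
        c = r - k * x * r        # Bayes update of the variance
    return m, c


# Illustrative data only (not taken from the paper)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]

ols_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
m, c = recursive_bayes(xs, ys)   # w = 0: the parameter is assumed fixed
print(ols_slope, m)              # these agree up to the influence of the prior
```

Raising w above zero in the same function turns the recursive 
estimate into a (scalar) Kalman filter with a drifting 
parameter.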

Five problems have been worrying me for several years;  
so much so that for a while I believed that two of them 
(what to do when the last few residuals are 
autocorrelated, and how to treat preliminary data) had no 
fully satisfactory resolution, even though they are very 
important to the practical business forecaster.  The five 
problems are described in chapter one, and the way that 
recursive Bayesian estimation can cope with them is 
described in chapter three. 

Recursive Bayesian estimation is used to estimate a very 
simple model in chapter five;  only one dependent and only 
one explanatory variable are treated.  But the results 
were encouraging enough to justify treating multiple 
dependent and multiple explanatory variables, and the 
theory for this is developed in chapter 6. 


At this point, the notation and theory are sufficiently 
well explained to make it possible to examine past work 
in this area.  Although much use has been made of the 
Kalman filter in process control, it has been used only a 
little by statisticians, and barely at all by 
econometricians.  I know of some 20 econometric/time 
series analysis packages, and none of them offer Kalman 
filtering as an option, except the ones implementing the 
Harrison/Stevens technique, which inhibits causal 
modelling. 

But does the Kalman filter estimate better models than 
ordinary least squares?  Chapter eight addresses this 
question by generating artificial data with known 
properties, and then estimating the same model using the 
two techniques.  The conclusion is that the Kalman filter 
does give better models, but not by much for the range of 
models studied. 

Chapter nine is a practical application of the Kalman 
filter to an energy demand model, showing how a Kalman 
filter estimation of two fairly straightforward models 
compares with the ordinary least squares estimation. 

The longest chapter of this paper is chapter ten.  It 
deals with the estimation of a wool consumption model.  
Most of the features available with Kalman filtering are 
used, and the effects of: 
       - zero, non-zero or variable W 
       - giving older and preliminary data less precision 
       - prior information 
       - incorporating information from another source 
         (in this case, a time-series of total fibre 
         consumption) 
are all demonstrated.  The dynamic sum of squared errors 
(DSSE) from the best model is less than a fifth of the 
OLS DSSE.  A wide variety of assumptions about V, W and 
prior data are used;  the conclusion is that the 
improvements offered by the Kalman filter are robust to 
these assumptions.


12.2  DISCUSSION


The Kalman filter is difficult to grasp intuitively;  it 
was quite hard at first for me to see what all those 
matrices were for and what they were doing.  Approached 
via recursive Bayesian estimation, it is much easier to 
see what is going on (I re-invented recursive Bayesian 
estimation before I found that the Kalman filter had 
already been invented).  I have not yet found a textbook 
(Chow, 1981; Harvey, 1981a; Harvey, 1981b; and Maddala, 
1977 included) that explains the Kalman filter in terms 
that are familiar to econometricians, and this is 
probably one of the reasons why so few papers in the 
econometric literature use it, although in the last few 
years this has begun to change.  Another reason 
must be the non-availability of software.  It is hoped 
that this paper goes some way towards remedying both of 
these problems;  the FORTRAN Kalman filter software is 
offered in 11.1 above and in appendix D, and Appendix G 
has a program written for a Texas Instruments 
programmable calculator which copes with one dependent 
variable and two explanatory variables. 

The Kalman filter offers a modest improvement in 
forecasting ability over OLS-estimated models in most 
cases, a small worsening in some, and a dramatic 
improvement in others.  It makes it easier to feed any 
available prior information into the estimating process, 
to use information available from other time series (in 
chapter ten, the total fibre consumption series is used), 
and to give much less weight to unreliable (because 
preliminary or because very old) data.  It also has the 
advantage of being able to allow parameters to 
random-walk over time, and to allow some parameters to 
walk faster than others.  These advantages make it 
possible for the Kalman filter to estimate the same model 
as ordinary least squares, but with an improved dynamic 
sum of squared errors (although in the case of UK 
industrial energy demand, the Kalman filter was worse 
than OLS, despite many trials).  

The DSSE used throughout this paper has one major 
limitation;  it is calculated on one-step-ahead 
forecasts.  This means that errors are not allowed to 
accumulate in the Kalman filter, but are corrected in the 
period after the forecast is made.  The same is true for 
OLS;  in the simulations of chapter 8, the forecasts made 
using both techniques were only one step ahead.  But it 
could be that the Kalman filter benefits more than OLS 
from the fact that only one-step-ahead forecasts are 
made, so the results must be viewed in this light.  
Further work in this field could use a more distant 
forecast horizon to compare the forecasting ability of 
the two techniques.
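
The one-step-ahead calculation can be sketched as follows.  
This is a minimal scalar illustration in Python, not the code 
used for the results in this paper:  the function name, the 
`horizon` and `start` arguments and the data are assumptions 
made for the example.  With `horizon=1`, each forecast is 
scored and then immediately corrected by the next observation, 
as described above.  A larger `horizon` scores each 
observation against the estimate made that many periods 
earlier, which is one way the more distant comparison might 
be set up (it relies on H being the identity, so that the 
forecast parameter is simply the latest estimate).

```python
def dsse(xs, ys, horizon=1, start=0, m0=0.0, c0=1e6, v=1.0, w=1e-4):
    """Sum of squared forecast errors for y = x*b + noise, b a random walk.

    The forecast of ys[t] uses only the estimate of b available
    `horizon` periods earlier; periods before `start` are not scored.
    """
    m, c = m0, c0
    estimates = [m0]             # estimates[i] = mean of b after i observations
    total = 0.0
    for t, (x, y) in enumerate(zip(xs, ys)):
        i = t - horizon + 1      # observations seen when the forecast was made
        if i >= 0 and t >= start:
            total += (y - x * estimates[i]) ** 2
        # Only after scoring is the observation used to update the estimate
        r = c + w
        q = x * r * x + v
        k = r * x / q
        m = m + k * (y - x * m)
        c = r - k * x * r
        estimates.append(m)
    return total


# Illustrative data only: y is exactly twice x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(dsse(xs, ys, m0=2.0))             # prior already correct
print(dsse(xs, ys, w=0.0))              # vague prior: first forecast is poor
print(dsse(xs, ys, w=0.0, start=1))     # same, first period not scored
```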

The statistical significance of estimates of parameters 
using the Kalman filter with non-zero W is much less than 
the significance that we have come to expect of 
OLS-estimated models.  This is not intrinsic to the 
Kalman filter, nor is it saying that the Kalman filter 
produces inferior (because less precise) parameter 
estimates.  It is very simply a reflection of the fact 
that if we admit that the parameters of our models are 
not totally rigid, but are somewhat movable, then we are 
able to be less precise about where they are.  The more 
movable we allow the parameters to be, the less we can 
say about them. 
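
A small numerical sketch makes the point.  It is an 
illustration in Python under simplifying assumptions (one 
parameter, an explanatory variable fixed at 1, V = 1, and a 
value of W chosen to make the effect visible rather than 
taken from this paper);  the function name is hypothetical.  
With W = 0 the variance of the estimate keeps falling as 
observations accumulate, while with W > 0 it settles at a 
floor, because each period's drift adds back some 
uncertainty.

```python
def posterior_variance(n, w, c0=1e6, v=1.0, x=1.0):
    """Variance of the parameter estimate after n observations."""
    c = c0
    for _ in range(n):
        r = c + w                                # drift inflates the variance
        c = r - (r * x) ** 2 / (x * r * x + v)   # the observation reduces it
    return c


for n in (10, 100, 1000):
    fixed = posterior_variance(n, w=0.0)      # rigid parameter
    drifting = posterior_variance(n, w=1e-2)  # movable parameter
    print(n, fixed, drifting)
```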

The contention is that parameters do drift, and the 
exactitude with which econometricians are wont to quote 
their parameter estimates is usually unwarranted, as it 
is based on the assumption that they do not drift.  This 
contention is unprovable, as we do not know how the real 
(as opposed to the estimated) parameters behave, but it 
is quite plausible, and is commonly held by 
non-econometricians. 

The values used for W (governing the rate at which old 
data become irrelevant) are not too critical to the 
parameter estimates (an order of magnitude difference in 
W seems to have little effect) but the W should not be 
too large, or the Kalman filter is not able to carry very 
much information forward from year to year, as it 
"forgets" too fast what it has learned. 

The values used for V are likewise not critical, although 
it is important that non-zero values be used, especially 
if W is very small or zero. 

In this paper, suitable W's have been found to lie in the 
range 10^-3 to 10^-5 for a wide variety of models.  This 
may be a consequence of the kind of modelling being done 
(i.e. annual data, demand models), but it at least 
suggests that future work could start with a W of 10^-4. 



The problem of selecting W's to use in the Kalman filter 
is rather difficult.  In a four-parameter regression, 
there are 10 numbers to be selected for the 4 x 4 W-matrix 
(the other 6 of its 16 elements come from the symmetry of 
the matrix).  Clearly 
it is not going to be often that these 10 elements of the 
W-matrix will be estimable simultaneously with the four 
parameters.  So the W-matrix will usually have to be 
specified by the analyst, from considerations of how fast 
he thinks the parameters are likely to drift.  The 
experimentation with W in chapter 10, however, does show 
that the choice of W is not overly critical.  The 
specification of the off-diagonal elements of W is even 
more problematic than the diagonal elements, and this 
paper has not even attempted to treat this problem, 
except superficially.
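
The count can be checked, and one simple way of filling in W 
illustrated, with the short Python sketch below.  Setting 
every off-diagonal element to zero, so that only one drift 
variance per parameter has to be chosen, is an assumption 
made here for illustration;  the function names and the 
drift values are likewise hypothetical.

```python
def free_elements(n):
    """Number of distinct elements in a symmetric n x n matrix."""
    return n * (n + 1) // 2


def diagonal_w(drift_variances):
    """W with the given variances on the diagonal and zeros elsewhere."""
    n = len(drift_variances)
    return [[drift_variances[i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


print(free_elements(4))          # 4 diagonal + 6 above the diagonal

# Allow the four parameters to drift at different rates
W = diagonal_w([1e-3, 1e-4, 1e-4, 1e-5])
for row in W:
    print(row)
```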

The same problem applies to H.  Throughout this paper, H 
has been assumed to be the identity matrix, but McWhorter 
et al. (1976) show that the effects of a misspecification 
of H are serious.

The contribution of the Kalman filter to forecasting has 
so far been slight.  This is partly because of the 
newness of the technique, but also because of the lack of 
packages available to econometricians and statisticians 
that make the use of the Kalman filter as painless as is 
currently the case with OLS.  It may also be that a 
number of researchers have tried using the Kalman filter, 
but found it inferior to OLS, and simply not reported 
their results.  Some work has been done, however, and 
this is reported in chapter 7.  Opinion is divided as to 
the efficacy of the Kalman filter, but one clear message 
is that there is more that the analyst must specify, 
compared to OLS, and hence more to misspecify.  This 
misspecification, if gross, can lead to very bad 
forecasts.

The work of chapter eight showed that the Kalman filter 
copes better than ordinary least squares when parameters 
drift with time, but it also showed that the Kalman 
filter stands up well when parameters do not drift.  This 
is satisfying, as it means that the Kalman filter can be 
used to cope with suspected drifting parameters, and will 
not give absurd answers when the reality is constant 
parameters. 

Weighted least squares or discounted least squares (which 
is sometimes suggested as a way of coping with one or 
more of the problems listed in chapter one) was examined, 
but was found to give an inferior performance to 
recursive Bayesian estimation, and to be highly sensitive 
to choice of weights.  Also, it is necessary to specify a 
single weight per time-period, and so it is not possible 
to allow the different parameters of the model to be 
affected differently. 
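
For comparison, discounted least squares can be sketched as 
below for a single explanatory variable with no intercept.  
The geometric form of the weights, the name `discount` and 
the data are assumptions made for this illustration rather 
than the scheme examined in this paper.  The single weight 
per period applies to the whole observation, so there is no 
counterpart to giving each parameter its own entry in W.

```python
def discounted_ls(xs, ys, discount):
    """Slope of y = x*b fitted with weight discount**age on each period."""
    last = len(xs) - 1
    weights = [discount ** (last - t) for t in range(len(xs))]
    sxy = sum(wt * x * y for wt, x, y in zip(weights, xs, ys))
    sxx = sum(wt * x * x for wt, x in zip(weights, xs))
    return sxy / sxx


# Illustrative data only: the level shifts in the last period
xs = [1.0, 1.0, 1.0, 1.0]
ys = [0.0, 0.0, 0.0, 4.0]
print(discounted_ls(xs, ys, 1.0))   # a discount of 1 is ordinary least squares
print(discounted_ls(xs, ys, 0.5))   # recent data weighted more heavily
```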


12.3  MAIN CONCLUSIONS 


The Kalman filter can be understood and used fairly 
easily by econometricians provided it is explained in 
terms that they understand, and especially if it is 
incorporated into an easy-to-use package.  

More effort is required to estimate a Kalman filter 
model, as the econometrician has to set up various 
matrices (such as V, W and prior information, where 
available).  There is more that must be specified, and 
therefore more that can be misspecified. 

The Kalman filter will cope with the five problems 
described in 1.1 above, and will do so more easily and 
naturally than conventional methods. 

The Kalman filter is more appropriate where data are of a 
highly variable quality, or where there is strong prior 
information on the parameters, or where old data are 
considerably less relevant to the current values of the 
parameters than are recent data. 

The Kalman filter's recognition of the variable precision 
of the data can lead to better models. 

The Kalman filter will often estimate a model which will 
forecast better than OLS.  Sometimes the forecasts will 
be much better, sometimes they will be slightly worse. 

A good choice for W might lie in the range 10^-3 to 10^-5. 




12.4  RECOMMENDATIONS


1.  The Kalman filter should be incorporated into 
existing econometric packages.
 

2.  The Kalman filter should be covered in econometrics 
courses. 

3.  Econometricians should re-estimate some of the models 
that they currently estimate using conventional 
techniques, using whatever prior information they have 
available.  They can then decide for themselves whether 
the Kalman filter works better for their particular 
applications, and whether it is worth the extra trouble. 

4.  When Kalman filter estimated models are written up, 
the values of V and W, and the starting values for M and 
C should be reported, whether these are estimated, 
assumed, or from prior information.