next up previous
Next: Rescaled Gaussian linear process Up: Surrogate time series Previous: Test design

Fourier-based surrogates

  In this section, we will discuss a hierarchy of null hypotheses and the issues that arise when creating the corresponding surrogate data. The simpler cases are discussed first in order to illustrate the reasoning. If we have found serial correlations in a time series, that is, rejected the null hypothesis of independence, we may ask of what nature these correlations are. The simplest possibility is to explain the observed structures by linear two-point autocorrelations. A corresponding null hypothesis is that the data have been generated by some linear stochastic process with Gaussian increments. The most general univariate linear process is given by
 \[ s_n = \sum_{i=1}^{M} a_i\, s_{n-i} \;+\; \sum_{i=0}^{N} b_i\, \eta_{n-i} \]
where $\{\eta_n\}$ are Gaussian uncorrelated random increments. The statistical test is complicated by the fact that we do not want to test against one particular linear process only (one specific choice of the $a_i$ and $b_i$), but against a whole class of processes. This is called a composite null hypothesis. The unknown values $a_i$ and $b_i$ are sometimes referred to as nuisance parameters. There are basically three directions we can take in this situation. First, we could try to make the discriminating statistic independent of the nuisance parameters. This approach has not been demonstrated to be viable for any but some very simple statistics. Second, we could determine which linear model is most likely realised in the data by fitting the coefficients $a_i$ and $b_i$, and then test against the hypothesis that the data have been generated by this particular model. Surrogates are simply created by running the fitted model. This typical realisations approach is the common choice in the bootstrap literature; see e.g. the classical book by Efron [29]. The main drawback is that we cannot recover the true underlying process by any fitting procedure. Apart from the problems associated with choosing the correct model orders M and N, the data are by construction a very likely realisation of the fitted process. Other realisations will fluctuate around the data, which induces a bias against the rejection of the null hypothesis. This issue is discussed thoroughly in Ref. [8], where a calibration scheme is also proposed.
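To make the typical-realisations approach concrete, here is a minimal sketch in Python (NumPy assumed): a pure AR model is fitted to the data by ordinary least squares, and surrogates are obtained by running the fitted model with fresh Gaussian increments. The function names and the simple least-squares fit are our own illustration, not a prescription from the text.

```python
import numpy as np

def fit_ar(x, order):
    """Least-squares fit of AR coefficients a_1..a_M (illustrative sketch).

    Returns the coefficients and the standard deviation of the residuals,
    which estimates the standard deviation of the increments eta_n."""
    # Row for target x[n] holds the lagged values x[n-1], ..., x[n-order].
    X = np.column_stack([x[order - i - 1:len(x) - i - 1] for i in range(order)])
    y = x[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    return coeffs, resid.std()

def typical_realisation(coeffs, sigma, n, burn=100):
    """Run the fitted AR model with fresh Gaussian increments (one surrogate)."""
    order = len(coeffs)
    s = np.zeros(n + burn)
    eta = np.random.normal(0.0, sigma, n + burn)
    for i in range(order, n + burn):
        # coeffs[0] multiplies the most recent value s[i-1], etc.
        s[i] = coeffs @ s[i - order:i][::-1] + eta[i]
    return s[burn:]               # discard transient from the zero initial state
```

For a full ARMA model with a moving-average part the fit is more involved; this sketch covers only the pure AR case, which is enough to illustrate why each surrogate fluctuates around the fitted model rather than reproducing the data.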

The most attractive approach to testing for a composite null hypothesis seems to be to create constrained realisations [25]. Here it is useful to think of the measurable properties of the time series rather than its underlying model equations. The null hypothesis of an underlying Gaussian linear stochastic process can also be formulated by stating that all structure to be found in a time series is exhausted by computing first and second order quantities, the mean, the variance and the auto-covariance function. This means that a randomised sample can be obtained by creating sequences with the same second order properties as the measured data, but which are otherwise random. When the linear properties are specified by the squared amplitudes of the (discrete) Fourier transform
 \[ |S_k|^2 = \left| \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} s_n\, e^{i 2\pi k n / N} \right|^2 \]
that is, the periodogram estimator of the power spectrum, surrogate time series $\bar{s}_n$ are readily created by multiplying the Fourier transform of the data by random phases and then transforming back to the time domain:
 \[ \bar{s}_n = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{i\alpha_k}\, |S_k|\, e^{-i 2\pi k n / N} \]
where $\alpha_k \in [0, 2\pi)$ are independent uniform random numbers. (For the surrogate $\bar{s}_n$ to be real, the phases have to be chosen symmetrically, $\alpha_{N-k} = -\alpha_k$.)
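The phase-randomisation scheme just described can be sketched in a few lines of Python (NumPy assumed; a minimal illustration, not the TISEAN implementation). Using the real-valued FFT pair `rfft`/`irfft` enforces the required phase symmetry automatically, so the surrogate comes out real:

```python
import numpy as np

def ft_surrogate(x, seed=None):
    """Surrogate with the same periodogram as x but randomised Fourier phases.

    A minimal sketch of the phase-randomisation step; the function name and
    interface are our own illustration."""
    rng = np.random.default_rng(seed)
    n = len(x)
    S = np.fft.rfft(x)                       # S_k for k = 0 .. n//2
    alpha = rng.uniform(0.0, 2.0 * np.pi, len(S))
    alpha[0] = np.angle(S[0])                # pin the zero-frequency phase: mean unchanged
    if n % 2 == 0:
        alpha[-1] = np.angle(S[-1])          # Nyquist component must stay real
    # Keep the amplitudes |S_k|, replace the phases, transform back.
    return np.fft.irfft(np.abs(S) * np.exp(1j * alpha), n)
```

The zero-frequency (and, for even length, Nyquist) phases are pinned so that the mean is preserved and the inverse transform is real; all other amplitudes $|S_k|$, and hence the periodogram and the variance, are preserved exactly.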





Thomas Schreiber
Mon Aug 30 17:31:48 CEST 1999