Typical vs. constrained realisations

Traditional bootstrap methods use explicit model equations that have to be extracted from the data and are then run to produce Monte Carlo samples. This typical realisations approach can be very powerful for the computation of confidence intervals, provided the model equations can be extracted successfully. The latter requirement is very delicate. Ambiguities in selecting the proper model class and order, as well as the parameter estimation problem, have to be addressed. Whenever the null hypothesis involves an unknown function (rather than just a few parameters), these problems become profound. A recent example of a typical realisations approach to creating surrogates in the dynamical systems context is given by Ref. [24]. There, a Markov model is fitted to a coarse-grained dynamics obtained by binning the two-dimensional delay vector distribution of a time series. Then, essentially, the transfer matrix is iterated to yield surrogate sequences. We will offer some discussion of that work later in Sec. 7.
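To make the idea of a typical realisation from a fitted coarse-grained model concrete, the following is a minimal sketch and not the procedure of Ref. [24]. It assumes equal-width bins, a delay of one sample (so that binning the two-dimensional delay vectors amounts to counting transitions between consecutive bins), and output values placed at bin centres. The function name markov_surrogate and its parameters are hypothetical.

```python
import random

def markov_surrogate(x, n_bins=8, length=None, seed=0):
    """Fit a first-order Markov chain to a binned series and iterate it.

    Assumptions (illustrative only): equal-width bins, delay one,
    surrogate values reported as bin centres.
    """
    rng = random.Random(seed)
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series

    def bin_of(v):
        # Map a value to its bin index; the maximum goes into the last bin.
        return min(int((v - lo) / width), n_bins - 1)

    symbols = [bin_of(v) for v in x]

    # Transition counts between consecutive bins: a histogram of the
    # two-dimensional delay vectors (s_n, s_{n+1}).
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(symbols, symbols[1:]):
        counts[a][b] += 1

    length = len(x) if length is None else length
    state = symbols[0]
    out = []
    for _ in range(length):
        out.append(lo + (state + 0.5) * width)
        row = counts[state]
        if sum(row) == 0:
            # Bin with no observed successor: restart from a visited bin.
            state = rng.choice(symbols)
        else:
            # Draw the next bin with the estimated transition probabilities.
            state = rng.choices(range(n_bins), weights=row)[0]
    return out
```

Even in this toy form, the model-selection ambiguities mentioned above are visible: the number of bins and the delay have to be chosen before any surrogate can be drawn, and the surrogates reproduce the data only as well as the fitted model does.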

As discussed by Theiler and Prichard [25], the alternative approach of constrained realisations is more suitable for the purpose of hypothesis testing we are interested in here. It avoids the fitting of model equations by directly imposing the desired structures onto the randomised time series. However, the choice of possible null hypotheses is limited by the difficulty of imposing arbitrary structures on otherwise random sequences. In the following, we will discuss a number of null hypotheses and algorithms to provide the adequately constrained realisations. The most general method to generate constrained randomisations of time series [26] is described in Sec. 5.

Consider as a toy example the null hypothesis that the data consist of independent draws from a fixed probability distribution. Surrogate time series can be obtained simply by randomly shuffling the measured data. If we find significantly different serial correlations in the data and the shuffles, we can reject the hypothesis of independence. Constrained realisations are obtained by creating permutations, that is, by sampling the data without replacement. The surrogates are constrained to take on exactly the same values as the data, just in random temporal order. We could also have used the data to infer the probability distribution and drawn new time series from it. Such samples drawn with replacement would then be what we called typical realisations.
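The toy example can be sketched in a few lines. This is an illustration under stated assumptions rather than a prescribed implementation: the lag-one autocorrelation is taken as the measure of serial correlation, the empirical distribution of the data stands in for the inferred probability distribution, and the significance is estimated by a simple two-sided rank count. All function names are hypothetical.

```python
import random

def lag1_autocorr(x):
    """Lag-one autocorrelation coefficient of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return cov / var

def constrained_surrogate(x, rng):
    """Random permutation: exactly the values of x, in random order."""
    return rng.sample(x, len(x))

def typical_surrogate(x, rng):
    """Draws with replacement from the empirical distribution of x."""
    return rng.choices(x, k=len(x))

def shuffle_test(x, n_surr=99, seed=0):
    """Test independence against constrained (shuffled) surrogates.

    Returns the statistic of the data and a rank-based p-value.
    """
    rng = random.Random(seed)
    t_data = abs(lag1_autocorr(x))
    t_surr = [abs(lag1_autocorr(constrained_surrogate(x, rng)))
              for _ in range(n_surr)]
    n_ge = sum(t >= t_data for t in t_surr)
    return t_data, (n_ge + 1) / (n_surr + 1)

# Example input: a serially correlated sequence (assumed AR(1) process).
rng = random.Random(1)
series = [0.0]
for _ in range(499):
    series.append(0.8 * series[-1] + rng.gauss(0.0, 1.0))

statistic, p_value = shuffle_test(series)
```

A constrained surrogate has exactly the same histogram as the data, whereas a typical surrogate generally repeats some values and omits others, so its sample mean and variance fluctuate from one realisation to the next.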

Obviously, independence is not an interesting null hypothesis for most time series problems. It becomes relevant, however, when the residual errors of a time series model are evaluated. For example, in the BDS test for nonlinearity [27], an ARMA model is fitted to the data. If the data are linear, then the residuals are expected to be independent. It has been pointed out, however, that the resulting test is not particularly powerful for chaotic data [28].

Thomas Schreiber
Mon Aug 30 17:31:48 CEST 1999