The null hypothesis: model class vs. properties

From the bootstrap literature we are used to defining null hypotheses for time series in terms of a class of processes that is assumed to contain the specific process that generated the data. For most of the literature on surrogate data, this situation has not changed. One very common null hypothesis goes back to Theiler and coworkers [6] and states that the data have been generated by a Gaussian linear stochastic process with constant coefficients. Constrained realisations are created by requiring that the surrogate time series have the same Fourier amplitudes as the data. This example shows clearly what the constrained realisations approach needs: a set of observable properties that is known to fully specify the process. The process itself is not reconstructed.

But this example is also exceptional. We know that the class of processes defined by the null hypothesis is fully parametrised by the set of ARMA(M,N) models (autoregressive moving average, see Eq.(6) below). If we allow for arbitrary orders M and N, there is a one-to-one correspondence between the ARMA coefficients and the power spectrum. The power spectrum is estimated here from the Fourier amplitudes, and the Wiener-Khinchin theorem relates it to the autocorrelation function by a simple Fourier transformation. Consequently, specifying the class of processes and specifying the set of constraints are two ways to achieve the same goal.

The only generalisation of this favourable situation that has been found so far is the null hypothesis that the ARMA output may have been observed through a static, invertible measurement function. In that case, constraining the single-time probability distribution and the Fourier amplitudes is sufficient.
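As an illustration of the Fourier-amplitude constraint for the Gaussian linear null hypothesis, the following is a minimal sketch of a phase-randomised surrogate. It is not taken from the text: the function name `ft_surrogate`, the use of NumPy, and the AR(1) test signal are assumptions made for the example. The sketch keeps the Fourier amplitudes of the data and draws the phases uniformly at random; the refinement for a static measurement function, which also constrains the single-time distribution, is not shown.

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """Return a surrogate series with the same Fourier amplitudes as x.

    Hypothetical helper for illustration: amplitudes are kept, phases
    are replaced by independent uniform random numbers in [0, 2*pi).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size

    # One-sided spectrum of the real-valued data.
    spectrum = np.fft.rfft(x)
    amplitudes = np.abs(spectrum)

    # New spectrum: original amplitudes, random phases.
    phases = rng.uniform(0.0, 2.0 * np.pi, size=amplitudes.size)
    new_spectrum = amplitudes * np.exp(1j * phases)

    # The zero-frequency term (and the Nyquist term for even n) must stay
    # real for a real-valued series, so these are copied unchanged.
    new_spectrum[0] = spectrum[0]
    if n % 2 == 0:
        new_spectrum[-1] = spectrum[-1]

    # Back to the time domain; n is passed so odd lengths are handled.
    return np.fft.irfft(new_spectrum, n=n)

# Example data (an assumption for the demo): an AR(1) process.
rng = np.random.default_rng(0)
noise = rng.standard_normal(1024)
x = np.zeros(1024)
for t in range(1, 1024):
    x[t] = 0.8 * x[t - 1] + noise[t]

surrogate = ft_surrogate(x, rng)
```

Because the amplitudes are copied exactly, the periodogram of `surrogate` coincides with that of `x`, and by the Wiener-Khinchin relation so does the circular estimate of the autocorrelation function. Only the phases, which carry any structure beyond the linear correlations, differ.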

If we want to go beyond this hypothesis, all we can do in general is to specify the set of constraints we will impose. We cannot usually say which class of processes this choice corresponds to. We will have to be content with statements that a given set of statistical parameters exhaustively describes the statistical properties of a signal. Hypotheses in terms of a model class are usually more informative, but specifying sets of observables gives us much more flexibility.

Thomas Schreiber
Mon Aug 30 17:31:48 CEST 1999