I will first give a short review of "Independent Component Analysis" (ICA), which should more accurately be called "Least dependent Component Analysis". This is a method, generalizing the well-known principal component analysis, to find least
dependent components in a linearly mixed signal, based on some non-linear statistical dependency or contrast measure. Thus, a basic ingredient in this analysis is a reliable measure of statistical dependencies. A natural candidate is, as was pointed out by many authors, mutual information (MI). But MI is not easy to estimate. I propose to use recently developed precise MI estimators, based on k-nearest neighbour statistics.
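To make the k-nearest-neighbour idea concrete, here is a minimal sketch of an MI estimator of the Kraskov–Stögbauer–Grassberger (KSG) type (first variant). The brute-force O(N²) distance computation, the function name, and the default k=3 are illustrative choices for this sketch, not a description of the actual MILCA implementation.

```python
import numpy as np
from scipy.special import digamma

def ksg_mi(x, y, k=3):
    """k-nearest-neighbour MI estimate (KSG, first variant), in nats.

    Brute-force O(N^2) distances under the max-norm; fine for a
    sketch, too slow for large N (real code would use a kd-tree).
    """
    x = np.asarray(x, float).ravel()
    y = np.asarray(y, float).ravel()
    n = len(x)
    dx = np.abs(x[:, None] - x[None, :])        # pairwise distances in x
    dy = np.abs(y[:, None] - y[None, :])        # pairwise distances in y
    dz = np.maximum(dx, dy)                     # max-norm in the joint space
    eps = np.sort(dz, axis=1)[:, k]             # distance to the k-th neighbour
    nx = np.sum(dx < eps[:, None], axis=1) - 1  # x-points strictly inside eps_i
    ny = np.sum(dy < eps[:, None], axis=1) - 1  # (the "-1" drops the point itself)
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))
```

For jointly Gaussian x and y with correlation ρ the exact MI is −(1/2) ln(1 − ρ²), and an estimator of this kind reproduces that value closely already for a few thousand samples.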
On the one hand, this seems to lead to better blind source separation than with any other presently available algorithm. On the other hand, it has the advantage, compared to other implementations of ICA, some of which are based on crude approximations for MI, that the numerical values of the MI can be used for (i) estimating residual dependencies between the output components; (ii) estimating the reliability of the output, by comparing the pairwise MIs with those of re-mixed components; (iii) clustering the output according to the residual interdependencies; and (iv) a unified treatment of the case when the samples from which MI has to be estimated are themselves correlated (in applications to time series analysis, this means that the time sequences are not i.i.d.; in image analysis, it corresponds to spatial correlations). We call this set of routines "Mutual Information based Least dependent Component Analysis" (MILCA). After several tests with artificial data, I will apply MILCA to the ECG of a pregnant woman and to superpositions of chemical infrared spectra.
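The re-mixing test of point (ii) can be illustrated with a toy experiment: rotate a pair of output channels by various angles and check that their MI is smallest at zero angle, i.e. that any re-mixing increases the dependence. To keep the sketch self-contained it uses a crude histogram (plug-in) MI estimator, precisely the kind of approximation MILCA itself avoids in favour of the k-nearest-neighbour estimator; the function names, bin count, and angle grid are illustrative.

```python
import numpy as np

def hist_mi(x, y, bins=16):
    # crude plug-in MI from a 2-D histogram (in nats); a stand-in
    # chosen only to keep this sketch self-contained
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y (row vector)
    nz = pxy > 0                          # avoid log(0) on empty cells
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def remix_curve(s1, s2, angles):
    # MI of the channel pair after re-mixing it by each rotation angle;
    # for a reliable separation the minimum should sit at angle = 0
    out = []
    for a in angles:
        c, s = np.cos(a), np.sin(a)
        out.append(hist_mi(c * s1 + s * s2, -s * s1 + c * s2))
    return np.array(out)
```

Applied to two genuinely independent channels, the curve has its minimum at zero angle and rises on both sides; a flat curve would signal that the corresponding pair of components was not reliably separated.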