An interesting
approach for splitting a data set into two subsets is the application of Kohonen
neural networks [26],[27],[50].
These two-layer networks are trained without supervision and can be used
as a 2-dimensional mapping method. For the repartitioning, a Kohonen network is
trained using the complete data set. Then, for each neuron, a specific number
of the samples that excited this neuron during training are selected for the first
data set. The remaining samples are used for the second data set. This approach
distributes the samples very efficiently into subsets that cover
the complete variable space.
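The partitioning described above can be illustrated with a minimal sketch in Python using NumPy. It is not the implementation used in the cited works: the function names (train_som, winning_neurons, kohonen_split), the 4 x 4 map, the learning-rate and neighbourhood schedules, and the random choice of samples per neuron are all assumptions made for illustration. As a simplification, each sample is assigned to its winning neuron after training has finished, rather than being tracked during training.

```python
import numpy as np


def train_som(X, grid=(4, 4), epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small Kohonen map; returns one weight vector per neuron."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Position of every neuron on the 2-dimensional map.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Assumption: weights are initialised from randomly chosen samples.
    weights = X[rng.choice(len(X), size=rows * cols, replace=True)].astype(float)
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood radius
            # Winning neuron: smallest squared distance to the sample.
            bmu = np.argmin(((weights - X[i]) ** 2).sum(axis=1))
            # Gaussian neighbourhood around the winner on the map.
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (X[i] - weights)
            step += 1
    return weights


def winning_neurons(X, weights):
    """Index of the neuron excited by each sample (after training)."""
    d2 = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kohonen_split(X, per_neuron=2, grid=(4, 4), seed=0):
    """Return index arrays (first, second) for the two subsets."""
    weights = train_som(X, grid=grid, seed=seed)
    bmu = winning_neurons(X, weights)
    rng = np.random.default_rng(seed)
    first = []
    for neuron in np.unique(bmu):
        members = np.flatnonzero(bmu == neuron)
        take = min(per_neuron, len(members))
        # Assumed selection rule: a random draw among the neuron's samples.
        first.extend(rng.choice(members, size=take, replace=False))
    first = np.sort(np.array(first, dtype=int))
    second = np.setdiff1d(np.arange(len(X)), first)
    return first, second


# Usage on synthetic data (60 samples, 3 variables).
X = np.random.default_rng(1).normal(size=(60, 3))
first, second = kohonen_split(X, per_neuron=2)
```

The random draw inside each neuron is only one of many conceivable selection rules; other rules, such as taking the samples closest to the neuron's weight vector, would fit the same skeleton.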
Yet,
using Kohonen networks for several subsampling runs is difficult: each run
needs a different rule for selecting among the samples that excite a neuron,
and defining such rules for an arbitrary number of runs is rather subjective
and requires user input that varies from data set to data set.