How to estimate the prediction accuracy

The prediction accuracy of each parameter set is estimated within the learn data sample, using either N-fold cross-validation or N-fold repetition.
 
· N-fold cross-validation  
The learn data sample is divided into N sub-samples of approximately the same size. Each sub-sample is set aside once as the tuning test data sample. The data tuples of the other N-1 sub-samples form the tuning learn data sample, which is used to learn a model; the prediction accuracy of this model is then determined on the corresponding tuning test data sample. A special case is the so-called leave-one-out cross-validation (LOOCV), for which N is set to the number of learn data tuples, so that each tuning test data sample consists of a single tuple.  
 
· N-fold repetition  
The learn data sample is randomly divided N times into a tuning learn data sample and a tuning test data sample of pre-specified sizes. In each repetition, a model is learned using the tuning learn data sample and its prediction accuracy is evaluated on the tuning test data sample.  
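
The two splitting schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the software itself runs: the function names `n_fold_cv_splits` and `n_fold_repetition_splits`, the `seed` parameter, and the use of tuple indices instead of the tuples themselves are all hypothetical choices made for this example.

```python
import random


def n_fold_cv_splits(n_tuples, n_folds, seed=0):
    """Return N (tuning_learn, tuning_test) index pairs for N-fold cross-validation.

    Hypothetical helper for illustration; n_folds == n_tuples gives LOOCV.
    """
    if not 2 <= n_folds <= n_tuples:
        raise ValueError("n_folds must be between 2 and n_tuples")
    indices = list(range(n_tuples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        # Every fold except fold i contributes to the tuning learn data sample.
        learn = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((learn, test))
    return splits


def n_fold_repetition_splits(n_tuples, n_repeats, test_size, seed=0):
    """Return N independent random (tuning_learn, tuning_test) index pairs.

    Hypothetical helper for illustration; test_size is the pre-specified
    number of tuples in each tuning test data sample.
    """
    if not 0 < test_size < n_tuples:
        raise ValueError("test_size must be between 1 and n_tuples - 1")
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        indices = list(range(n_tuples))
        rng.shuffle(indices)
        # The first test_size shuffled indices form the tuning test data sample.
        splits.append((indices[test_size:], indices[:test_size]))
    return splits
```

In the cross-validation sketch every tuple is used for testing exactly once, whereas in the repetition sketch the tuning test data samples are drawn independently, so a tuple may appear in several of them or in none.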


Figure: Illustration of the data splitting for the N-fold repetition


The mean of the single prediction accuracies is the resulting prediction accuracy.
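
As a rough illustration of this averaging step, the following Python sketch learns a model on each tuning learn data sample, measures its accuracy on the matching tuning test data sample, and averages the results. The majority-label learner, the function names, the example labels, and the hand-written splits are all assumptions made for this example; the learner is merely a stand-in for whatever model is actually being tuned.

```python
from collections import Counter
from statistics import mean


def learn_majority_model(labels):
    """Hypothetical stand-in learner: always predict the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def estimate_accuracy(labels, splits):
    """Return (mean accuracy, list of single accuracies) over all splits."""
    single = []
    for learn_idx, test_idx in splits:
        # Learn on the tuning learn data sample only.
        predicted = learn_majority_model([labels[i] for i in learn_idx])
        # Accuracy = share of correctly predicted tuning test tuples.
        hits = sum(labels[i] == predicted for i in test_idx)
        single.append(hits / len(test_idx))
    return mean(single), single


# Six labelled tuples and a hand-written 3-fold split (illustrative data).
labels = ["a", "a", "a", "b", "a", "a"]
splits = [
    ([2, 3, 4, 5], [0, 1]),
    ([0, 1, 4, 5], [2, 3]),
    ([0, 1, 2, 3], [4, 5]),
]
overall, single = estimate_accuracy(labels, splits)
print(single)   # [1.0, 0.5, 1.0]
print(overall)  # mean of the three values above, about 0.83
```

Note that a plain mean weights every split equally, regardless of how many tuples its tuning test data sample contains.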

