Find the word definition

Wiktionary
cross-validation

n. (context statistics English) Any technique or instance of assessing how the results of a statistical analysis will generalize to an independent dataset.

Wikipedia
Cross-validation (statistics)

Cross-validation, sometimes called rotation estimation, is a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. In a prediction problem, a model is usually given a dataset of known data on which training is run (training dataset), and a dataset of unknown data (or first seen data) against which the model is tested (testing dataset). The goal of cross validation is to define a dataset to "test" the model in the training phase (i.e., the validation dataset), in order to limit problems like overfitting, give an insight on how the model will generalize to an independent dataset (i.e., an unknown dataset, for instance from a real problem), etc.

One round of cross-validation involves partitioning a sample of data into complementary subsets, performing the analysis on one subset (called the training set), and validating the analysis on the other subset (called the validation set or testing set). To reduce variability, multiple rounds of cross-validation are performed using different partitions, and the validation results are averaged over the rounds.

One of the main reasons for using cross-validation instead of using the conventional validation (e.g. partitioning the data set into two sets of 70% for training and 30% for test) is that there is not enough data available to partition it into separate training and test sets without losing significant modelling or testing capability. In these cases, a fair way to properly estimate model prediction performance is to use cross-validation as a powerful general technique.

In summary, cross-validation combines (averages) measures of fit (prediction error) to derive a more accurate estimate of model prediction performance.

Cross-validation

Cross-validation could refer to:

  • Cross-validation (statistics), a technique for estimating the performance of a predictive model;
  • Cross-validation (analytical chemistry), the practice of confirming an experimental finding by repeating the experiment using an independent assay technique;
Cross-validation (analytical chemistry)

Cross-validation is an approach by which the sets of scientific data generated using two or more methods are critically assessed. The cross-validation can be categorized as

  • method validation
  • analytical data validation