Definition Training dataset
In the contexts of machine learning and regression analysis, it is a common best practice to split the dataset on which the the model is calculated into three portions: training dataset, validation dataset, and test dataset. The training dataset is the portion of the dataset on which the model is actually calculated. Using the training dataset as a point of departure, the validation dataset is used to evaluate quality and to fine-tune/adjust the hyperparameters. The final evaluation is made using the test dataset; if the model can predict the values of the test dataset with enough precision, then the model is considered to be reliable.
Please note that the definitions in our statistics encyclopedia are simplified explanations of terms. Our goal is to make the definitions accessible for a broad audience; thus it is possible that some definitions do not adhere entirely to scientific standards.
- t-test
- Training dataset
- Trend
- Time series analysis
- Time series
- Test dataset