Learning curves
These are plots of the model’s performance on the training set and the validation set as a function of the polynomial degree.
To generate the plots, follow these steps.
Step 1: Train polynomial models of degrees \(0, 1, \ldots, k\).
Step 2: Calculate the training and validation root mean square error (RMSE) for each model.
Step 3: Plot RMSE against degree.
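The three steps can be sketched roughly as follows. This is a minimal illustration rather than the notes' own implementation: it assumes NumPy's `np.polyfit` for the least-squares fit, and the helper names (`rmse`, `rmse_by_degree`, `plot_rmse_by_degree`) and the noisy sine data are hypothetical choices made for the example.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rmse_by_degree(x_train, y_train, x_val, y_val, max_degree):
    """Fit one polynomial per degree 0..max_degree; return (train, val) RMSE lists."""
    train_errors, val_errors = [], []
    for degree in range(max_degree + 1):
        # Step 1: least-squares polynomial fit of the given degree.
        coeffs = np.polyfit(x_train, y_train, deg=degree)
        # Step 2: RMSE on the training and validation sets.
        train_errors.append(rmse(y_train, np.polyval(coeffs, x_train)))
        val_errors.append(rmse(y_val, np.polyval(coeffs, x_val)))
    return train_errors, val_errors


def plot_rmse_by_degree(train_errors, val_errors):
    """Step 3: plot RMSE against degree (matplotlib is imported lazily)."""
    import matplotlib.pyplot as plt

    degrees = range(len(train_errors))
    plt.plot(degrees, train_errors, marker="o", label="training")
    plt.plot(degrees, val_errors, marker="o", label="validation")
    plt.xlabel("degree")
    plt.ylabel("RMSE")
    plt.legend()
    plt.show()


# Hypothetical data (an assumption for this sketch): noisy samples of sin(2*pi*x).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=x_train.size)
x_val = rng.uniform(0.0, 1.0, size=100)
y_val = np.sin(2 * np.pi * x_val) + rng.normal(scale=0.2, size=x_val.size)

train_errors, val_errors = rmse_by_degree(x_train, y_train, x_val, y_val, max_degree=9)
```

Calling `plot_rmse_by_degree(train_errors, val_errors)` then draws both curves on one set of axes, which makes it easy to see where they start to diverge.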
Inference:
- The training and validation errors stay close up to a certain degree.
- Beyond that point, the training error continues to decrease while the validation error increases.
- The validation RMSE increases sharply for degrees \(\ge 7\). This is a signature of overfitting.
Issues with polynomial regression
- Higher-order polynomial models are very flexible; in other words, they have higher capacity than lower-order models.
- Hence they are more prone to overfitting than lower-degree polynomials.
- An overfitted model fits the training data almost perfectly but predicts poorly on the validation data.
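To make the notion of capacity concrete, the sketch below fits a line and a degree-5 polynomial to the same six training points. With six coefficients, the degree-5 model can pass through all six points, so its training RMSE is essentially zero even though the data are noisy. The data and the helper `train_rmse` are hypothetical and chosen only for illustration; `np.polyfit` is assumed as the fitting routine.

```python
import numpy as np

# Hypothetical data (an assumption for this sketch): six noisy points around y = 2x + 1.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 6)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)


def train_rmse(degree):
    """Training RMSE of a least-squares polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, deg=degree)
    return float(np.sqrt(np.mean((y - np.polyval(coeffs, x)) ** 2)))


low_capacity = train_rmse(1)   # a line cannot follow the noise in every point
high_capacity = train_rmse(5)  # six coefficients interpolate all six points

print(f"degree 1 training RMSE: {low_capacity:.4f}")
print(f"degree 5 training RMSE: {high_capacity:.2e}")
```

A near-zero training error here says nothing about validation performance: the degree-5 curve has fitted the noise as well as the underlying trend.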