Learning curves

These are plots of the model’s performance on the training set and the validation set as a function of the polynomial degree.

To generate the plots, follow these steps.

Step 1: Train polynomial models of degrees \(0, 1, \ldots, k\).

Step 2: For each model, calculate the training and validation root mean square error (RMSE).

Step 3: Plot the training and validation RMSE against the degree.
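
The three steps above can be sketched as follows. This is a minimal illustration, not the original course code: it assumes NumPy (and optionally matplotlib for the plot), and the synthetic sine data, the 20/20 split, the maximum degree of 10, and the helper names `rmse`, `degree_vs_rmse` and `plot_degree_vs_rmse` are all hypothetical choices made for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def degree_vs_rmse(x_train, y_train, x_val, y_val, max_degree):
    """Steps 1 and 2: fit degrees 0..max_degree and record both RMSEs."""
    degrees = list(range(max_degree + 1))
    train_rmse, val_rmse = [], []
    for degree in degrees:
        # Least-squares polynomial fit on the training set only.
        model = Polynomial.fit(x_train, y_train, deg=degree)
        train_rmse.append(rmse(y_train, model(x_train)))
        val_rmse.append(rmse(y_val, model(x_val)))
    return degrees, train_rmse, val_rmse


def plot_degree_vs_rmse(degrees, train_rmse, val_rmse):
    """Step 3: plot both error curves against the degree (needs matplotlib)."""
    import matplotlib.pyplot as plt

    plt.plot(degrees, train_rmse, marker="o", label="training RMSE")
    plt.plot(degrees, val_rmse, marker="o", label="validation RMSE")
    plt.xlabel("degree")
    plt.ylabel("RMSE")
    plt.legend()
    plt.show()


# Hypothetical synthetic data: a noisy sine curve, split into two halves.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=40)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, size=x.size)
x_train, y_train = x[:20], y[:20]
x_val, y_val = x[20:], y[20:]

degrees, train_rmse, val_rmse = degree_vs_rmse(
    x_train, y_train, x_val, y_val, max_degree=10
)
for degree, tr, va in zip(degrees, train_rmse, val_rmse):
    print(f"degree {degree:2d}: train RMSE {tr:.3f}, validation RMSE {va:.3f}")
```

Calling `plot_degree_vs_rmse(degrees, train_rmse, val_rmse)` draws the two curves. The training RMSE cannot increase with the degree, because each model contains all lower-degree models as special cases; the degree at which the validation curve turns upward depends on the data and the noise level.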

Inference:

  1. The training and validation errors are close up to a certain degree.

  2. Beyond that point, the training error continues to decrease, while the validation error keeps increasing.

  3. The validation RMSE increases sharply for degrees \(\ge 7\). This is a signature of overfitting.

Issues with polynomial regression

  1. Higher order polynomial models are very flexible, or in other words, they have higher capacity compared to lower order models.

  2. Hence they are more prone to overfitting than lower-degree polynomials.

  3. An overfit model can fit the training data almost perfectly and still predict poorly on validation data.
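
The last point can be seen in a small experiment. The sketch below assumes NumPy, and its data (a noisy straight line sampled at 8 training points) and variable names are hypothetical. A degree-7 polynomial has 8 coefficients, so it can pass through all 8 training points with distinct x-values, which drives the training RMSE to essentially zero regardless of the noise.

```python
import numpy as np
from numpy.polynomial import Polynomial


def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


rng = np.random.default_rng(1)


def noisy_line(x):
    # Hypothetical ground truth: a straight line plus Gaussian noise.
    return 2.0 * x + rng.normal(0.0, 0.3, size=x.size)


# 8 training points and 8 separate validation points.
x_train = np.linspace(-1.0, 1.0, 8)
y_train = noisy_line(x_train)
x_val = np.linspace(-0.95, 0.95, 8)
y_val = noisy_line(x_val)

# Low-capacity model (degree 1) vs. high-capacity model (degree 7).
low = Polynomial.fit(x_train, y_train, deg=1)
high = Polynomial.fit(x_train, y_train, deg=7)

low_train_rmse = rmse(y_train, low(x_train))
high_train_rmse = rmse(y_train, high(x_train))
low_val_rmse = rmse(y_val, low(x_val))
high_val_rmse = rmse(y_val, high(x_val))

print(f"degree 1: train RMSE {low_train_rmse:.3g}, validation RMSE {low_val_rmse:.3g}")
print(f"degree 7: train RMSE {high_train_rmse:.3g}, validation RMSE {high_val_rmse:.3g}")
```

The degree-7 model reproduces the training targets, noise included, so its training error says nothing about how well it generalizes; the validation RMSE is the figure to compare between the two models.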