Generating Polynomial Transformations

Question

What are the steps for generating a polynomial transformation of degree \(M\)?

1.   Generate all combinations (with replacement) of the input features of lengths \(0, 1, \ldots, M\).

2.   Multiply the features within each combination to obtain the new features.

  • For a single feature \(x_1\), \(\phi_m(x_1) = \left[1, x_1, x_1^2, \ldots, x_1^m \right]\):

  • Generate combinations of \(\{1, x_1, (x_1, x_1), (x_1, x_1, x_1), \ldots, (\underbrace{x_1, x_1, \ldots, x_1}_{m \text{ times}})\}\)

    • 0-th degree: 1 (bias)

    • 1-st degree: \(x_1\)

    • 2-nd degree: \((x_1, x_1)\)

    • 3-rd degree: \((x_1, x_1, x_1)\)

    • m-th degree: \((\underbrace{x_1, x_1, \ldots, x_1}_{m \text{ times}})\)

  • Taking the product of elements in combination: 

    • \(\phi_m(x_1) = \{1, x_1, (x_1 \cdot x_1), (x_1 \cdot x_1 \cdot x_1), \ldots, \prod_{i=1}^{m} x_1\}\)

    \(\phi_m(x_1) = \{1, x_1, x_1^2, x_1^3, \ldots, x_1^m\}\)

  • For two features, say \((x_1, x_2)\), obtain \(\phi_2(x_1, x_2)\):

  • Generate combinations of \(\{1, x_1, x_2, (x_1, x_1), (x_1, x_2), (x_2, x_2)\}\)

    • 0-th degree: 1 (bias)

    • 1-st degree: \(x_1, x_2\)

    • 2-nd degree: \((x_1, x_1), (x_1, x_2), (x_2, x_2)\)

  • Taking the product of elements in combination:

    • \(\phi_2(x_1, x_2) = \{1, x_1, x_2, (x_1 \cdot x_1), (x_1 \cdot x_2), (x_2 \cdot x_2)\}\)

    \(\phi_2(x_1, x_2) = \{1, x_1, x_2, x_1^2, x_1 x_2, x_2^2 \}\)
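The two steps above can be sketched symbolically in plain Python. This is a minimal illustration, not the way any particular library implements it: the function name `polynomial_terms` and the string format of the terms are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def polynomial_terms(feature_names, degree):
    """Return the monomials of total degree 0..degree as readable strings.

    Hypothetical helper for illustration only.
    """
    terms = []
    for length in range(degree + 1):
        # Step 1: all combinations (with replacement) of this length.
        for combo in combinations_with_replacement(feature_names, length):
            # Step 2: take the product of the combination; repeated
            # features collapse into exponents, e.g. (x1, x1) -> x1^2.
            powers = Counter(combo)
            factors = [
                name if power == 1 else f"{name}^{power}"
                for name, power in powers.items()
            ]
            # The empty combination (length 0) is the bias term 1.
            terms.append(" ".join(factors) if factors else "1")
    return terms


print(polynomial_terms(["x1"], 3))
print(polynomial_terms(["x1", "x2"], 2))
```

The first call lists the terms of \(\phi_3(x_1)\) and the second those of \(\phi_2(x_1, x_2)\), with terms grouped by degree.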

For two features \(x_1\) and \(x_2\), PolynomialFeatures with degree=3 would add not only the features \(x_1^2\), \(x_1^3\), \(x_2^2\), and \(x_2^3\), but also the combinations \(x_1x_2\), \(x_1^2x_2\), and \(x_1x_2^2\).
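To make the degree-3 case concrete, the full feature set for two features can be written out, grouped by degree. The counting formula below is a standard combinatorial fact (combinations with replacement) and is added here as a check rather than taken from the notes above.

\[
\phi_3(x_1, x_2) = \{\,1,\; x_1,\, x_2,\; x_1^2,\, x_1 x_2,\, x_2^2,\; x_1^3,\, x_1^2 x_2,\, x_1 x_2^2,\, x_2^3 \,\}
\]

\[
\text{number of features} = \binom{n + M}{M}, \qquad n = 2,\; M = 3 \;\Rightarrow\; \binom{5}{3} = 10
\]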

Examples

For input feature vector \(x\), let us compute polynomial features \(\phi_2\) with degree = 2.

  1. \(x = \begin{bmatrix} 2 \end{bmatrix}\) \(\rightarrow \phi_2= \begin{bmatrix} 1 & 2 & 4 \end{bmatrix}\)

  2. \(x = \begin{bmatrix} 2 & 3 \\ 5 & 6 \end{bmatrix}\) \(\rightarrow \phi_2= \begin{bmatrix} 1 & 2 & 3 & 4 & 6 & 9 \\ 1 & 5 & 6 & 25 & 30 & 36 \end{bmatrix}\)
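The two examples above can be checked numerically with a small sketch in plain Python. The helper name `polynomial_features` is hypothetical and the row-wise list representation is an assumption for illustration; in practice a library transformer such as scikit-learn's PolynomialFeatures would normally be used instead.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(rows, degree):
    """Map each row of raw features to its polynomial features up to `degree`.

    Hypothetical helper for illustration only.
    """
    transformed = []
    for row in rows:
        features = []
        for length in range(degree + 1):
            # Combinations are taken over positions in the row, so rows
            # with repeated values are still handled correctly.
            for combo in combinations_with_replacement(row, length):
                # prod(()) is 1, which yields the bias term.
                features.append(prod(combo))
        transformed.append(features)
    return transformed


print(polynomial_features([[2]], 2))
print(polynomial_features([[2, 3], [5, 6]], 2))
```

Each output row lists the bias, then the degree-1 terms, then the degree-2 terms, which is the same ordering used in the example matrices.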

Now, let us fit polynomial regression models of degrees 2 to 9 to the data that we had initially.

What did you infer from the above plots?