Matrices
Import
Matrix as a NumPy array
Everything in NumPy is an array. A matrix is also an array. Let us create a simple matrix:
\[ \textbf{M} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix} \]
In NumPy:
Adding two matrices
Let us now add the following matrices:
\[ \textbf{A} = \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix}, \textbf{B} = \begin{bmatrix} 5 & 6\\ 7 & 8 \end{bmatrix} \]
then,
\[ \textbf{C} = \textbf{A} + \textbf{B} = \begin{bmatrix} 6 & 8\\ 10 & 12 \end{bmatrix} \]
In NumPy:
Scaling a matrix
Scaling a matrix is nothing but element-wise multiplication:
\[ \textbf{M} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix} \]
then,
\[ 3 \textbf{M} = \begin{bmatrix} 3 & 6 & 9\\ 12 & 15 & 18\\ 21 & 24 & 27 \end{bmatrix} \]
In NumPy:
Element-wise multiplication of matrices
Consider two matrices:
\[ \textbf{A} = \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix}, \textbf{B} = \begin{bmatrix} 5 & 6\\ 7 & 8 \end{bmatrix} \]
The element-wise product is given by \(\textbf{A} \odot \textbf{B}\):
\[ \textbf{C} = \textbf{A} \odot \textbf{B} = \begin{bmatrix} 5 & 12\\ 21 & 32 \end{bmatrix} \]
In NumPy:
Element-wise functions of matrices
Given a matrix, we sometimes would want to apply a function to every element of the matrix. We will consider two examples.
Example-1
For example, we may want to take the absolute value of all the elements. Let us say \(f(x) = |x|\), then:
\[ \mathbf{A} = \begin{bmatrix} -1 & 2\\ -3 & -4 \end{bmatrix} \]
then:
\[ \begin{bmatrix} f(-1) & f(2)\\ f(-3) & f(-4) \end{bmatrix} = \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix} \]
In NumPy, this becomes:
Example-2
We might want to square each element of the matrix. If \(\textbf{A}\) is a matrix, then \(\textbf{B}\) could be defined element-wise as follows:
\[ B_{ij} = A_{ij}^2 \]
Let us compute \(\mathbf{B}\) for the following matrix:
\[ \mathbf{A} = \begin{bmatrix} 1 & \sqrt{2}\\ \sqrt{3} & 2 \end{bmatrix} \]
In NumPy:
Transpose of a matrix
Given a matrix \(\textbf{M}\):
\[ \textbf{M} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{bmatrix} \]
then, its transpose \(\textbf{M}^{T}\) is:
\[ \textbf{M}^{T} = \begin{bmatrix} 1 & 4\\ 2 & 5\\ 3 & 6 \end{bmatrix} \]
In NumPy:
Alternatively:
Equivalently, we have:
This is what we will be sticking to given that it is very close to the corresponding math notation.
Shape and dimension of a matrix
Matrices are “two dimensional” arrays. So all matrices in NumPy have array-dimension equal to two. The shape of the NumPy array gives what we usually call the dimension of the matrix in the linear algebra sense. Let us explore these two ideas for:
\[ \mathbf{M} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ \end{bmatrix} \]
In NumPy:
M is a two dimensional NumPy array of shape \((2, 3)\) which represents a matrix \(\mathbf{M} \in \mathbb{R}^{2 \times 3}\).
Vectors as matrices
Each vector can be viewed as a matrix. Column vectors are matrices of shape \((d, 1)\). Row vectors are matrices of shape \((1, d)\). Let us look at how NumPy treats both these cases:
To create a row or column vector from a 1D NumPy array, we can use np.newaxis. Focus on the syntax for now; the reason we are doing this will become clear when we discuss indexing and slicing.
Alternatively, we can also use a method called reshape:
reshape is a powerful method and will come in handy in various applications.
Products involving matrices and vectors
We will look at the following products: - matrix - matrix - matrix - vector - vector - matrix - vector - vector
Product of two matrices
Given two matrices compatible for multiplication:
\[ \textbf{A} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{bmatrix}, \textbf{B} = \begin{bmatrix} 6 & 7\\ 8 & 9\\ 10 & 11 \end{bmatrix} \]
then,
\[ \textbf{C} = \textbf{A} \times \textbf{B} = \begin{bmatrix} 52 & 58\\ 124 & 139 \end{bmatrix} \]
In NumPy:
Product of a matrix and a (column) vector
Given the matrix \(\mathbf{A}\) and the vector \(\mathbf{x}\):
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix}, \mathbf{x} = \begin{bmatrix} 6\\ 7\\ 8 \end{bmatrix} \]
The product \(\mathbf{Ax}\) is given by:
\[ \mathbf{C} = \mathbf{A x} = \begin{bmatrix} 44\\ 107\\ 170 \end{bmatrix} \]
In NumPy:
Product of a (row) vector and a matrix
Given the matrix \(\mathbf{A}\) and the vector \(\mathbf{x}\):
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix}, \mathbf{x} = \begin{bmatrix} 6\\ 7\\ 8 \end{bmatrix} \]
The product \(\mathbf{x}^T \mathbf{A}\) is given by:
\[ \mathbf{C} = \mathbf{x}^T \mathbf{A} = \begin{bmatrix} 90 & 111 & 132 \end{bmatrix} \]
In NumPy:
(Inner) Product of a (row) vector and a (column) vector
The product of a row vector and a column vector is nothing but the usual dot product:
\[ \mathbf{x}^T = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} 4\\ 5\\ 6 \end{bmatrix} \]
The product \(\mathbf{x}^T \mathbf{y}\) is then:
\[ \mathbf{x}^T \mathbf{y} = 32 \]
In NumPy:
(Outer) Product of a (column) vector and a (row) vector
The product of a column vector and a row vector is an outer product:
\[ \mathbf{x} = \begin{bmatrix} 1\\ 2\\ 3 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} 4 & 5 & 6 \end{bmatrix} \]
The product \(\mathbf{x} \mathbf{y}^T\) is then:
\[ \mathbf{x} \mathbf{y}^T = \begin{bmatrix} 4 & 5 & 6\\ 8 & 10 & 12\\ 12 & 15 & 18 \end{bmatrix} \]
In NumPy:
array([[ 4, 5, 6],
[ 8, 10, 12],
[12, 15, 18]])
Equivalently:
array([[ 4, 5, 6],
[ 8, 10, 12],
[12, 15, 18]])
Notice the amazing power of the @ operator and the seemingly magical way in which NumPy combines the left and right operands. Let us see how NumPy achieves this in the particular case of a row-vector matrix product. Consider multiplying \(\begin{bmatrix}1 & 2\end{bmatrix}\) and \(\begin{bmatrix}3 & 4\\5 & 6\end{bmatrix}\).
When we execute x @ M, NumPy does the following: - Convert the 1D array x into a 2D array by adding an extra dimension at the beginning so that the result is a row vector whose shape is \((1, 2)\). - Now that x and M are compatible for matrix multiplication because their shapes are \((1, 2)\) and \((2, 2)\) respectively, matrix multiplication happens. - The resulting output of this operation has shape \((1, 2)\). Since this is effectively a vector, NumPy drops the extra dimension and outputs a 1D array of shape \((2, )\).
np.matmul gives the same result. But we will stick to the @ operator.
Special matrices
Let us now look at some special matrices that can be generated using NumPy.
Matrix of zeros
In many algorithms, we might have to initialize a matrix with zeros. For example, consider a \(2 \times 4\) matrix:
\[ \mathbf{M} = \begin{bmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix} \]
In NumPy:
Matrix of ones
Similar to a matrix of zeros, we can come up with a matrix of ones.
\[ \mathbf{M} = \begin{bmatrix} 1 & 1\\ 1 & 1\\ 1 & 1 \end{bmatrix} \]
In NumPy:
We have to be careful about the argument. It has to be a tuple (or some sequence, say a list). See what happens if pass it as two separate arguments: np.ones(3, 2),
Identity matrix
Often, we might have to deal with identity matrices. A \(3 \times 3\) identity matrix is as follows:
\[ \mathbf{I} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} \]
In NumPy:
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
A scalar matrix can be obtained by multiplying the identity matrix by the appropriate scalar.
Diagonal matrices
Another special kind of matrix. Let us create the following matrix:
\[ \mathbf{D} = \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 2 & 0 & 0\\ 0 & 0 & 3 & 0\\ 0 & 0 & 0 & 4 \end{bmatrix} \]
In NumPy:
Indexing and Slicing
Just like lists in Python, NumPy arrays can be indexed and sliced. Slicing is useful if we want to work with a portion of an array. We will look at some examples.
Example-1: Indexing
Consider:
\[ M = \begin{bmatrix} 1 & 3 & 0\\ 4 & 2 & \color{red}{5}\\ 9 & 8 & 7 \end{bmatrix} \]
To access the element \(5\), we can do the following:
Alternatively, we can also use:
Note that NumPy uses zero-indexing, just like Python. In general, M[row, col] is the element in the row row and columnn col assuming zero indexing.
Example-2: Row-slice
We will extract the third row of the matrix \(\mathbf{M}\):
\[ \mathbf{M} = \begin{bmatrix} 1 & 2\\ 3 & 4\\ 5 & 6\\ 7 & 8\\ 9 & 10 \end{bmatrix} \]
In NumPy:
The syntax is to be understood as follows. To extract the third row, we fix the row index to be \(2\) (zero-indexing) and get hold of all elements in that row. To do this, we use a : in the place of the column index so that it allows all elements to come in.
Example-3: Column slice
Let us now extract the second column of the following matrix:
\[ \mathbf{M} = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix} \]
In NumPy:
Here, we want the second column. So we fix the column index to \(1\) and set the row index to :, indicating that all elements in that column should be included.
Example-4: Submatrix slice
Now, we want to extract the \(2 \times 2\) submatrix colored in blue from \(M\):
\[ \mathbf{M} = \begin{bmatrix} 1 & 2 & 3 & 4\\ 5 & 6 & \color{blue}7 & \color{blue}8\\ 9 & 10 & \color{blue}{11} & \color{blue}{12}\\ 13 & 14 & 15 & 16 \end{bmatrix} \]
In NumPy:
array([[ 7, 8],
[11, 12]])
The start-point of the slice is included and the end-point is excluded, just as we do in native Python.
Other Matrix Operations
There are several important matrix operations that we will list down here. The np.linalg module helps us perform these operations effortlessly.
Rank
We can get the rank of a matrix as follows. For example:
\[ \mathbf{M} = \begin{bmatrix} 1 & 2 & 3 & 4\\ 5 & 6 & 7 & 8\\ 9 & 10 & 11 & 12\\ 13 & 14 & 15 & 16 \end{bmatrix} \]
Determinant
We can compute the determinant of a square matrix. For example:
\[ \mathbf{M} = \begin{bmatrix} 1 & 2 & -1\\ 0 & 1 & -1\\ 1 & 0 & 2 \end{bmatrix} \]
Inverse
We can get the inverse as follows. For example:
\[ M = \begin{bmatrix} 3 & -4\\ 4 & 3 \end{bmatrix} \]
Pseuodoinverse
The Moore Penrose pseudoinverse \(\mathbf{M}^{\dagger}\) of a matrix \(\mathbf{M}\) can be computed:
\[ \mathbf{M} = \begin{bmatrix} 1 & 2 & 3\\ 3 & 6 & 9 \end{bmatrix} \]
In NumPy:
array([[0.00714286, 0.02142857],
[0.01428571, 0.04285714],
[0.02142857, 0.06428571]])
Recall that every matrix has a pseudoinverse. The pseudoinverse reduces to the inverse for square invertible matrices.
System of Linear Equations
We can solve a system of linear equations of the form \(\mathbf{A} \boldsymbol{\theta} = \mathbf{b}\). Consider the following example:
\[ \begin{aligned} 2x + 3y - 4z &= -4\\ x + y - z &= 0\\ 2x + y + z &= 7 \end{aligned} \]
where \(\boldsymbol{\theta} = (x, y, z)\) and \(\mathbf{A}\) is the matrix of coefficients. The following method in NumPy returns the exact solution if the matrix \(\mathbf{A}\) is invertible.
A = np.array([
[2, 3, -4],
[1, 1, -1],
[2, 1, 1]
])
b = np.array([-4, 0, 7])
theta = np.linalg.solve(A, b)
thetaarray([1., 2., 3.])
If \(\mathbf{A}\) is not a square matrix or if it is singular, then the method will return an error.
Least Squares Solution
In general, \(\mathbf{A} \boldsymbol{\theta} = \mathbf{b}\) may not be solvable, or even if it is, it may not have a unique solution. In such situations, we look for the least square solution. Consider:
\[ \begin{aligned} 2x + 3y - 4z &= -4\\ x + y - z &= 0\\ 2x + 2y - 2z &= 0.1 \end{aligned} \]
where \(\boldsymbol{\theta} = (x, y, z)\) and \(\mathbf{A}\) is the matrix of coefficients. The following method in NumPy returns the least squares solution of minimum norm.
A = np.array([
[2, 3, -4],
[1, 1, -1],
[2, 2, -2]
])
b = np.array([-4, 0, 0.1])
theta = np.linalg.lstsq(A, b)[0]
thetaarray([2.07333333, 0.01333333, 2.04666667])
np.linalg.lstsq returns three objects, the first of which is the least square solution. To see how close this solution is to the vector \(\mathbf{b}\), let us compute \(\mathbf{A} \boldsymbol{\theta}\):
Notice that this is quite close to the vector \(\mathbf{b} = (-4, 0, 0.1)\), which is what we expect of a least squares solution.
Eigenvalues and Eigenvectors (symmetric)
Given a symmetric matrix \(\mathbf{M}\) let us find its eigenvalues and eigenvectors:
\[ \mathbf{M} = \begin{bmatrix} 1 & 0 & -3\\ 0 & 5 & 2\\ -3 & 2 & 8 \end{bmatrix} \]
array([-0.20942046, 4.36588492, 9.84353554])
The h in eigh referes to Hermitian. Recall that Hermitian matrices are the complex analogue of symmetric (real) matrices. The eigenvalues of Hermitian matrices are real and NumPy outputs them in ascending order. The corresponding eigenvectors appear in the columns of the matriix eigvec. The eigenpairs are given below:
Eigenvalues and Eigenvectors (general)
For a square matrix \(\mathbf{M}\), not necessarily square, let us find its eigenvalues and eigenvectors. Consider:
\[ \mathbf{M} = \begin{bmatrix} 1 & 2\\ -1 & 3 \end{bmatrix} \]
[2.+1.j 2.-1.j]
[[0.81649658+0.j 0.81649658-0.j ]
[0.40824829+0.40824829j 0.40824829-0.40824829j]]
Note that the eigenvalues could be complex, which is the case here. This means that the eigenvalues returned by NumPy may not have any ordering.
SVD
The singular value decomposition of a matrix \(\mathbf{M}\) is of the form:
\[ \mathbf{M} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^{T} \]
Consider an example:
\[ \mathbf{M} = \begin{bmatrix} 1 & 3 & 1\\ 2 & 4 & 2 \end{bmatrix} \]
The dimensions of the matrices should turn out to be:
- \(\mathbf{U} \rightarrow (2, 2)\)
- \(\boldsymbol{\Sigma} \rightarrow (2, 3)\)
- \(\mathbf{V} \rightarrow (3, 3)\)
M = np.array([
[1, 3, 1],
[2, 4, 2]
])
U, Sigma, Vt = np.linalg.svd(M)
print(U.shape)
print(Sigma.shape)
print(Vt.shape)(2, 2)
(2,)
(3, 3)
The SVD method in NumPy returns three arrays:
U, a 2D array, whose columns are left singular vectorsSigma, the singular values as a 1D arrayVt, a 2D array, whose rows are the right singular vectors
We can easily recover the diagonal-ish \(\boldsymbol{\Sigma}\) matrix from this. In contrast to the eigenvalues returned by the eigh method which are in ascending order, the singular values are sorted in descending order, as is the convention in the linear algebra literature.
Recall that \(\mathbf{U}\) and \(\mathbf{V}\) are orthogonal matrices. That is, \(\mathbf{U U}^{T} = \mathbf{U}^{T} \mathbf{U} = \mathbf{I}\). A similar result holds for \(\mathbf{V}\). Let us quickly see if this is the case:
array([[ 1.00000000e+00, 2.97804097e-17, 1.56730402e-17],
[ 2.97804097e-17, 1.00000000e+00, -2.93975883e-17],
[ 1.56730402e-17, -2.93975883e-17, 1.00000000e+00]])
In upcoming notebooks, we will see how we can compare two matrices element by element. For now, it suffices to note that the non-diagonal elements are vanishingly small and are as good as zero. Let us now verify the SVD. Since we have only two singular values, it suffices to retain the first two rows of \(\mathbf{V}^{T}\).
This is exactly equal to \(\mathbf{M}\).