Linear Transformation and Determinants are the two concepts one needs to be clear about before attempting to learn what Eigenvectors and Eigenvalues are. Once familiar with them, the core concept is pretty simple to grasp.

Assuming you are familiar with Linear Transformation and Determinants, let's jump into the core concept of Eigenvectors and Eigenvalues. Consider Matrix A and how it transforms a set of vectors V

$$\begin{array}{l} A = \begin{bmatrix}{} 3 & 1 \\ 0 & 2 \end{bmatrix} \quad V = \begin{bmatrix}{} 1 & -1 \\ 1 & 1 \end{bmatrix} \\ \\ \therefore \quad A \times V = \begin{bmatrix}{} 4 & -2 \\ 2 & 2 \end{bmatrix} \end{array}$$

If we plot this transformation on a graph, it looks something like this

We can see that vector A1 has transformed into vector A2 along the same span - span A. But in the case of vector B1, it has transformed into vector B2 which has been pushed away from its original span - span B.

So for Matrix A, vector A1 (and every other vector on span A, including A2) is an **Eigenvector**. Also, when A1 transforms into A2, it scales by a factor of 2. This scaling factor is the **Eigenvalue** that goes with that eigenvector.

To summarise -

Any vectors that remain on their original span after the transformation are the Eigenvectors, and the factors by which they scale are the Eigenvalues of that matrix.
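As a quick numerical check of this idea, the sketch below multiplies Matrix A by each column of V and tests whether the result still lies on the original span. The helper names `mat_vec` and `on_same_span` are made up for this illustration and do not come from any library.

```python
# Matrix A and the two column vectors of V from the example above.
A = [[3, 1],
     [0, 2]]
vectors = [(1, 1), (-1, 1)]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def on_same_span(v, w):
    """Two 2D vectors share a span when their 2D cross product is zero."""
    return v[0] * w[1] - v[1] * w[0] == 0


for v in vectors:
    w = mat_vec(A, v)
    print(v, "->", w, "same span:", on_same_span(v, w))
```

Only (-1, 1) stays on its span, and it comes out as (-2, 2), which is the original vector scaled by 2.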

But since a matrix transforms an entire vector space and there could be an infinite number of vectors in that space, it implies that any vectors that coincide with the spans of these vectors are also the eigenvectors of Matrix A - as illustrated below.

When we plot the vectors of Matrix A (green vectors) on the graph, we observe that the first vector, with coordinates (3, 0), would also remain on the same span after the transformation, which is the X-axis. Hence it is also one of the eigenvectors of the Matrix.

Therefore, any vectors along span A and the X-axis, vectors C and D for example, are also the eigenvectors of Matrix A.

To put things into perspective, we can take a quick overview of one of the use cases of eigenvectors and eigenvalues without dwelling on too many details.

Imagine a cube in space, formed by a set of 3 vectors, which is rotated (transformed) by a 3D matrix by a certain degree.

If you were to compute the eigenvector of that 3D matrix, what you would end up with is the axis of rotation, since the rotation axis always remains on its span no matter the transformation. Also, rotation means there is no expansion or contraction of any kind and hence the eigenvalue, in this case, is 1.

To define the core concept symbolically, let us assume *A* is the transformation matrix and *v* is the eigenvector of that transformation. Since the transformation does not change the span of the eigenvector, it is the same as multiplying the eigenvector by some scalar quantity. This scalar quantity is the eigenvalue and it is represented by λ. So the equation becomes

$$A\vec{v} = \lambda \vec{v}$$

But fundamentally, we are equating a Matrix-vector multiplication with a scalar-vector product, which needs to be fixed.

To fix that, we can multiply with an Identity Matrix. This converts the RHS of the equation into a Matrix-vector product as shown here.

$$\begin{array}{} A\vec{v} = (\lambda I) \vec{v} \\ \text{where I is an identity matrix} \end{array}$$

Since we are solving for λ and *v*, we can subtract the RHS from the LHS and equate it to a zero vector, and then factor out *v*.

$$\begin{array}{} A\vec{v} - (\lambda I) \vec{v} = \vec{0} \\ (A - \lambda I) \vec{v} = \vec{0} \\ \text{where } \vec{v} \text{ is a non-zero vector} \end{array}$$

The vector *v* needs to be non-zero; otherwise the equation is trivially satisfied for every value of λ and tells us nothing. So the only other way to satisfy this expression is when the matrix (A - λI) sends the non-zero vector *v* to the zero vector. And as we have learned from the Determinants article, this is only possible if that matrix transforms the vector space into a lower dimension, i.e. the determinant of (A - λI) should be 0

$$\therefore |A - \lambda I| = 0$$

For the above expression, λ is the only unknown. This means that we need to find a value for λ such that when λI is subtracted from A, the determinant of the resultant matrix is zero. This is best understood with an example.

Find the eigenvectors and eigenvalues for Matrix A

$$A = \begin{bmatrix}{} 3 & 1 \\ 0 & 2 \end{bmatrix}$$

To find the eigenvalues, we know that the determinant of (A - λI) = 0

$$\begin{array}{l} |A - \lambda I| = 0 \\ \\ \begin{vmatrix}{} 3 - \lambda & 1 \\ \\ 0 & 2 - \lambda \end{vmatrix} = 0 \\ \\ (3 - \lambda)(2 - \lambda) - 1 \times 0 = 0 \\ \\ 6 - 3 \lambda - 2 \lambda + \lambda^2 = 0 \\ \\ \therefore \quad \lambda^2 - 5\lambda + 6 = 0 \end{array}$$

The quadratic expression on the left-hand side of the final equation above is known as the characteristic polynomial of Matrix A.

To find the possible values of λ, we know that the solution of a quadratic equation is given by...

$$\begin{array}{l} x = {-b \pm \sqrt{b^2-4ac} \over 2a} \\ \\ \therefore \quad \lambda = {5 \pm \sqrt{5^2-4 \cdot 1 \cdot 6} \over 2 \cdot 1} \\ \\ \qquad = 6/2 \text{ or } 4/2 \\ \\ \qquad = 3 \text{ or } 2 \end{array}$$
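The same calculation can be sketched in a few lines of Python. Expanding the 2x2 determinant |A - λI| always gives λ² - (a + d)λ + (ad - bc), so the snippet plugs those coefficients into the quadratic formula. The variable names are our own choice for this illustration.

```python
import math

A = [[3, 1],
     [0, 2]]

# For a 2x2 matrix, |A - λI| expands to λ² - (a + d)λ + (ad - bc).
trace = A[0][0] + A[1][1]                      # a + d = 5
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # ad - bc = 6

# Quadratic formula with a = 1, b = -trace, c = det.
disc = trace ** 2 - 4 * det
eigenvalues = sorted(((trace + s * math.sqrt(disc)) / 2 for s in (1, -1)),
                     reverse=True)
print(eigenvalues)  # [3.0, 2.0]
```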

Let's substitute λ = 2 in (A - λI)*v* = 0. We get

$$\begin{array}{l} \begin{bmatrix}{} 3-2 & 1 \\ 0 & 2-2 \end{bmatrix} \begin{bmatrix}{} x \\ y \end{bmatrix} = \begin{bmatrix}{} 0 \\ 0 \end{bmatrix} \\ \\ \begin{bmatrix}{} 1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix}{} x \\ y \end{bmatrix} = \begin{bmatrix}{} 0 \\ 0 \end{bmatrix} \\ \\ \therefore x + y = 0 \\ \\ \therefore x = -y \end{array}$$

If we assume an arbitrary value of x, say x = 1, then the vector with coordinates (1, -1) is an eigenvector of Matrix A and, as we know, any vectors along the span of that eigenvector are also eigenvectors of the matrix. So x = -y is the equation of the span along which all the vectors are solutions for Matrix A.

And of course, λ = 2 is the eigenvalue.
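To convince ourselves, here is a small sketch that picks a few vectors on the line x = -y and checks that multiplying by A gives the same result as multiplying by 2. The sample x values are arbitrary and `mat_vec` is a helper written just for this check.

```python
A = [[3, 1],
     [0, 2]]
eigenvalue = 2


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


# Every vector on the line x = -y should satisfy A v = 2 v.
for x in (1, -3, 0.5):
    v = (x, -x)
    lhs = mat_vec(A, v)                              # A v
    rhs = (eigenvalue * v[0], eigenvalue * v[1])     # 2 v
    print(v, lhs, rhs, lhs == rhs)
```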

What do you think happens when we substitute λ = 3 in (A - λI)*v* = 0? Is the solution also an eigenvector? Try it out and leave your answers in the comments.

These are some special cases that are good to have at the back of your mind as they frequently appear in real-world analytics.

A shear matrix is a special matrix that only transforms either one of the two base axes (X or Y). In this case, we are going to consider a shear Matrix S - that transforms all vectors except the ones on the X-axis.

$$S = \begin{bmatrix}{} 1 & 1 \\ 0 & 1 \end{bmatrix}$$

Diagrammatically, it looks something like this

To find the eigenvectors and eigenvalues, we first need to find the value of λ

$$\begin{array}{l} |S - \lambda I| = 0 \\ \\ \begin{vmatrix}{} 1 - \lambda & 1 \\ \\ 0 & 1 - \lambda \end{vmatrix} = 0 \\ \\ (1 - \lambda)(1 - \lambda) - 1 \times 0 = 0 \\ \\ (1 - \lambda)^2 = 0 \\ \\ \lambda^2 - 2\lambda + 1 = 0 \\ \\ \therefore \quad \lambda = 1 \\ \text{which is the only solution for this equation} \end{array}$$

Since the eigenvalue is 1 and we know that this shear matrix transforms all the vectors except the ones on the X-axis, we can say that all the vectors on the X-axis are the only eigenvectors for this matrix.
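A short sketch makes the same point numerically: vectors on the X-axis come out of the shear unchanged, while a vector off the X-axis gets pushed sideways. The sample vectors are arbitrary picks for illustration.

```python
S = [[1, 1],
     [0, 1]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


# Vectors on the X-axis are left untouched (eigenvalue 1) ...
print(mat_vec(S, (3, 0)))   # (3, 0)
print(mat_vec(S, (-2, 0)))  # (-2, 0)

# ... while a vector off the X-axis is pushed sideways, off its span.
print(mat_vec(S, (0, 2)))   # (2, 2)
```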

A linear transformation may not have any real eigenvectors and consequently no real eigenvalues. One such example is a matrix that only rotates the vector space by some degree. Say, for example, the following Matrix R

$$R = \begin{bmatrix}{} 0 & -1 \\ 1 & 0 \end{bmatrix}$$

It would be easier to understand why it cannot have any eigenvectors if we plot the transformation on a graph.

We can see that the orange vector leaves the X-axis and aligns with the Y-axis while the purple vector rotates off the Y-axis and aligns with the X-axis in the second quadrant.

So all that Matrix R really does is rotate the vector space by 90 degrees counter-clockwise. This means that any and all the vectors in the space also rotate off their spans. If no vectors remain on their span after the transformation, then there can be no real eigenvectors or eigenvalues for this matrix.

We can also prove this by attempting to calculate the eigenvalues for Matrix R.

$$\begin{array}{l} |R - \lambda I| = 0 \\ \\ \begin{vmatrix}{} 0 - \lambda & -1 \\ \\ 1 & 0 - \lambda \end{vmatrix} = 0 \\ \\ (0 - \lambda)(0 - \lambda) - (-1)(1) = 0 \\ \\ \lambda^2 + 1 = 0 \\ \\ \lambda = \pm\sqrt{-1} \\ \\ \therefore \quad \lambda = i \quad \text{or} \quad \lambda = -i \\ \text{where } i \text{ is the imaginary unit} \end{array}$$

Since the eigenvalues are imaginary numbers, Matrix R cannot have any real eigenvectors.
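The same conclusion shows up if we let Python do the arithmetic. The sketch below computes the discriminant of λ² - (a + d)λ + (ad - bc) for Matrix R and then uses the standard library's `cmath.sqrt`, which accepts negative inputs, to recover the complex roots. The names `lam1` and `lam2` are our own.

```python
import cmath

R = [[0, -1],
     [1, 0]]

trace = R[0][0] + R[1][1]                      # 0
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]    # 0*0 - (-1)(1) = 1

# Discriminant of λ² - trace·λ + det = 0.
disc = trace ** 2 - 4 * det
print(disc)  # -4, negative, so there are no real roots

# cmath.sqrt returns a complex number for a negative input.
root = cmath.sqrt(disc)
lam1 = (trace + root) / 2
lam2 = (trace - root) / 2
print(lam1, lam2)
```

A negative discriminant is therefore a quick test for a 2D matrix with no real eigenvectors.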

Although this is a shortcut to calculating eigenvalues, it only applies to 2D matrices. And anyway, if we are to find eigenvectors for 3D or higher dimensional matrices, we are better off computing them using a computer.

The trick to finding possible eigenvalues of a 2D matrix is

$$\lambda_1, \lambda_2 = m \pm \sqrt{m^2 - p}$$

Here, *m* is the mean of the two main diagonal elements, and *p* is the determinant.

Let's try out an example. Find the eigenvalues for Matrix M

$$\begin{array}{l} M = \begin{bmatrix} 2 & 7 \\ 1 & 8 \end{bmatrix} \\ \\ m = {2 + 8 \over 2} = 5 \\ p = (2 \times 8) - (7 \times 1) = 9 \\ \\ \therefore \quad \lambda_1, \lambda_2 = 5 \pm \sqrt{5^2 - 9} = 5 \pm \sqrt{16} \\ \therefore \quad \lambda_1, \lambda_2 = 9, 1 \end{array}$$
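The shortcut translates almost word for word into code. This is a minimal sketch that assumes the eigenvalues are real; the function name `eigenvalues_2x2` is made up for this example.

```python
import math


def eigenvalues_2x2(matrix):
    """Eigenvalues of a 2x2 matrix via the mean/product shortcut."""
    (a, b), (c, d) = matrix
    m = (a + d) / 2              # mean of the main diagonal
    p = a * d - b * c            # determinant
    root = math.sqrt(m * m - p)  # raises ValueError if the roots are complex
    return m + root, m - root


print(eigenvalues_2x2([[2, 7], [1, 8]]))  # (9.0, 1.0)
print(eigenvalues_2x2([[3, 1], [0, 2]]))  # (3.0, 2.0)
```

The second call reuses Matrix A from the worked example earlier and returns the same 3 and 2 we found by hand.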

Wasn't that quick?

Thanks for reading! If you think this article has helped you in any way or form, you can show your appreciation by reacting to this post and if you have any feedback, please leave them in the comments below.

Thanks again!

So assuming that you are familiar with the concept of Linear Transformation, consider an arbitrary Matrix A that transforms the vector space as shown here

We observe that the vectors in the new vector space seem to be stretched out. Similarly, another matrix may transform the vector space into a smaller size, or in other words, squish them. In either case, there seems to be a change taking place with these vectors and the vector space itself. But exactly how much does it change?

All changes taking place in the world of mathematics originate from somewhere. There has to be something that "*determines*" this change.

Let us take an example of Matrix B and apply it to a vector with coordinates (0, 2). We shall also add a unit square somewhere on the graph and observe how it reshapes as the vector space around it warps.

$$B = \left[\begin{array}{rr} 1 & 1 \\ 0 & 2 \end{array}\right] \qquad B \times \left[\begin{array}{r} 0 \\ 2 \end{array}\right] = \left[\begin{array}{r} 2 \\ 4 \end{array}\right]$$

We can see that the original vector (0, 2) has stretched to a new vector (2, 4) and, consequently, the unit area (purple square) has stretched into a new area (orange parallelogram). The area of the square is 1 x 1 = 1 square unit and the area of the parallelogram is 1 x 2 = 2 square units (base x height). This shows that the area of the square has doubled after the transformation, so the determinant of Matrix B is 2.

Hence, the factor by which the linear transformation scales a unit area on the vector space is known as the determinant of that transformation.
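We can check this area scaling without drawing anything. The sketch below transforms the four corners of a unit square with Matrix B and measures both areas with the shoelace formula. The shoelace formula is not used elsewhere in this article; it is brought in here only as a convenient way to measure a polygon from its corners.

```python
B = [[1, 1],
     [0, 2]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def polygon_area(points):
    """Signed area of a polygon via the shoelace formula."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2


# Corners of a unit square, listed counter-clockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
warped = [mat_vec(B, p) for p in square]

print(polygon_area(square))  # 1.0
print(polygon_area(warped))  # 2.0
```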

Consider an arbitrary Matrix A

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

For the sake of calculations, we are going to assume its rows represent two vectors, (a, b) and (c, d), in the first quadrant as shown below

We know that any area on that graph reshapes along with the vectors. So what could be the best choice to fit an area such that when it changes, we can easily compute the factor by which it increases or decreases?

Since we know how the vectors behave, it would make sense to have an area attached to them. So when the vectors reshape so does the area with them.

We add the vectors together to close the loop and the diagram evolves into a parallelogram. Their intersection point is the addition of their corresponding x and y values i.e (a+c, b+d)

Now that we have a shape whose area can be determined, we project the coordinates of the vertices on the X and Y axes

We can observe that the whole figure resides in a rectangle formed by four right-angled triangles and two small rectangles and the parallelogram (shaded region). To find its area we need to take away the areas of the unshaded regions.

$$\begin{array}{l} & \text{Let Area of parallelogram} = X \\ & \therefore X = (a+c)(b+d)-2bc-2 \times \frac{1}{2}ab - 2 \times \frac{1}{2}cd \\ & \quad\quad = ab + ad + cb +cd -2bc - ab - cd \\ & \quad\quad = ad - bc \end{array}$$

This area is nothing but the determinant of Matrix A, which is represented as shown below

$$|A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$

If we take a closer look, we see that the final outcome consists only of the original elements of Matrix A: the product of the top-left and bottom-right elements minus the product of the top-right and bottom-left elements. Quite simple, isn't it?

As an example, we can verify that Matrix B from the first section doubles the area by calculating its determinant

$$\begin{array}{} B = \left[\begin{array}{r} 1 & 1 \\ 0 & 2 \end{array}\right] \\ \therefore|B| = \begin{vmatrix}{} 1 & 1 \\ 0 & 2 \end{vmatrix} = (2 \times 1) - (1 \times 0) = 2 \end{array}$$

Since the determinant of B is 2, it doubles any area projected on the vector space.

Calculate the determinant of the following Matrix and think about how it may affect a unit area on a vector space. You can leave your answers in the comments.

$$\begin{bmatrix}{} 1 & 1 \\ 0 & 0.5 \end{bmatrix}$$

Since Matrix A was 2D we were able to plot the vectors in just X and Y axes but what if a matrix is in 3D? Consider the following Matrix B

$$\begin{array}{l} & B = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \\ \end{bmatrix} \end{array}$$

If we plot this 3D matrix on the three X, Y, and Z axes then we end up with a wonky cuboid shape called a *parallelepiped*, which looks something like this

And the determinant of Matrix B will be the volume of this parallelepiped. Now we can sit and work out the volume of this shape as we did for the parallelogram but someone has already done that for us and for the sake of time we are directly gonna apply it as shown here

$$\begin{array}{l} |B| = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \\ \end{vmatrix} \\ \\ \quad = a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix} \\ \\ \quad = a(ei-fh) -b(di-gf) +c(dh-eg) \end{array}$$

It may seem complex but there is a pattern here if you haven't noticed already. We pick an element from the top row and eliminate the others in the same row or column of the selected element, and then multiply it with the determinant of the remaining elements. And just be mindful of the alternating - and + signs between the elements.
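The pattern described above can be sketched as a short function. The names `det2` and `det3` are ours, and the two test matrices are simple cases picked so the answers are easy to check by eye: the identity matrix leaves volumes alone, and a diagonal matrix with 2, 3 and 4 stretches a unit cube into a 2 x 3 x 4 box.

```python
def det2(m):
    """Determinant of a 2x2 matrix: ad - bc."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by expanding along the top row."""
    total = 0
    for col in range(3):
        # Remove the top row and the selected column.
        minor = [[row[c] for c in range(3) if c != col] for row in m[1:]]
        sign = -1 if col % 2 else 1      # + - + pattern
        total += sign * m[0][col] * det2(minor)
    return total


print(det3([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # 1
print(det3([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))  # 24
```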

Using what we just learned, let us try a few examples.

$$\begin{array}{l} A = \left[\begin{array}{r} 2 & 1 \\ 1 & 2 \end{array}\right] \therefore |A| = \left|\begin{array}{r} 2 & 1 \\ 1 & 2 \end{array}\right| = (2 \times 2) - (1 \times 1) = 3 \\ \\ B = \left[\begin{array}{r} 1 & 2 \\ 2 & 1 \end{array}\right] \therefore |B| = \left|\begin{array}{r} 1 & 2 \\ 2 & 1 \end{array}\right| = (1 \times 1) - (2 \times 2) = -3 \end{array}$$

Both of these matrices scale the area in the space by a factor of 3. But then what does the negative sign indicate? To understand this we shall transform a set of vectors using each of the above matrices separately and observe the difference.

When Matrix A transforms the two vectors with coordinates (0, 2) and (2, 0) we can see that after the transformation, the purple vector remains closer to the Y axis and the orange vector remains closer to the X axis

On the other hand, when Matrix B transforms the same set of vectors we observe that the orange vector has shifted closer to the Y axis and the purple vector has shifted closer to the X axis. They seem to have swapped places: the purple vector is now below the orange vector, whereas with Matrix A the purple vector remained above the orange vector even after the transformation.

Notice that in both cases, the resulting vectors still end up on the same two points of (2, 4) and (4, 2). This tells us that the area scaled by both matrices must be the same in terms of magnitude, which is 3.

Hence, the negative signs indicate the flipping of not just vectors but the entire vector space! It is just like flipping a page over.

So as not to clutter the space too much, I purposely left out the matrix multiplication before the graphs. The set of vectors that Matrix A and Matrix B transformed are given below as a single matrix. You may want to try multiplying this matrix with A and B, and see if you get the same results as shown in the graph

$$V = \begin{bmatrix}{} 0 & 2 \\ 2 & 0 \end{bmatrix}$$
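If you would rather let the computer do the multiplication, here is a small sketch. It multiplies A and B by V and prints both determinants; `mat_mul` and `det2` are helper names invented for this example.

```python
def det2(m):
    """Determinant of a 2x2 matrix: ad - bc."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def mat_mul(m, n):
    """Multiply two 2x2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[2, 1], [1, 2]]
B = [[1, 2], [2, 1]]
V = [[0, 2], [2, 0]]     # columns are the vectors (0, 2) and (2, 0)

print(mat_mul(A, V))     # [[2, 4], [4, 2]]
print(mat_mul(B, V))     # [[4, 2], [2, 4]]
print(det2(A), det2(B))  # 3 -3
```

Reading the results column by column, both products contain the points (2, 4) and (4, 2), but Matrix B delivers them in swapped order. That swap is the flip that the negative sign records.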

A matrix can have a determinant = 0. And by definition, this means that any area on the vector space must scale to 0. But how does it look graphically and what does it signify? Let's have a look.

Consider Matrix C with determinant = 0

$$\begin{array}{l} C = \begin{bmatrix}{} 2 & 1 \\ 2 & 1 \end{bmatrix} \\ \\ |C| = \begin{vmatrix}{} 2 & 1 \\ 2 & 1 \end{vmatrix} = (2 \times 1) - (1 \times 2) = 0 \end{array}$$

Observe how it transforms the same set of vectors as before

$$\begin{array}{l} \begin{bmatrix}{} 2 & 1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix}{} 0 & 2 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix}{} 2 & 4 \\ 2 & 4 \end{bmatrix} \end{array}$$

We can see that the vectors align on top of each other which leaves no scope for an area. This means that any area before the transformation would collapse onto a single line after the transformation.

This is significant as it tells us that whenever the determinant of a matrix is zero, the transformation of that matrix reduces the vector space to a lower dimension.
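A small experiment shows the collapse in numbers. The sketch below feeds Matrix C a handful of vectors pointing in different directions (the last two are arbitrary extras, not from the graph) and prints where each one lands.

```python
C = [[2, 1],
     [2, 1]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


print(C[0][0] * C[1][1] - C[0][1] * C[1][0])  # 0

# Every input lands on the line y = x, whatever direction it started in.
for v in [(0, 2), (2, 0), (1, -3), (5, 7)]:
    print(v, "->", mat_vec(C, v))
```

Every output has equal x and y coordinates, so the whole 2D plane has been squashed onto a single line.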

If a 3D Matrix has a determinant = 0, then what does it mean? Does it reduce the vector space to two dimensions? One dimension? Or something else? Leave your answers in the comments.

The last two sections - Negative Determinant and Zero Determinant - are significant in the world of matrices as they provide critical information on how they transform the vector space around them. They play an important role for Eigenvectors and Eigenvalues.

Thanks for reading! You can read the next article in the series which explains what is Matrix Inversion.

If this has helped you in any form you can show your appreciation by reacting to this post. If you have any feedback please feel free to leave them in the comments below.

Thanks again!

Transformation is another word for function. A function takes in some input, *transforms* it, and spits out an output. This function or transformation is represented by a matrix. To better understand it, consider the following matrix.

$$A = \left[\begin{array}{} 2 & 0 \\ 2 & 1 \\ \end{array} \right]$$

This matrix represents a certain transformation, to see it in action we can multiply it with an arbitrary vector with coordinates (2, 1)

When we multiply, we get a new vector with coordinates (4, 5) as shown here

$$\left[\begin{array}{} 2 & 0 \\ 2 & 1 \\ \end{array} \right] \left[\begin{array}{} 2 \\ 1 \\ \end{array} \right] = \left[\begin{array}{} 4 \\ 5 \\ \end{array} \right]$$

And when we plot this on the graph we can see how Matrix A "*transformed*" the vector (2, 1) into another vector (4, 5)

This is why it represents a transformation. But what is so "linear" about it?

For a transformation to be linear it needs to maintain the following two properties after the transformation.

All lines must remain straight

The origin must not move

Considering the example above, we can say that Matrix A can transform any vector in the 2D space into another vector. In other words, the Matrix transforms the entire plane (aka vector space) into a different one while preserving its structure.

A shear matrix is a special matrix that only transforms either one of the two base axes (either X or Y)

Consider a shear matrix transforming a vector with coordinates (0, 2). Numerically we can show the transformation like

$$\left[\begin{array}{} 1 & 1 \\ 0 & 1 \\ \end{array} \right] \times\left[\begin{array}{} 0 \\ 2 \\ \end{array} \right] = \left[\begin{array}{} 2 \\ 2 \\ \end{array} \right]$$

But when plotted on a graph, it becomes apparent how the transformation reshapes the vector space around it.

You can see in the figure above how a shear matrix transforms a vector (purple arrow) but notice that the transformation also applies to the entire vector space around it (orange arrow and lines).

All the [orange] lines are straight after the transformation

The origin has not moved

Therefore we can say that the shear matrix has "Linearly Transformed" the vector space.

But it isn't always so convenient to imagine/draw a vector space transforming into a new shape. Instead, we focus on the input vectors and just plot how they reshape while keeping the base grid as it is. This helps us focus on how the vectors are affected and not worry about how the entire vector space transforms.

A more generalised way to measure the effect is to apply the linear transformation only to a set of unit vectors - *i* and *j* with coordinates (1, 0) and (0, 1) respectively. And the whole grid of the vector space would basically follow the same path
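Here is a minimal sketch of that idea, reusing Matrix A and the vector (2, 1) from the start of this article. It shows that the unit vectors land on the columns of the matrix, and that the image of any other vector can be rebuilt from those two landing spots. The names `i_hat` and `j_hat` are our own labels for the transformed unit vectors.

```python
A = [[2, 0],
     [2, 1]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


i_hat = mat_vec(A, (1, 0))   # lands on the first column of A
j_hat = mat_vec(A, (0, 1))   # lands on the second column of A
print(i_hat, j_hat)          # (2, 2) (0, 1)

# The vector (2, 1) is 2*i + 1*j, so its image is 2*A(i) + 1*A(j).
x, y = 2, 1
combined = (x * i_hat[0] + y * j_hat[0], x * i_hat[1] + y * j_hat[1])
print(combined, mat_vec(A, (x, y)))  # (4, 5) (4, 5)
```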

Thank you so much for reading! You can read the next article in the series which explains what are Determinants

If this has helped you in any form, you can show your appreciation by reacting to this article and if you have any questions or feedback please leave them in the comments below

Thanks again!

To develop an intuition for Matrix Inversion it is important to be familiar with the concepts of Linear Transformation and Matrix Determinants. Please read these articles if you need a refresher.

The Inverse of a matrix depends on a very important condition: the linear transformation in question should not transform the space into a lower dimension. As we learnt in the Matrix Determinants article, matrices that do so have a determinant of zero, and such matrices cannot be inverted.

Let's begin our understanding with the simple case where the determinant of the Matrix is non-zero. Consider an arbitrary Matrix A which transforms the purple vectors into the orange vectors

Then an inverse of Matrix A will restore the original position of the vectors, transforming orange vectors back into purple vectors.

$$\text{The inverse of Matrix A is denoted by } A^{-1}$$

Sure, it's easy to imagine them going back to their original position. But how does that work? What is the inverse of A?

Well, the inverse of A is a separate matrix in itself and like any other matrix, is a linear transformation that does its job of transforming the vector space around it.

The unique property of this matrix is that it is responsible for undoing what Matrix A performs. This is represented theoretically in the form of matrix multiplication. And as we know matrix multiplication is not commutative so we need to be careful of the order in which we express the inverse of A.

To represent the transformations in the above image, we need to check the order in which they were applied. In a matrix product, the transformations are applied from right to left, therefore we write the first Matrix to the right and the second Matrix to the left as shown below

$$A^{-1} \times A$$

And since the vectors end up back where they started, the resultant of this multiplication should also have no effect. In the world of Matrices, we know that this is the identity matrix*.* Therefore...

$$A^{-1} \times A = I$$

There is an alternate reasoning that ends up with the same conclusion as above but it begins with something we already know. You can skip to the "How to" section if you want.

$$\begin{array}{l} 3 \times 2 = 6 \\ 6 \div 2 = 3 \\ \text{which can also be written as} \\ 6 \times \frac{1}{2} = 3 \\ \text{where } \frac{1}{2} \text{ is the inverse of 2} \end{array}$$

$$\begin{array}{l} 3 + 2 = 5 \\ 5 - 2 = 3 \\ \text{which can also be written as} \\ 5 + (-2) = 3 \\ \text{where -2 is the inverse of 2} \end{array}$$

But why is 1/2 the inverse of 2 in the case of multiplication?

And why is -2 the inverse of 2 in the case of addition?

What is the reason that makes it obvious?

Inverse means to be the opposite or contrary in nature to something else. But if they are opposites, upon combining they should nullify each other. In other words, nothing should change or the state should remain *IDENTICAL*.

When you multiply 2 by 1/2 it results in 1, which means nothing has changed. Hence, the **Multiplicative identity is 1.**

When you add 2 and -2 it results in 0, which also means nothing has changed. Hence, the **Additive identity is 0.**

The same applies to matrices but in the world of matrices, the identity is not a scalar, it is an **Identity Matrix!**

Therefore, the product of a Matrix with its Inverse should result in an identity matrix as shown below.

$$\begin{array}{} A \times A^{-1} = I \\ \text{where } A^{-1} \text{ is the inverse of matrix A} \end{array}$$

Append the Matrix in question with its Identity Matrix to create an Augmented Matrix

Perform row operations on it such that the LHS of the Augmented Matrix becomes the Identity Matrix and the resulting RHS will be the Inverse of the original Matrix

(Optional) To verify the solution we can multiply the inverse and the original Matrices and that should give us the Identity Matrix

Note: If you are unfamiliar with row operations or need a quick revision, you can checkout this article

Find the inverse of the following matrix

$$\begin{array}{} A = \begin{bmatrix} 2 & 0 \\ 1 & 4 \end{bmatrix} \end{array}$$

$$\left[\begin{array}{rr|rr} 2 & 0 & 1 & 0 \\ 1 & 4 & 0 & 1 \\ \end{array}\right]$$

$$\left[\begin{array}{rr|rr} 1 & 0 & \frac{1}{2} & 0 \\ 1 & 4 & 0 & 1 \\ \end{array}\right] R_1 = R_1 \div 2$$

$$\left[\begin{array}{rr|rr} 1 & 0 & \frac{1}{2} & 0 \\ 0 & 4 & -\frac{1}{2} & 1 \\ \end{array}\right] R_2 = R_2 - R_1$$

$$\left[\begin{array}{rr|rr} 1 & 0 & \frac{1}{2} & 0 \\ 0 & 1 & -\frac{1}{8} & \frac{1}{4} \\ \end{array}\right] R_2 = R_2 \div 4$$

$$\therefore A^{-1} = \begin{bmatrix} \frac{1}{2} & 0 \\ -\frac{1}{8} & \frac{1}{4} \end{bmatrix}$$

Multiplying Matrix A with its inverse should give us an identity matrix

$$\begin{array}{l} A \times A^{-1} = \begin{bmatrix} 2 & 0 \\ 1 & 4 \end{bmatrix} \times \begin{bmatrix} \frac{1}{2} & 0 \\ -\frac{1}{8} & \frac{1}{4} \end{bmatrix} \\ \quad\quad\quad\quad = \begin{bmatrix} 2 \times \frac{1}{2} + 0 & 0 + 0 \\ 1 \times \frac{1}{2} + 4 \times -\frac{1}{8} & 1 \times 0 + 4 \times \frac{1}{4} \end{bmatrix} \\ \quad\quad\quad\quad = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \end{array}$$

Hence, our solution is verified!
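The augment-and-reduce recipe can also be sketched in code. This is one possible implementation, not the only way to do it: it walks through the columns in order rather than following the exact hand-picked row operations above, and it uses `fractions.Fraction` from the standard library so the results stay exact. The function name `invert` is our own.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by row-reducing the augmented matrix [A | I]."""
    n = len(matrix)
    # Build [A | I] with exact fractions to avoid rounding errors.
    aug = [[Fraction(x) for x in row] +
           [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]

    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular (determinant is zero)")
        aug[col], aug[pivot] = aug[pivot], aug[col]

        # Scale the pivot row so the pivot becomes 1.
        factor = aug[col][col]
        aug[col] = [x / factor for x in aug[col]]

        # Clear the rest of the column.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]

    # The right half now holds the inverse.
    return [row[n:] for row in aug]


inverse = invert([[2, 0], [1, 4]])
print([[str(x) for x in row] for row in inverse])
# [['1/2', '0'], ['-1/8', '1/4']]
```

The same function works for larger square matrices, and it raises an error for a matrix whose determinant is zero, which matches the condition we started this article with.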

The steps remain exactly the same for a 3x3 Matrix with no changes. You can try solving this question yourself and check your solution here.

Find the inverse of...

$$B = \left[\begin{array}{r} 3 & 0 & 2 \\ 2 & 0 & -2 \\ 0 & 1 & 1 \\ \end{array}\right]$$

Augment with Identity Matrix

$$\left[\begin{array}{rrr|rrr} 3 & 0 & 2 & 1 & 0 & 0 \\ 2 & 0 & -2 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \\ \end{array}\right]$$

Perform Row Operations

$$\begin{array}{l} = \left[\begin{array}{rrr|rrr} 3 & 0 & 2 & 1 & 0 & 0 \\ 2 & 0 & -2 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \\ \end{array}\right] \\ \\ = \left[\begin{array}{rrr|rrr} 5 & 0 & 0 & 1 & 1 & 0 \\ 2 & 0 & -2 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \\ \end{array}\right] R_1 = R_1 + R_2 \\ \\ = \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 0.2 & 0.2 & 0 \\ 0 & 0 & -2 & -0.4 & 0.6 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \\ \end{array}\right] \begin{array}{l} R_1 = R_1 \div 5\\ R_2 = R_2 - 2R_1 \end{array} \\ \\ = \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 0.2 & 0.2 & 0 \\ 0 & 0 & 1 & 0.2 & -0.3 & 0 \\ 0 & 1 & 0 & -0.2 & 0.3 & 1 \\ \end{array}\right] \begin{array}{l} R_2 = R_2 \div -2\\ R_3 = R_3 - R_2 \end{array} \\ \\ = \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 0.2 & 0.2 & 0 \\ 0 & 1 & 0 & -0.2 & 0.3 & 1 \\ 0 & 0 & 1 & 0.2 & -0.3 & 0 \\ \end{array}\right] R_2 \leftrightarrow R_3 \end{array}$$

Therefore, the inverse of Matrix B is

$$B^{-1} = \left[ \begin{array}{rrr} 0.2 & 0.2 & 0 \\ -0.2 & 0.3 & 1 \\ 0.2 & -0.3 & 0 \\ \end{array} \right]$$

I'll encourage you to multiply the inverse with B and verify the solution as well

Thanks for reading! You can read the next article in the series which explains the process of Gaussian Reduction

If this article has helped you in any form you can show your appreciation by reacting to this article and if you have any feedback please feel free to leave them in the comments below.

Thanks again!

Consider the following system of two simultaneous equations

$$\begin{array}{r} & 3a + b = 8 \quad...(1) \\ & a - 2b = 5 \quad...(2) \\ \end{array} $$

We can solve it by multiplying equation (1) by 2 and adding it to equation (2) as shown

$$\begin{array}{r} &6a + 2b = 16 \\ + &a-2b = 5 \\ \hline &7a = 21\\ \therefore&a = 3 \end{array}$$

Now that we have the value for "a" we can substitute it in the first equation

$$\begin{array}{r} & 3 \times 3 + b = 8 \\ \therefore& b = -1 \end{array}$$

That was pretty simple and quick but what if we need to solve for 3 simultaneous equations? Sure, we pick two equations, eliminate the common variable, and substitute the solution in the third one to find the answer. While this is doable, **things get more complex as we increase the number of equations/variables to solve**.

Gaussian Reduction is an efficient and less error-prone method to solve a system of linear equations. The goal here is to reach the **Row Echelon** form of the matrix. In this form, all the elements in the lower left triangle of a square matrix are zeros. In other words, **only the elements on the diagonal and in the upper right triangle can be non-zero.**

$$\begin{array} &\text{Row Echelon form} \\ \\ \begin{bmatrix} a & b & c & d \\ 0 & e & f & g \\ 0 & 0 & h & i \\ 0 & 0 & 0 & j \end{bmatrix} \end{array}$$

Before we solve an example, we need to look at the operations that we are allowed to perform on the matrix.

Swapping of rows

Add scalar multiples of one row to another

Multiply rows by non-zero scalars
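The three operations can be sketched as tiny helper functions working on a matrix stored as a list of rows. The function names are made up for this illustration. As a usage example, the snippet replays the elimination from the two-equation system above, with each row written as [coefficients..., constant].

```python
def swap_rows(m, i, j):
    """Swap rows i and j in place."""
    m[i], m[j] = m[j], m[i]


def add_multiple(m, target, source, k):
    """Replace row `target` with row `target` + k * row `source`."""
    m[target] = [t + k * s for t, s in zip(m[target], m[source])]


def scale_row(m, i, k):
    """Multiply row i by a non-zero scalar k."""
    if k == 0:
        raise ValueError("scaling by zero would destroy the equation")
    m[i] = [k * x for x in m[i]]


# The two-equation system above: 3a + b = 8 and a - 2b = 5.
system = [[3, 1, 8],
          [1, -2, 5]]
scale_row(system, 0, 2)          # 6a + 2b = 16
add_multiple(system, 1, 0, 1)    # 7a = 21
print(system)                    # [[6, 2, 16], [7, 0, 21]]
```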

Consider the three equations below and find the solution using Gaussian Reduction

$$\begin{array}{r} & a + b + c = 2 \\ & 2a - b + c = -5 \\ & -a +3b + 3c = 18 \\ \end{array}$$

Now that we have established the goal and the rules, we are equipped to find the solution.

Rewrite the equations in matrix form with the coefficients of all the variables in a 3x3 matrix and the constants in another 3x1 matrix as shown below.

$$\begin{bmatrix} 1 & 1 & 1 \\ 2 & -1 & 1 \\ -1 & 3 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ -5 \\ 18 \end{bmatrix} \begin{array}{r} ...R_1 \\ ...R_2 \\ ...R_3 \end{array}$$

Rows have been labeled *R1, R2* and *R3* for convenience

Remember that we need to reach the row echelon form which means that, in our case, we need to make the first element of R2 and the first two elements of R3 as zeros. Let's start our calculations with that in mind.

We observe that adding R1 to R3 makes the first element of R3 zero and the matrix becomes...

$$= \begin{bmatrix} 1 & 1 & 1 \\ 2 & -1 & 1 \\ 0 & 4 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -5 \\ 20 \end{bmatrix} \begin{array}{l} \quad R_3 = R_3 + R_1 \ \end{array}$$

Note: You are allowed to perform any row operations at any time as you see fit. Here, operating on R3 first is entirely a personal choice

Next, we target the first element of R2 to become zero. To do so, we can multiply the R1 by 2 and subtract it from R2. The matrix then becomes...

$$= \begin{bmatrix} 1 & 1 & 1 \\ 0 & -3 & -1 \\ 0 & 4 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -9 \\ 20 \end{bmatrix} \begin{array}{l} \quad R_2 = R_2 - 2R_1 \ \end{array}$$

The only element that remains is the second element of R3. Now, it may seem tempting to just multiply R1 by 4 and subtract it from R3 *but* by doing so we would also reduce the third element of R3 to zero and bring back a non-zero value in its first position. To arrive at a single solution, the diagonal elements need to stay **non-zero**.

What we can do instead is add R2 with R3 which makes the two 4's as 1 and 3 respectively and then subtract R1 from R3 to make the required element zero.

However, there is a faster way to achieve this by multiplying R3 by 3 and R2 by 4 and then adding them up.

$$= \begin{bmatrix} 1 & 1 & 1 \\ 0 & -3 & -1 \\ 0 & 0 & 8 \end{bmatrix} \begin{bmatrix} 2 \\ -9 \\ 24 \end{bmatrix} \begin{array}{l} \quad R_3 = 3R_3 + 4R_2 \ \end{array}$$

And so we reach our goal of converting the matrix into the row echelon form.
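The three row operations used so far can be replayed in a few lines of Python. This is only a sketch for checking the arithmetic; the helper name `combine` and the list-of-rows layout are choices made for this illustration.

```python
# Augmented matrix [A | b]: each row holds the coefficients followed by the constant.
rows = [
    [1, 1, 1, 2],     # R1
    [2, -1, 1, -5],   # R2
    [-1, 3, 3, 18],   # R3
]

def combine(row_a, scale_a, row_b, scale_b):
    """Return scale_a * row_a + scale_b * row_b, element by element."""
    return [scale_a * x + scale_b * y for x, y in zip(row_a, row_b)]

rows[2] = combine(rows[2], 1, rows[0], 1)    # R3 = R3 + R1
rows[1] = combine(rows[1], 1, rows[0], -2)   # R2 = R2 - 2R1
rows[2] = combine(rows[2], 3, rows[1], 4)    # R3 = 3R3 + 4R2

print(rows)
# [[1, 1, 1, 2], [0, -3, -1, -9], [0, 0, 8, 24]]
```

The printed rows match the row echelon form reached by hand.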

But now what? Where is the answer?

The final step of the process is to reduce the row echelon matrix into an identity matrix. An identity matrix is a square matrix with all diagonal elements as 1 and the remaining elements as zeros. This is also known as the Reduced Row Echelon form.

$$\begin{array}{c} \text{Identity Matrix} \\ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \end{array}$$

Note: The real goal of this method is to reduce the matrix into an identity matrix and the row echelon form is actually an important milestone in the process.

Resuming our journey, we aim to make the diagonal as 1's. Since the first element of R1 is already 1 we can simply divide R2 by -3 and R3 by 8.

$$= \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1/3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \\ 3 \end{bmatrix} \begin{array}{l} \quad R_2 = R_2 / -3 \\ \quad R_3 = R_3 / 8 \\ \end{array}$$

To reduce the upper right triangle to zeros we can do the following steps...

$$= \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1/3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 3 \\ 3 \end{bmatrix} \begin{array}{l} \quad R_1 = R_1 - R_3 \end{array}$$

And then...

$$= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} -3 \\ 2 \\ 3 \end{bmatrix} \begin{array}{l} \quad R_2 = R_2 - R_3/3 \\ \quad R_1 = R_1 - R_2 \end{array}$$

That's it! We have successfully reduced the original matrix into an identity matrix and the answers are in plain sight. We can convert the matrix back into its equation form like...

$$\begin{array}{l} & a + 0 + 0 = -3 \quad\therefore a = -3\\ & 0 + b + 0 = 2 \quad\quad\therefore b = 2\\ & 0 + 0 + c = 3 \quad\quad\therefore c = 3\\ \end{array}$$
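For readers who want to automate the whole procedure, here is a minimal sketch of the reduction to an identity matrix in Python. It assumes a square system with a unique solution, uses `fractions.Fraction` to avoid rounding, and the function name `gauss_jordan` is made up for this example. It does not follow the exact order of row operations chosen by hand above, but it uses only the same three kinds of operations.

```python
from fractions import Fraction

def gauss_jordan(augmented):
    """Reduce an augmented matrix [A | b] to an identity matrix and return the solution.

    Sketch only: assumes a square system with exactly one solution.
    """
    rows = [[Fraction(value) for value in row] for row in augmented]
    n = len(rows)
    for col in range(n):
        # Swap in a row that has a non-zero entry in this column.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Divide the pivot row so that the diagonal element becomes 1.
        scale = rows[col][col]
        rows[col] = [value / scale for value in rows[col]]
        # Subtract a multiple of the pivot row from every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

solution = gauss_jordan([
    [1, 1, 1, 2],
    [2, -1, 1, -5],
    [-1, 3, 3, 18],
])
print([int(value) for value in solution])
# [-3, 2, 3]
```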

It seems daunting at first to solve equations using this method as it involves writing the same matrix over and over again. But as you may have observed we can perform multiple row operations at the same time and reduce the number of substeps, as shown in step 3.

The great thing about a system of linear equations is that we can easily assess our answers by cross-checking our solution with the original equations and if they do not fit, we have made a mistake somewhere.

For a quick check, we can substitute the values in the first equation...

$$\begin{array}{r} & a + b + c = 2 \\ & -3 + 2 + 3 = 2 \\ & 2 = 2 \\ \end{array}$$

...and the math checks out!

This way we can be sure that our solution is correct
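The same cross-check can be run against all three equations at once. The snippet below is a small sketch; the variable names are chosen for this illustration.

```python
# Coefficients (a, b, c) and the constant of each original equation.
equations = [
    ((1, 1, 1), 2),
    ((2, -1, 1), -5),
    ((-1, 3, 3), 18),
]
solution = (-3, 2, 3)  # a, b, c

# Substitute the solution into every equation and compare with the constant.
checks = [
    sum(coef * value for coef, value in zip(coefs, solution)) == constant
    for coefs, constant in equations
]
print(checks)
# [True, True, True]
```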

Thanks for reading! You can read the next article in the series, which explains what Eigenvectors and Eigenvalues are

If this article has helped you, you can express your appreciation by reacting to this article. If you have any questions, comments, or constructive criticism please feel free to leave them in a comment below :)

Thanks again!

`monthrange()` method from the calendar module as shown here. Not an inconvenience really.. 🤷

```python
>>> import calendar
>>> calendar.monthrange(2022, 2)[1]
28
```
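For context, `calendar.monthrange(year, month)` returns a tuple of two values, which is why the example indexes it with `[1]`. A short sketch of both values, with a leap year added for comparison:

```python
import calendar

# monthrange() returns (weekday of the first day, number of days in the month),
# where Monday is weekday 0.
first_weekday, days_in_month = calendar.monthrange(2022, 2)
print(first_weekday, days_in_month)  # 1 28 (1 February 2022 was a Tuesday)

# Leap years are handled automatically.
leap_days = calendar.monthrange(2024, 2)[1]
print(leap_days)  # 29
```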

Thanks for reading :)

Instead, use comprehensions to simplify or prettify your code. Beautiful is better than ugly.

- List comprehension
- Set comprehension
- Dictionary comprehension
- Generator comprehension
- Bonus tip

Syntax: `[expression with element for element in iterable if condition is met]`

where the `expression` and the `if` clauses are optional. Let's see it in action

```python
# Create a list of alphabets
import string

# A normal for loop would take up 3 lines of code and an indentation
alphabets = []
# string.ascii_lowercase returns >>> 'abcdefghijklmnopqrstuvwxyz'
for alphabet in string.ascii_lowercase:
    alphabets.append(alphabet)
print(alphabets)

"""
OUTPUT:
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
"""
```

```python
import string

alphabets = [alphabet for alphabet in string.ascii_lowercase]
print(alphabets)

"""
OUTPUT:
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
"""
```

**Note:** The best way to get a list of alphabets is perhaps `list(string.ascii_lowercase)`. The above example is just for illustration purposes. `string.ascii_lowercase` serves as an example of an iterable

`if` clause

```python
import string

vowels = ['a', 'e', 'i', 'o', 'u']
# Note the if condition below
alphabets = [alphabet for alphabet in string.ascii_lowercase if alphabet not in vowels]
print(alphabets)
# The if condition filters out the vowels

"""
OUTPUT:
['b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'w', 'x', 'y', 'z']
"""
```

We can use the `ord()` function on the alphabet before the `for` clause as shown below

```python
import string

vowels = ['a', 'e', 'i', 'o', 'u']
unicodes = [ord(alphabet) for alphabet in string.ascii_lowercase if alphabet not in vowels]
print(unicodes)

"""
OUTPUT:
[98, 99, 100, 102, 103, 104, 106, 107, 108, 109, 110, 112, 113, 114, 115, 116, 118, 119, 120, 121, 122]
"""
```

**Note:** We can use basically any function call or expression before the `for` clause to modify the element at runtime. A common use case is to call a small helper function (or an inline `lambda`) in that space
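To make the note concrete, here is a small sketch with two kinds of expression in that position: a call to a helper function and a conditional expression. The helper `describe` is hypothetical and exists only for this example.

```python
import string

def describe(letter):
    """Hypothetical helper: pair a letter with its code point."""
    return f"{letter}={ord(letter)}"

# A function call as the expression
described = [describe(letter) for letter in string.ascii_lowercase[:3]]
print(described)
# ['a=97', 'b=98', 'c=99']

# A conditional expression as the expression (this is not the filtering if clause)
vowels = "aeiou"
kinds = ["vowel" if letter in vowels else "consonant" for letter in "abcde"]
print(kinds)
# ['vowel', 'consonant', 'consonant', 'consonant', 'vowel']
```

Note the difference in position: an `if ... else` before the `for` clause chooses a value for every element, while an `if` after the iterable drops elements.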

Syntax: `{expression with element for element in iterable if condition is met}`

where the `expression` and the `if` clauses are optional. Let's see it in action

```python
# Create a set of unique characters for the below string
string_ = "The quick brown Fox jumps Over the lazy Dog"

# The for loop would look something like this
uniques = set()
for char in string_:
    uniques.add(char)
print(uniques)

"""
OUTPUT:
{'r', 'm', 'F', 'p', 's', 'i', 'u', 'j', 'b', 'c', 'a', 'k', 'o', 'n', 'T', 'x', ' ', 't', 'v', 'z', 'l', 'e', 'w', 'y', 'q', 'h', 'g', 'D', 'O'}
"""
```

```python
uniques = {char for char in string_}
print(uniques)

"""
OUTPUT:
{'r', 'm', 'F', 'p', 's', 'i', 'u', 'j', 'b', 'c', 'a', 'k', 'o', 'n', 'T', 'x', ' ', 't', 'v', 'z', 'l', 'e', 'w', 'y', 'q', 'h', 'g', 'D', 'O'}
"""
```

`if` clause

```python
uniques = {char for char in string_ if char != ' '}
print(uniques)

"""
OUTPUT:
{'r', 'm', 'F', 'p', 's', 'i', 'u', 'j', 'b', 'c', 'a', 'k', 'o', 'n', 'T', 'x', 't', 'v', 'z', 'l', 'e', 'w', 'y', 'q', 'h', 'g', 'D', 'O'}
"""
```

```python
uniques = {char.lower() for char in string_ if char != ' '}
print(uniques)

"""
OUTPUT:
{'w', 'o', 'b', 'x', 'q', 'v', 'h', 'f', 'j', 'e', 'l', 'r', 'm', 'p', 'c', 't', 'n', 's', 'i', 'y', 'g', 'd', 'a', 'k', 'z', 'u'}
"""
```

Syntax: `{expression for key: expression for value for element in iterable if condition is met}`

where the `expression` and the `if` clauses are optional. Let's see it in action

```python
# Create a dictionary of uppercase characters and their unicodes
import string

unicode_map = {}
for char in string.ascii_uppercase:
    unicode_map[char] = ord(char)
print(unicode_map)

"""
OUTPUT:
{ 'A': 65,'B': 66,'C': 67,'D': 68,'E': 69,'F': 70,'G': 71,'H': 72,'I': 73,
  'J': 74,'K': 75,'L': 76,'M': 77,'N': 78,'O': 79,'P': 80,'Q': 81,'R': 82,
  'S': 83,'T': 84,'U': 85,'V': 86,'W': 87,'X': 88,'Y': 89,'Z': 90
}
"""
```

```python
import string

unicode_map = {char: ord(char) for char in string.ascii_uppercase}
print(unicode_map)

"""
OUTPUT:
{ 'A': 65,'B': 66,'C': 67,'D': 68,'E': 69,'F': 70,'G': 71,'H': 72,'I': 73,
  'J': 74,'K': 75,'L': 76,'M': 77,'N': 78,'O': 79,'P': 80,'Q': 81,'R': 82,
  'S': 83,'T': 84,'U': 85,'V': 86,'W': 87,'X': 88,'Y': 89,'Z': 90
}
"""
```

The *expression* and the `if` clauses work exactly the same as they do for the list and set comprehensions
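As a quick illustration of that statement, the sketch below applies an expression to the key and an `if` clause to the same uppercase alphabet. The names `vowels` and `vowel_codes` are chosen for this example.

```python
import string

vowels = "AEIOU"

# Key expression: the lowercase letter. Value expression: the code point of the
# uppercase letter. The if clause keeps only the vowels.
vowel_codes = {char.lower(): ord(char) for char in string.ascii_uppercase if char in vowels}
print(vowel_codes)
# {'a': 65, 'e': 69, 'i': 73, 'o': 79, 'u': 85}
```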

Syntax: `(expression with element for element in iterable if condition is met)`

where the `expression` and the `if` clauses are optional. The syntax is very similar to that of a list comprehension, just with parentheses instead of square brackets.

**Note:** Unlike others, generator comprehensions are a replacement for generator functions rather than for loops.

```python
import string

def generate_alphabets():
    for char in string.ascii_lowercase:
        yield char

# The following for loop is to make the generator yield the values.
# It is NOT the one we need to focus on
for alphabet in generate_alphabets():
    print(f"{alphabet},", end='')

"""
OUTPUT:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,
"""
```

```python
import string

generate_alphabets = (char for char in string.ascii_lowercase)

for alphabet in generate_alphabets:
    print(f"{alphabet},", end='')

"""
OUTPUT:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,
"""
```

The *expression* and the `if` clause work exactly the same as in the other comprehensions
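One practical difference from the other comprehensions is worth seeing in code: a generator produces its values on demand and can be consumed only once. A minimal sketch, with variable names chosen for this example:

```python
import string

# Expression (ord) and if clause work as in a list comprehension,
# but nothing is computed until a value is requested.
codes = (ord(char) for char in string.ascii_lowercase if char in "aeiou")

first = next(codes)      # computes only the first value
rest_total = sum(codes)  # consumes the remaining values: 101 + 105 + 111 + 117
leftover = list(codes)   # the generator is now exhausted

print(first, rest_total, leftover)
# 97 434 []
```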

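A `for` loop with a single statement in its body can be written on one line, right after the `:`

```python
for char in ['a', 'e', 'i', 'o', 'u']:
    print(char)

# The above for loop can also be written as
for char in ['a', 'e', 'i', 'o', 'u']: print(char)
```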


`groupby()` method.

```python
>>> from itertools import groupby
>>> unsorted_iterable = ['red','orange','green','red','green']
>>> for key, group in list(groupby(unsorted_iterable)):
...     print(key)
...
red
orange
green
red
green
# Strings are repeated. This is NOT desired
```

`red` and `green` were repeated at the end of the output. Let's take a look at a sorted iterable

```python
>>> from itertools import groupby
>>> sorted_iterable = ['red','red','orange','green','green']
>>> for key, group in list(groupby(sorted_iterable)):
...     print(key)
...
red
orange
green
# We get only the unique strings
```
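Since `groupby()` only merges neighbouring items with equal keys, a common pattern is to sort the iterable first. The sketch below reuses the unsorted list from above; the `keys` and `counts` names are chosen for this example.

```python
from itertools import groupby

unsorted_iterable = ['red', 'orange', 'green', 'red', 'green']

# Sorting brings equal items next to each other before grouping.
keys = [key for key, group in groupby(sorted(unsorted_iterable))]
print(keys)
# ['green', 'orange', 'red']

# Count the members of each group, consuming every group as soon as it is produced.
counts = {key: len(list(group)) for key, group in groupby(sorted(unsorted_iterable))}
print(counts)
# {'green': 2, 'orange': 1, 'red': 2}
```

When grouping with a `key` function, the sort should use the same `key` so that the two orderings agree.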

```python
from itertools import groupby

fruits = [
    {'name': 'apple', 'color': 'red'},
    {'name': 'cherry', 'color': 'red'},
    {'name': 'orange', 'color': 'orange'},
    {'name': 'pear', 'color': 'green'},
    {'name': 'grape', 'color': 'green'}
]

for color, group in groupby(fruits, key=lambda fruit: fruit['color']):
    print(f"\nAll {color} fruits")
    print(list(group))
```

```
# Output

All red fruits
[{'name': 'apple', 'color': 'red'}, {'name': 'cherry', 'color': 'red'}]

All orange fruits
[{'name': 'orange', 'color': 'orange'}]

All green fruits
[{'name': 'pear', 'color': 'green'}, {'name': 'grape', 'color': 'green'}]
```

A new `cost` field is added to our list of fruits to be grouped by. We wish to group by the color first and then by the cost.

```python
from itertools import groupby

fruits = [
    # New field: cost
    {'name': 'apple', 'color': 'red', 'cost': 10},
    {'name': 'cherry', 'color': 'red', 'cost': 12},
    {'name': 'orange', 'color': 'orange', 'cost': 12},
    {'name': 'pear', 'color': 'green', 'cost': 15},
    {'name': 'grape', 'color': 'green', 'cost': 15}
]

for color, color_group in groupby(fruits, key=lambda fruit: fruit['color']):
    print("-" * 20)
    print(f"All {color} fruits")
    print(list(color_group))
    # Note: list() operation made the group generator yield out all the values,
    # so the inner loop below has nothing left to iterate over
    for cost, cost_group in groupby(color_group, key=lambda fruit: fruit['cost']):
        print(f"\tAll {color} fruits that cost {cost} bucks")
        print("\t\t", list(cost_group))
```

```
--------------------
All red fruits
[{'name': 'apple', 'color': 'red', 'cost': 10}, {'name': 'cherry', 'color': 'red', 'cost': 12}]
--------------------
All orange fruits
[{'name': 'orange', 'color': 'orange', 'cost': 12}]
--------------------
All green fruits
[{'name': 'pear', 'color': 'green', 'cost': 15}, {'name': 'grape', 'color': 'green', 'cost': 15}]
```

If we remove `print(list(color_group))` and so prevent the values from being yielded early, the following output is produced, where the inner loop is also executed.

```
--------------------
All red fruits
	All red fruits that cost 10 bucks
		 [{'name': 'apple', 'color': 'red', 'cost': 10}]
	All red fruits that cost 12 bucks
		 [{'name': 'cherry', 'color': 'red', 'cost': 12}]
--------------------
All orange fruits
	All orange fruits that cost 12 bucks
		 [{'name': 'orange', 'color': 'orange', 'cost': 12}]
--------------------
All green fruits
	All green fruits that cost 15 bucks
		 [{'name': 'pear', 'color': 'green', 'cost': 15}, {'name': 'grape', 'color': 'green', 'cost': 15}]
```
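If both the outer listing and the inner grouping are needed, one option is to store each group in a list once and reuse that list. This is a sketch under that assumption; the `members` and `summary` names are introduced here for illustration.

```python
from itertools import groupby

fruits = [
    {'name': 'apple', 'color': 'red', 'cost': 10},
    {'name': 'cherry', 'color': 'red', 'cost': 12},
    {'name': 'orange', 'color': 'orange', 'cost': 12},
    {'name': 'pear', 'color': 'green', 'cost': 15},
    {'name': 'grape', 'color': 'green', 'cost': 15},
]

summary = {}
for color, color_group in groupby(fruits, key=lambda fruit: fruit['color']):
    members = list(color_group)  # consume the group iterator exactly once
    print(f"All {color} fruits: {[fruit['name'] for fruit in members]}")
    # A list can be iterated again, so the inner grouping still sees every fruit.
    summary[color] = {
        cost: [fruit['name'] for fruit in cost_group]
        for cost, cost_group in groupby(members, key=lambda fruit: fruit['cost'])
    }

print(summary)
# {'red': {10: ['apple'], 12: ['cherry']}, 'orange': {12: ['orange']}, 'green': {15: ['pear', 'grape']}}
```

The trade-off is memory: each group is held in full, which is fine for small groups but defeats the lazy behaviour of `groupby()` for very large ones.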
