Lecture 2. Operations on matrices
In this lecture, we will introduce basic operations on matrices, consider the properties of these operations, and give examples of performing operations on matrices. We have already worked with matrices when solving systems of linear equations.
Recall that a matrix is a table divided into rows and columns. If a matrix has m rows and n columns, we say it is an m-by-n matrix. In general, we write the matrix in the form shown on the slide and denote an arbitrary element by a letter with two indices, where the first index is the row number and the second is the column number. In abbreviated form we denote the matrix by writing its general element, (aij). If it is necessary to emphasize the dimensions of the matrix, we indicate them as a subscript.
We will often use the concept of equality. Two matrices A and B are considered equal if, first, they have the same dimensions and, second, their corresponding elements are equal, that is, aij = bij for all indices i and j. Fix arbitrary dimensions m and n and denote the set of all matrices of this size by M. Note that, in particular, m and n may both equal 1; we get the trivial case in which there is only one row and one column, and the matrix degenerates into a single number. If the numbers of rows and columns are the same and equal to n, the matrix is called a square matrix of order n. For a square matrix, we specify just one number n, and the set of all square matrices of order n is denoted by Mn.
Let's start studying the operations, beginning with addition. To add two matrices, take two matrices of the same size and add their corresponding elements: the sum of two matrices is the matrix A + B, each element of which equals the sum of the corresponding elements of the original matrices. For example, take two matrices of size 2 by 3; adding the corresponding elements, we obtain the matrix shown on the slide. As you can see, matrix addition is completely determined by the addition of the corresponding elements, that is, by the addition of numbers.
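The rule just described can be sketched in a few lines of Python, representing a matrix as a plain list of rows. The slide's matrices are not reproduced in the transcript, so the two 2-by-3 matrices below are made up for illustration.

```python
def mat_add(A, B):
    """Return A + B: add corresponding elements of two matrices of the same size."""
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "sizes must match"
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

# Two 2-by-3 matrices (illustrative stand-ins for the slide's example).
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[6, 5, 4],
     [3, 2, 1]]
print(mat_add(A, B))  # [[7, 7, 7], [7, 7, 7]]
```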
Note the properties of this operation. For any three matrices A, B, and C of a fixed size, the following hold. First, addition is commutative: the summands can be swapped. Second, addition is associative: in a sum containing three terms, the brackets can be placed arbitrarily.
We introduce the concept of a zero matrix: a matrix with all elements equal to zero. Obviously, if we add the zero matrix to a matrix A, we get the original matrix A.
Next, for a matrix A we can define its opposite, -A: take the opposite of each element. If we add two opposite matrices, we obviously get the zero matrix. The concept of the opposite matrix lets us introduce subtraction: to subtract B from A, we add to A the opposite of B. In other words, we subtract the corresponding elements of the matrices.
The next operation is multiplying a matrix by a number. Suppose we are given an m-by-n matrix and an arbitrary number r; to multiply the matrix by this number, we multiply each element of the matrix by r. For example, take two second-order matrices, multiply the first by 3, and subtract the second from the result (see the video). What properties does multiplication by a number have? First, multiplying a matrix by 1 leaves it unchanged. Second, if we take two numbers and multiply them by a matrix, the brackets can again be placed arbitrarily without changing the result. There are also properties 3 and 4 (see the video).
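The example with 3A - B can be sketched as follows; the second-order matrices from the video are not in the transcript, so the two below are illustrative stand-ins.

```python
def scalar_mul(r, A):
    """Multiply every element of the matrix A by the number r."""
    return [[r * x for x in row] for row in A]

def mat_sub(A, B):
    """A - B: subtract corresponding elements."""
    return [[A[i][j] - B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

# Illustrative second-order matrices: compute 3*A - B.
A = [[1, 2], [3, 4]]
B = [[5, 0], [1, 2]]
print(mat_sub(scalar_mul(3, A), B))  # [[-2, 6], [8, 10]]
```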
Note that these properties together with the four previous ones give us 8 properties in total, and these are exactly the properties that define the concept of a vector space.
Thus, on the set of all matrices of a fixed size we have two operations: matrix addition and multiplication of a matrix by a number. These operations satisfy the listed properties, which means the following theorem is valid: the set of all matrices of a fixed size, with the introduced operations, forms a vector space.
Note that properties 3 and 4 are called distributive laws. When we multiply the matrix A by a bracket containing a sum of numbers, we have distributivity with respect to numbers; when we multiply a number by a sum of matrices, that is, put the matrices in the bracket, we get the other analog of the distributive law.
There is also the transpose operation, and it is not complicated. Given an arbitrary matrix A, to obtain its transpose we make each row a column. If A had m rows and n columns, the transposed matrix has n rows and m columns. For example (see the video), take a matrix of size 3 by 2 and transpose it; we get the following matrix. The elements themselves have not changed; the first column 1, 3, 5 has become the first row or, in other words, the first row 1, 2 has become the first column, and the same holds for the other rows and columns. In simple terms, the transposed matrix has its rows and columns swapped.
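A minimal sketch of the transpose: element (i, j) goes to position (j, i). The example matrix below matches the first column 1, 3, 5 and first row 1, 2 mentioned above; the remaining entries 4 and 6 are filled in only for illustration.

```python
def transpose(A):
    """Rows become columns: the element at (i, j) moves to (j, i)."""
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

# A 3-by-2 matrix: first column 1, 3, 5; first row 1, 2.
A = [[1, 2],
     [3, 4],
     [5, 6]]
print(transpose(A))  # [[1, 3, 5], [2, 4, 6]]
```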
Properties of the transpose operation. Take an arbitrary matrix of suitable size: if we perform the operation twice in a row, we get the original matrix back. This property is obvious: first the rows become columns, and then the columns go back to being rows.
Further, the matrix transposed to the sum is equal to the sum of the transposed matrices.
There is a third property: a numerical coefficient can be taken out of the brackets without changing the result in any way.
The 4th property is interesting and not so obvious. Recall that the rank of a matrix is the rank of its system of row vectors, that is, the number of vectors in a basis of that system. It turns out that if we take an arbitrary matrix and consider its row vectors, and then for the same matrix consider the system of column vectors, the ranks of these two systems coincide. In other words, the rank of a matrix equals the rank of its transpose.
Now let's move on to the operation of matrix multiplication. Before, we talked about multiplying a matrix by a number; now we will consider multiplying two matrices. This operation is arranged in a more complex way than the previous ones, but it nevertheless reduces to the addition and multiplication of numbers. To multiply the matrix A by B, we require that the number of columns of the first matrix equal the number of rows of the second. Pay attention to matrices A and B of the sizes shown (see the video): the numbers m, n, and p are arbitrary, but n is both the number of columns of the first matrix and the number of rows of the second.
After multiplication, we get a new matrix of size m by p; that is, the new matrix has as many rows as the first matrix and as many columns as the matrix B. Each element is defined according to the stated rule, written as a formula (see the video). At first glance the formula looks complex and it is not clear what lies behind it, so we will try to understand the rule through examples and with the help of certain schemes. The main thing to remember: to find the element in the i-th row and j-th column, we take the i-th row of the first matrix and the j-th column of the second.
So let's turn to the scheme. Let the matrix A, with elements aij, have the form shown in the video, and let the matrix B, with elements bij, have the form shown there as well; we need to find all the elements cij of the new matrix. As mentioned, we take the i-th row of matrix A and the j-th column of matrix B (see the video). In short, we say that the i-th row is multiplied by the j-th column: each element of the i-th row is multiplied by the corresponding element of the j-th column, and the products are summed. The diagram writes this out in detail: the element ai1 is multiplied by b1j, then we take the second elements, and so on; summing the resulting products gives the desired element cij. Iterating over all values of i and j, we obtain the resulting matrix.
So, let's work through this operation on an example (see the video). We will follow the scheme once more and see how it works in practice. Take two matrices, the first of size 2 by 3, the second 3 by 2. The condition is met: the matrices have the required dimensions, and the resulting matrix will be 2 by 2, because the first matrix has two rows and the second has two columns, so we need to find 4 elements. Let's write it out in detail, starting with the first element c11: take the first row of the first matrix and the first column of the second matrix and multiply them elementwise, that is, the corresponding elements 1 by 2, 3 by -1, 1 by 3, and add. So we have recorded this sum in detail. Now we write a similar sum for all the elements and then calculate.
Now let's find the element c12, the element in the first row and second column. We take the first row of the first matrix and multiply it by the second column of the second matrix, obtaining the written expression by the same principle. Now let's move on to the second row. We take the second row of the first matrix and multiply it first by the first column of the second matrix, obtaining the element c21, and then by the second column, obtaining the element c22 of the desired matrix. It remains to carry out the calculations and get the result. This is how the multiplication of two matrices is performed. Now let's consider the properties of the operation. Take matrices A, B, and C of the required sizes. First, we note an equality that is not true.
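The row-by-column rule can be sketched directly from the formula. The first row of A and the first column of B below follow the worked sum c11 = 1*2 + 3*(-1) + 1*3; the remaining entries of both matrices are not in the transcript and are made up for illustration.

```python
def mat_mul(A, B):
    """Multiply an m-by-n matrix A by an n-by-p matrix B."""
    n = len(B)
    assert len(A[0]) == n, "columns of A must equal rows of B"
    # Element (i, j) is the i-th row of A times the j-th column of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# A 2-by-3 matrix times a 3-by-2 matrix gives a 2-by-2 matrix.
A = [[1, 3, 1],
     [0, 2, 4]]
B = [[ 2, 1],
     [-1, 0],
     [ 3, 2]]
print(mat_mul(A, B))  # [[2, 3], [10, 8]]; c11 = 1*2 + 3*(-1) + 1*3 = 2
```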
Unlike addition, multiplication is not commutative. By swapping the matrices A and B, we may not get a matrix at all, because the dimensions of A and B may not be compatible. But even if we do get some matrix, it does not have to coincide with the original product; that is, in general, as already mentioned, matrix multiplication does not have the commutative property. The associative law, however, is valid: if we have three factors A, B, and C, the brackets can be placed in any way, as long as the order in which the matrices are written is unchanged.
Further, the distributive law is valid for matrices: the matrix A can be multiplied by a sum, opening the brackets, and we can multiply by a sum on the right or on the left. Since the operation is not commutative, these are two separate laws, and in the future we will often say that we multiply one matrix by another on the left or on the right. This distinction matters for the reason just stated.
In short, property 3 means that matrix multiplication is distributive with respect to matrix addition.
Next, the 4th property relates matrix multiplication to multiplication by a number. We now introduce the important notion of the identity matrix (also called the unit matrix). It is the matrix formed from the unit vectors, and such a matrix can be made for any n; that is, for each number n there is an identity matrix of order n, and the following equality holds. If we multiply a matrix A by the identity matrix, we get the matrix A, and even if we multiply in the other order, nothing changes; this is again the analog of multiplying a number by one.
And now, speaking of the analogy between matrices and numbers, we can note the following. First, as with numbers, we have learned to add matrices, and all the properties of matrix addition are similar to those of the addition of numbers, by virtue of how the operation is defined. Next, we introduced matrix multiplication, and this operation is similar in its properties to the multiplication of numbers, except for the first remark, that matrix multiplication is not commutative. Otherwise, the properties are exactly the same: in particular, associativity, distributivity, and the analog of the unit, where the identity matrix plays the role of the number one. Further, in addition to addition, subtraction, and multiplication, numbers also have a division operation.
When we divide the number a by the number b, we in effect multiply a by the number inverse to b, and so the question arises: what is an inverse matrix, and can matrices be divided at all? It turns out that the concept of an inverse matrix can be introduced, but not every matrix has an inverse. The definition is as follows: take a square matrix A; its inverse is a matrix which, when multiplied by the original in either order, gives the identity matrix.
To denote the inverse matrix, we use the same letter with the superscript -1, read as A to the power minus one. Now I propose to consider examples: another example of multiplication, and then an example with the inverse matrix. This example illustrates one of the properties mentioned.
Let's take some matrix and multiply it by the identity matrix, and make sure that the matrix does not change; that is, we will illustrate the stated property with a specific example. Performing the calculation, notice what we do: first we multiply the first row by the first column. Because there are many zeros, only one product in each sum is nonzero: one times the element, plus zeros, since all the remaining products vanish, and so on for each element. After these calculations, it is easy to see that nothing changes.
Now let's multiply two matrices of the second order; we get the following matrix (see the video). Again, we perform four computations, because each row of the first matrix is multiplied by each column of the second. We get the following matrix (see the video), and note that it is the identity matrix. If we perform the multiplication in the reverse order, we get exactly the same matrix; I suggest you verify this yourself. So, although I said that matrix multiplication is not commutative, this does not mean that the equality always fails: in some cases it does hold.
We have obtained that the two given matrices are mutually inverse, that is, each of them is the inverse of the other by definition, because their product gives the identity matrix.
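The check above can be repeated in code. The matrices from the video are not in the transcript, so the pair below is a different, standard example of two mutually inverse second-order matrices.

```python
def mat_mul(A, B):
    """Row-by-column matrix multiplication."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# An illustrative pair of mutually inverse matrices.
A = [[2, 1],
     [1, 1]]
B = [[ 1, -1],
     [-1,  2]]
print(mat_mul(A, B))  # [[1, 0], [0, 1]]
print(mat_mul(B, A))  # [[1, 0], [0, 1]] -- the same in either order
```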
Let's multiply the second-order zero matrix by an arbitrary matrix. Obviously, by the definition of multiplication, the result is the zero matrix: everything vanishes, and it is clear that the zero matrix has no inverse, because the product is always the zero matrix and never the identity. The same holds even if we multiply not the zero matrix but this one (see the video): leave zeros in the first row and write arbitrary numbers in the second.
After multiplication, the first row will still be zero, because the first row multiplied by any column clearly gives 0. In this case, as I have already said, the first row contains only 0s, so the product can never be the identity matrix, and therefore the inverse does not exist either; that is, not every square matrix has an inverse.
Now let's talk about the relationship between matrix multiplication and systems of linear equations. Let's write down an arbitrary system, as we did before, and consider some matrices for this system. We have already worked with the main matrix, made up of the coefficients on the left-hand sides of the equations. Separately, we write the matrix of free terms as a column, and also denote the column of unknowns by X.
So, we have three matrices, where X and B are matrices with exactly one column. Let's multiply A by X; in other words, we multiply each row of matrix A by the single column of matrix X. By our definition, we obtain the matrix shown (see the video). Note again: take the first row, multiply it by the column, that is, multiply the corresponding elements of the row and the column and add them. The resulting element, as you can see, is the same as the left-hand side of the first equation. Similarly, multiplying the second row by the column gives the left-hand side of the second equation, and so on.
Thus, replacing the left-hand sides of the equations by the right-hand sides, we obtain the matrix B. In other words, we have written the original system of linear equations, as they say, in matrix form, as A*X = B. This short notation A*X = B is called the matrix form of a system of linear equations. What can we get from here?
The following theorem is valid: a system of equations has a unique solution if and only if its main matrix A has an inverse. Let's think about what this means and how the solution can be found. We write the system in matrix form A*X = B and multiply this equality on the left by the inverse matrix, assuming that the inverse exists. The following equation is obtained (see the video). Here it is important that we multiply on the left, that is, in each part of the equation the factor A-1 is written on the left.
Next, we regroup the brackets on the left-hand side of the equation and replace the product A-1*A with the identity matrix. By the property of the identity matrix, the matrix X is left on its own; that is, the unknown column can be expressed through the matrices A and B, or rather via the matrix inverse to A. So if the matrix A has an inverse, then by performing the multiplication A-1*B we obtain the unknown matrix X.
So, if the matrix has an inverse, the original system can be solved by performing the indicated operations: the inverse of A is multiplied by B.
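A sketch of this recipe, X = A-1 * B, for a small made-up system (the system from the video is not in the transcript). Here A = [[2, 1], [1, 1]] encodes the system 2x + y = 5, x + y = 3, and its inverse is assumed to be already known.

```python
def mat_mul(A, B):
    """Row-by-column matrix multiplication."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Solve A*X = B as X = A_inv * B for the illustrative system
# 2x + y = 5, x + y = 3.
A_inv = [[ 1, -1],
         [-1,  2]]   # inverse of [[2, 1], [1, 1]]
B = [[5],
     [3]]
print(mat_mul(A_inv, B))  # [[2], [1]], i.e. x = 2, y = 1
```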
A question arises. How can we find the inverse matrix? There are different algorithms. Let's get acquainted now with the following theorem, which gives a rule for finding the inverse matrix, provided that it exists.
Suppose we have a square matrix A. If we append the identity matrix of the same order to A on the right and perform elementary transformations on the rows until the left block becomes the identity matrix, then on the right we obtain exactly the inverse matrix.
I propose to illustrate this algorithm with a simple example. Take a second-order matrix and perform the indicated transformations. Formally, we append the identity matrix; so that the matrices do not merge, they can be conditionally separated by a vertical line, but then, forgetting about this line, we begin to perform row transformations in order to obtain the identity matrix on the left. First we bring the matrix to row echelon form; we already know how to do this, that is, we get zeros under the leading element of each row. So we got such a zero and, in fact, an echelon matrix, but to obtain the identity matrix it is also necessary that the leading elements 2 and 5 become ones.
To do this, first divide the second row by 5, making its leading element a one, and then add the first and second rows to get a zero above the one. Now it can be seen that to form the identity matrix it is enough to divide the first row by 2; I remind you that we perform the operations on entire rows, regardless of the dividing line. After the division, the identity matrix stands to the left of the line, and what is obtained on the right is the inverse of the original matrix. This is an interesting algorithm that we will master in the practical lesson.
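The whole [A | I] procedure can be sketched in code. The matrix from the example is not fully reproduced in the transcript, so the sketch is tested on a different illustrative matrix; exact arithmetic with fractions avoids rounding, and the inverse is assumed to exist.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row-reducing [A | I] to [I | A_inv].

    Assumes the inverse exists (a pivot can always be found)."""
    n = len(A)
    # Append the identity matrix of the same order on the right.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(1 if i == j else 0) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero leading element and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Divide the row so the leading element becomes 1.
        lead = M[col][col]
        M[col] = [x / lead for x in M[col]]
        # Make zeros above and below the leading one.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # The right block is now the inverse matrix.
    return [row[n:] for row in M]

print(inverse([[2, 1], [1, 1]]))  # equals [[1, -1], [-1, 2]] (entries are Fractions)
```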