Machine Learning, Deep Learning / Andrew Ng Machine Learning Coursera Lecture Notes

Week 1 Lecture ML : Matrices and Vectors

mcdn 2020. 8. 6. 15:53

ML:Linear Algebra Review

Khan Academy has excellent Linear Algebra Tutorials (https://www.khanacademy.org/#linear-algebra)

The other piece of knowledge we need is that the dimension of a matrix is written as the number of rows times the number of columns. Concretely, the example on the left has 1, 2, 3, 4 rows and 2 columns, so it is a 4 by 2 matrix: number of rows by number of columns. The one on the right has two rows (that's the first row, that's the second row) and three columns (that's the first column, the second column, the third column), so we say this second matrix is a 2 by 3 matrix.
right, because that's A_{32}, so that's equal to 1437. And finally, A_{41} refers to this one, right: fourth row, first column, which is equal to 147. And if, hopefully you won't, but if you were to write A_{43}, well, that refers to the fourth row and the third column, and this matrix has no third column, so this is undefined, or you can think of it as an error. There's no such element as A_{43}, so you shouldn't be referring to A_{43}. So a matrix gives you a way of quickly organizing, indexing, and accessing lots of data. In case I seem to be tossing out a lot of concepts and a lot of new notation very rapidly, you don't need to memorize all of this; on the course website where we have posted the lecture notes, we also have all of these definitions written down.
A vector is a matrix that has only 1 column. If you have an N x 1 matrix, then, remember, N is the number of rows and 1 is the number of columns, so a matrix with just one column is what we call a vector. Here's an example of a vector with N equals four elements.
you should assume we are using 1-indexed vectors. In fact, throughout the rest of these videos on linear algebra review, I will be using 1-indexed vectors. Finally, by convention, when writing matrices and vectors, most people use upper case to refer to matrices. So we're going to use capital letters like A, B, C, X to refer to matrices, and usually we'll use lowercase letters like a, b, x, y to refer to either numbers (raw numbers, or scalars) or vectors. This isn't always true, but it is the more common notation: lowercase "y" for a vector and upper case for a matrix.

Matrices and Vectors

Matrices are 2-dimensional arrays:

\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \\ j & k & l \end{bmatrix}

The above matrix has four rows and three columns, so it is a 4 x 3 matrix.

A vector is a matrix with one column and many rows:

\begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix}

So vectors are a subset of matrices. The above vector is a 4 x 1 matrix.

Notation and terms:

  • A_{ij} refers to the element in the ith row and jth column of matrix A.
  • A vector with 'n' rows is referred to as an 'n'-dimensional vector
  • v_i refers to the element in the ith row of the vector.
  • In general, all our vectors and matrices will be 1-indexed. Note that for some programming languages, the arrays are 0-indexed.
  • Matrices are usually denoted by uppercase names while vectors are lowercase.
  • "Scalar" means that an object is a single value, not a vector or matrix.
  • \mathbb{R} refers to the set of scalar real numbers
  • \mathbb{R}^n refers to the set of n-dimensional vectors of real numbers
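As a small illustration of the 1-indexed notation versus 0-indexed arrays, here is a sketch in plain Python. The helper name `element` and the sample values are assumptions made for this example, not something defined in the notes.

```python
# Hypothetical 4 x 2 matrix stored as a list of rows (values are illustrative).
A = [
    [1402, 191],
    [1371, 821],
    [949, 1437],
    [147, 1448],
]

def element(matrix, i, j):
    """Return A_{ij} using the 1-indexed convention of the notes."""
    rows, cols = len(matrix), len(matrix[0])
    if not (1 <= i <= rows and 1 <= j <= cols):
        raise IndexError(f"element ({i}, {j}) is undefined for a {rows} x {cols} matrix")
    # Python lists are 0-indexed, so shift both indices down by one.
    return matrix[i - 1][j - 1]

print(len(A), "x", len(A[0]))  # dimension: rows x columns
print(element(A, 3, 2))        # third row, second column
print(element(A, 4, 1))        # fourth row, first column
```

Asking for `element(A, 4, 3)` raises an `IndexError` here, which mirrors the idea that A_{43} is undefined for a matrix with only two columns.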

 

Addition and Scalar Multiplication

 

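Addition and subtraction are element-wise, so you simply add or subtract each corresponding element:

\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} a+w & b+x \\ c+y & d+z \end{bmatrix}

To add or subtract two matrices, their dimensions must be the same.

In scalar multiplication, we simply multiply every element by the scalar value:

\begin{bmatrix} a & b \\ c & d \end{bmatrix} * x = \begin{bmatrix} a*x & b*x \\ c*x & d*x \end{bmatrix}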
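The two element-wise operations can be sketched in a few lines of plain Python. The function names `mat_add` and `scalar_mul` and the sample matrices are assumptions for illustration; a numerical library would normally provide these operations.

```python
def mat_add(A, B):
    """Element-wise sum of two matrices with identical dimensions."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(A, x):
    """Multiply every element of A by the scalar x."""
    return [[a * x for a in row] for row in A]

# Illustrative 3 x 2 matrices.
A = [[1, 0], [2, 5], [3, 1]]
B = [[4, 0.5], [2, 5], [0, 1]]

print(mat_add(A, B))     # [[5, 0.5], [4, 10], [3, 2]]
print(scalar_mul(A, 3))  # [[3, 0], [6, 15], [9, 3]]
```

Passing matrices of different shapes to `mat_add` raises a `ValueError`, reflecting the requirement that the dimensions match.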

It turns out this "n" here has to match this "n" here. In other words, the number of columns in this matrix has to match the number of rows here; it has to match the dimension of this vector. And the result of this product is going to be an m-dimensional vector y.
And if you just do this, then this variable prediction (sorry for my bad handwriting) can be computed with just this one line of code, assuming you have an appropriate library to do matrix-vector multiplication. If you just do this, then prediction becomes this 4 by 1 dimensional vector on the right that gives you all the predicted prices. Your alternative to doing this as a matrix-vector multiplication would be to write something like, you know, for i equals 1 to 4, right? And if you have, say, a thousand houses, it would be for i equals 1 to a thousand or whatever. It turns out that writing code in the style on the left not only simplifies the code, because now you're writing one line of code rather than a loop with a bunch of things inside. For subtle reasons that we will see later, it also turns out to be much more computationally efficient to make predictions on the prices of all of your houses the way on the left than the way on the right, where you write your own loop.
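To make the comparison concrete, here is a minimal sketch in plain Python of the two styles described above. The house sizes, the parameter values, and the names `data_matrix`, `theta`, and `mat_vec` are assumptions chosen for illustration. The pure-Python helper only shows that both styles give the same numbers; the efficiency gain mentioned in the lecture comes from optimized linear algebra libraries, not from this helper.

```python
# Assumed example data: four house sizes and a hypothesis h(x) = -40 + 0.25 * x.
house_sizes = [2104, 1416, 1534, 852]
theta = [-40, 0.25]  # [intercept, slope], hypothetical parameters

# Data matrix: a column of ones (for the intercept) next to the sizes -> 4 x 2.
data_matrix = [[1, size] for size in house_sizes]

def mat_vec(matrix, vector):
    """Multiply an m x n matrix by an n-dimensional vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

# "One line" style: prediction = data_matrix * theta.
prediction = mat_vec(data_matrix, theta)

# Equivalent explicit loop over the houses.
prediction_loop = []
for size in house_sizes:
    prediction_loop.append(theta[0] + theta[1] * size)

print(prediction)       # [486.0, 314.0, 343.5, 173.0]
print(prediction_loop)  # same values
```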

Matrix-Vector Multiplication

We map the column of the vector onto each row of the matrix, multiplying each element and summing the result.

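\begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} * \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \\ ex + fy \end{bmatrix}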

The result is a vector. The vector must be the second term of the multiplication. The number of columns of the matrix must equal the number of rows of the vector.

An m x n matrix multiplied by an n x 1 vector results in an m x 1 vector.
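As a concrete check of this rule, here is a small worked example with assumed numbers (the 3 x 2 matrix and the 2-dimensional vector are chosen only for illustration). Each entry of the result is one row of the matrix multiplied element by element with the vector and summed:

```latex
\begin{bmatrix} 1 & 3 \\ 4 & 0 \\ 2 & 1 \end{bmatrix}
*
\begin{bmatrix} 1 \\ 5 \end{bmatrix}
=
\begin{bmatrix} 1 \cdot 1 + 3 \cdot 5 \\ 4 \cdot 1 + 0 \cdot 5 \\ 2 \cdot 1 + 1 \cdot 5 \end{bmatrix}
=
\begin{bmatrix} 16 \\ 4 \\ 7 \end{bmatrix}
```

Here m = 3 and n = 2, so a 3 x 2 matrix times a 2 x 1 vector gives a 3 x 1 vector.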

 

Matrix-Matrix Multiplication

We multiply two matrices by breaking the second matrix into its column vectors, multiplying the first matrix by each of them, and concatenating the resulting columns.

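\begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} * \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \\ ew + fy & ex + fz \end{bmatrix}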

An m x n matrix multiplied by an n x o matrix results in an m x o matrix. In the above example, a 3 x 2 matrix times a 2 x 2 matrix resulted in a 3 x 2 matrix.

To multiply two matrices, the number of columns of the first matrix must equal the number of rows of the second matrix.
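The column-by-column view of matrix multiplication can be sketched in plain Python as follows. The name `mat_mul` and the sample matrices are assumptions for this example; it is meant as an illustration of the rule, not as an efficient implementation.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x o matrix B, giving an m x o matrix."""
    n = len(A[0])
    if len(B) != n:
        raise ValueError("columns of A must equal rows of B")
    # Treat each column of B as a vector; the per-column results sit side by side.
    cols_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_of_B] for row in A]

A = [[1, 3], [4, 0], [2, 1]]  # 3 x 2
B = [[1, 2], [5, 0]]          # 2 x 2
C = mat_mul(A, B)

print(C)                       # [[16, 2], [4, 8], [7, 4]]
print(len(C), "x", len(C[0]))  # 3 x 2
```

Swapping the arguments, `mat_mul(B, A)`, raises a `ValueError` because B has 2 columns while A has 3 rows.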

So I'm going to take these two matrices and just reverse them. It turns out that if you multiply these two matrices, you get the second answer on the right. And clearly, these two results are not equal to each other.
So it doesn't matter whether I multiply 5 x 2 first or 3 x 5 first, because 3 x (5 x 2) = (3 x 5) x 2. This is called the associative property of real number multiplication. It turns out that matrix multiplication is associative too.
Finally, I want to tell you about the identity matrix, which is a special matrix. Let's again make the analogy to what we know of real numbers. When dealing with real numbers or scalars, you can think of the number 1 as the identity of multiplication. What I mean by that is that for any real number z, 1 x z = z x 1 = z.

Finally, I just want to point out that earlier I said that AB is not, in general, equal to BA. For most matrices A and B, this is not true. But when B is the identity matrix, it does hold: A times the identity matrix does indeed equal the identity times A. It's just that this is not true for other matrices B in general.

Matrix Multiplication Properties

  • Not commutative. In general, A∗B≠B∗A
  • Associative. (A∗B)∗C=A∗(B∗C)

The identity matrix, when multiplied by any matrix of the same dimensions, results in the original matrix. It's just like multiplying numbers by 1. The identity matrix simply has 1's on the diagonal (upper left to lower right diagonal) and 0's elsewhere.

\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

When multiplying the identity matrix after some matrix (A∗I), the square identity matrix should match the other matrix's columns. When multiplying the identity matrix before some other matrix (I∗A), the square identity matrix should match the other matrix's rows. For an m x n matrix A, that means an n x n identity in A∗I and an m x m identity in I∗A.
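These properties can be checked numerically with a short plain-Python sketch. The helpers `mat_mul` and `identity` and the sample matrices are assumptions for illustration. A single example does not prove associativity; it only shows the property holding in one case.

```python
def mat_mul(A, B):
    """Multiply A (m x n) by B (n x o); assumes compatible dimensions."""
    cols_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_of_B] for row in A]

def identity(n):
    """n x n identity matrix: 1's on the diagonal, 0's elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 1], [0, 0]]
B = [[0, 0], [2, 0]]
C = [[3, 1], [2, 5]]

# Not commutative: the two products differ.
print(mat_mul(A, B))  # [[2, 0], [0, 0]]
print(mat_mul(B, A))  # [[0, 0], [2, 2]]

# Associative: the grouping does not change the result.
print(mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C)))  # True

# Identity: for a 2 x 3 matrix M, use a 3 x 3 identity on the right
# and a 2 x 2 identity on the left.
M = [[1, 2, 3], [4, 5, 6]]
print(mat_mul(M, identity(3)) == M)  # True
print(mat_mul(identity(2), M) == M)  # True
```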

So how did I find this inverse, or how did I come up with this inverse over here? It turns out that sometimes you can compute inverses by hand, but almost no one does that these days. There is very good numerical software for taking a matrix and computing its inverse. So again, this is one of those things where there are lots of open source libraries that you can link to from any of the popular programming languages to compute inverses of matrices. Let me show you a quick example of how I actually computed this inverse: what I did was use software called Octave.
But the intuition, if you want, is that you can think of matrices that do not have an inverse as being somehow too close to zero in some sense. So, just to wrap up the terminology, matrices that don't have an inverse are sometimes called singular or degenerate matrices, and so this all-zero matrix over here is an example of a matrix that is singular, or degenerate.
Finally, the last special matrix operation I want to tell you about is the matrix transpose. Suppose I have a matrix A; if I compute the transpose of A, that's what I get here on the right. The transpose is written A superscript T, and the way you compute it is as follows. To get the transpose, I first take the first row of A (1, 2, 0). That becomes the first column of the transpose.

Inverse and Transpose

The inverse of a matrix A is denoted A^{-1}. Multiplying a matrix by its inverse results in the identity matrix: A∗A^{-1} = A^{-1}∗A = I.

A non-square matrix does not have an inverse. Matrices that don't have an inverse are called singular or degenerate. We can compute inverses in Octave with the pinv(A) function [1], which returns the pseudoinverse and so agrees with the true inverse when A is invertible, or with the inv(A) function, which is available in both Octave and MATLAB.

Transposing a matrix is like rotating it 90° clockwise and then reversing the order of the elements in each row. We can compute the transpose in MATLAB or Octave with the transpose(A) function or with A':

A = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}

A^T = \begin{bmatrix} a & c & e \\ b & d & f \end{bmatrix}

In other words:

A_{ij} = A^T_{ji}
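Both operations can be sketched in plain Python. The helper names `transpose` and `inverse_2x2` and the sample matrices are assumptions for illustration. The inverse helper uses the closed-form 2 x 2 formula only to keep the example self-contained; in practice a library routine such as Octave's pinv or inv would be used instead.

```python
def transpose(A):
    """Return A^T: element (i, j) of A becomes element (j, i)."""
    return [list(col) for col in zip(*A)]

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix via the closed-form formula (illustration only)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        # A zero determinant means the matrix is singular (degenerate).
        raise ValueError("matrix is singular and has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]

print(transpose([[1, 2, 0], [3, 5, 9]]))  # [[1, 3], [2, 5], [0, 9]]

A = [[3, 4], [2, 16]]
A_inv = inverse_2x2(A)
print(A_inv)  # approximately [[0.4, -0.1], [-0.05, 0.075]]
```

Calling `inverse_2x2([[0, 0], [0, 0]])` raises a `ValueError`, matching the idea that the all-zero matrix is singular.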
