Machine Learning, Deep Learning / Andrew Ng Machine Learning Coursera Lecture Notes

Week 1 Lecture ML:Intro ~ Supervised learning

mcdn 2020. 8. 6. 01:46

Week 1 Lecture Notes

ML:Introduction

Where is machine learning used?

 

Applications that can't be programmed by hand: handwriting recognition

Introduction: where is machine learning used?
 - web search engines like Google or Bing -> they learned how to rank web pages
 - Facebook's or Apple's photo tagging application -> recognizes your friends
 - the spam filter in your email
 - web click data, also called clickstream data, from Silicon Valley companies
Moreover:
 - applications that can't be programmed by hand
 - e.g., natural language processing
 - handwriting recognition
 - autonomous helicopter

 


https://www.computerworld.com/article/2542247/12-it-skills-that-employers-can-t-say-no-to.html

 

12 IT skills that employers can't say no to


The first skill on the list: Machine Learning

A few months ago, a student showed me an article on the top twelve IT skills: the skills that information technology hiring managers cannot say no to. It was a slightly older article, but at the top of this list of the twelve most desirable IT skills was machine learning. Here at Stanford, the number of recruiters that contact me asking if I know any graduating machine learning students is far larger than the number of machine learning students we graduate each year. So I think there is a vast, unfulfilled demand for this skill set, and this is a great time to be learning about machine learning.

 

 

But what he (Arthur Samuel) did was have the program play maybe tens of thousands of games against itself, and by watching what sorts of board positions tended to lead to wins and what sorts of board positions tended to lead to losses, the checkers-playing program learned over time what are good board positions and what are bad board positions.

What is Machine Learning?

Two definitions of Machine Learning are offered. Arthur Samuel described it as: "the field of study that gives computers the ability to learn without being explicitly programmed." This is an older, informal definition.

Tom Mitchell provides a more modern definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

 

Example: playing checkers.

E = the experience of playing many games of checkers

T = the task of playing checkers.

P = the probability that the program will win the next game.
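As a rough illustration of "learning from experience E" in the checkers example, here is a minimal Python sketch that tallies how often each board position shows up in games that ended in a win. This is only an assumption about how such a tally could look, not Samuel's actual program: the position labels, the game records, and the function name `learn_position_values` are all made up for the example.

```python
from collections import defaultdict

def learn_position_values(games):
    """Estimate how good each position is from finished games (experience E).

    `games` is a list of (positions_seen, won) pairs. The value of a position
    is the fraction of games containing it that ended in a win.
    """
    seen = defaultdict(int)
    wins = defaultdict(int)
    for positions, won in games:
        for pos in set(positions):  # count each position once per game
            seen[pos] += 1
            if won:
                wins[pos] += 1
    return {pos: wins[pos] / seen[pos] for pos in seen}

# Hypothetical self-play records; the position names are placeholders.
games = [
    (["center", "king_row"], True),
    (["center", "edge"], True),
    (["edge", "corner"], False),
    (["corner"], False),
    (["center", "corner"], True),
]

values = learn_position_values(games)

# Rank positions from best to worst; ties are broken alphabetically.
ranked = sorted(values, key=lambda p: (-values[p], p))
print(ranked)
```

The more games are recorded, the more reliable these win fractions become, which is the sense in which performance at the task T can improve with experience E.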

In general, any machine learning problem can be assigned to one of two broad classifications:

supervised learning, OR

unsupervised learning.

 

Quiz: what is task T?
(I got this one wrong, haha.)
There are several different types of learning algorithms. The main two types are what we call supervised learning and unsupervised learning. You might also hear other buzz terms such as reinforcement learning and recommender systems. These are other types of machine learning algorithms that we'll talk about later.
So, given this data, let's say you have a friend who owns a house that is, say, 750 square feet, and they are hoping to sell the house, and they want to know how much they can get for the house. ... For example, instead of fitting a straight line to the data, we might decide that it's better to fit a quadratic function, or a second-order polynomial, to this data. If you do that and make a prediction here, then it looks like, well, maybe they can sell the house for closer to $200,000. ... The term Supervised Learning refers to the fact that we gave the algorithm a data set in which the so-called "right answers" were given.
The term classification refers to the fact that here we're trying to predict a discrete-valued output: zero or one, malignant or benign. It turns out that in classification problems, sometimes you can have more than two possible values for the output.
So, given a data set like this, what the learning algorithm might do is fit a straight line to the data to try to separate out the malignant tumors from the benign ones, and so the learning algorithm may decide to put a straight line like that to separate out the two classes of tumors.
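To make the "straight line that separates the two classes" idea concrete, here is a minimal Python sketch. It does not use the algorithm from the lecture. It uses the simple nearest-centroid rule instead, whose decision boundary is the straight line halfway between the two class averages. The two features (tumor size and patient age) and all the numbers are assumptions made up for the example.

```python
def centroid(points):
    """Mean of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def fit_linear_boundary(negatives, positives):
    """Return (w, b) so that w[0]*x + w[1]*y + b > 0 means 'positive'.

    The boundary is the perpendicular bisector of the segment joining the
    two class centroids, which is what the nearest-centroid rule reduces to.
    """
    c0 = centroid(negatives)
    c1 = centroid(positives)
    w = (c1[0] - c0[0], c1[1] - c0[1])
    mid = ((c0[0] + c1[0]) / 2, (c0[1] + c1[1]) / 2)
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b

def predict(w, b, point):
    """Discrete output: 1 = malignant, 0 = benign."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else 0

# Hypothetical (tumor size, patient age) examples with known labels.
benign = [(1.0, 30), (1.5, 35), (2.0, 40)]
malignant = [(4.0, 55), (4.5, 60), (5.0, 65)]

w, b = fit_linear_boundary(benign, malignant)
print(predict(w, b, (4.2, 58)))  # a new, unlabeled tumor
print(predict(w, b, (1.2, 33)))
```

The labeled lists are the "right answers" given to the algorithm, and the learned line is then used to classify tumors it has not seen.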
So, how do you deal with an infinite number of features? How do you even store an infinite number of things in the computer when your computer is going to run out of memory? It turns out that when we talk about an algorithm called the Support Vector Machine, there will be a neat mathematical trick that will allow a computer to deal with an infinite number of features.
Back then, recall data sets that look like this, where each example was labeled either as a positive or a negative example, that is, whether it was a benign or a malignant tumor. In Supervised Learning, we were told explicitly what is the so-called right answer, whether it's benign or malignant.

Supervised Learning

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories. Here is a description on Math is Fun on Continuous and Discrete Data.

Example 1:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.
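The difference between the two framings in Example 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the training data is made up, the model is an ordinary least-squares straight line with a single feature, and the helper names (`fit_line`, `predict_price`, `sells_above_asking`) are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training set: size in square feet -> sale price in $1000s.
sizes = [500, 1000, 1500, 2000]
prices = [100, 200, 300, 400]

slope, intercept = fit_line(sizes, prices)

def predict_price(size):
    """Regression: the output is a continuous number."""
    return slope * size + intercept

def sells_above_asking(size, asking_price):
    """Classification: the output is one of two discrete labels (1 or 0)."""
    return 1 if predict_price(size) > asking_price else 0

print(predict_price(750))            # a price, in $1000s
print(sells_above_asking(750, 140))  # a category, 0 or 1
```

The same fitted line is used in both cases. Only the form of the output changes, from a real number to one of two categories.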

Example 2:

(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture.

(b) Classification - Given a picture of a person, we have to predict whether they are of high school, college, or graduate age. Another example of classification - banks have to decide whether or not to give a loan to someone on the basis of their credit history.

For problem one, I would treat this as a regression problem because if I have thousands of items, well, I would probably just treat this as a real value, as a continuous value, and therefore treat the number of items I sell as a continuous value. For the second problem, I would treat that as a classification problem, because I might set the value I want to predict to zero to denote that the account has not been hacked, and to one to denote an account that has been hacked into. So, just like the breast cancer example, where zero is benign and one is malignant, I might set this to be zero or one depending on whether it's been hacked, and have an algorithm try to predict each one of these two discrete values.
Can you find some structure in the data? Given this data set, an Unsupervised Learning algorithm might decide that the data lives in two different clusters. And so there's one cluster ... group your customers into different market segments so that you can automatically and more efficiently sell or market to your different market segments together? All of these are examples of clustering, which is just one type of Unsupervised Learning.
So what we can do is take these two microphone recordings and give them to an Unsupervised Learning algorithm called the cocktail party algorithm, and tell the algorithm: find structure in this data for me. And what the algorithm will do is listen to these audio recordings and say, you know, it sounds like two audio sources are being added together, or have been summed together, to produce these recordings that we had. Moreover, what the cocktail party algorithm will do is separate out these two audio sources that were being added or summed together to form the other recordings and, in fact, here's the first output of the cocktail party algorithm.
It turns out that the algorithm to do what you just heard can be done with one line of code (shown on the lecture slide). So this is also why in this class we're going to use the Octave programming environment. Octave is free open source software, and using a tool like Octave or Matlab, many learning algorithms become just a few lines of code to implement. Later in this class, I'll teach you a little bit about how to use Octave and you'll be implementing some of these algorithms in Octave. Or if you have Matlab you can use that too. If you were trying to do this in C++ or Java, this would be many, many lines of code linking complex C++ or Java libraries. So, you can implement this stuff in C++ or Java or Python; it's just much more complicated to do so in those languages.
The news story example is exactly the Google News example that we saw in this video: we saw how you can use a clustering algorithm to cluster these articles together, so that's Unsupervised Learning. The market segmentation example I talked about a little bit earlier can also be done as an Unsupervised Learning problem, because I am just going to give my algorithm data and ask it to discover market segments automatically.

Unsupervised Learning

Unsupervised learning, on the other hand, allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results, i.e., there is no teacher to correct you.

Example:

Clustering: Take a collection of 1000 essays written on the US Economy, and find a way to automatically group these essays into a small number of groups that are somehow similar or related by different variables, such as word frequency, sentence length, page count, and so on.

Non-clustering: The "Cocktail Party Algorithm", which can find structure in messy data, such as the identification of individual voices and music from a mesh of sounds at a cocktail party (https://en.wikipedia.org/wiki/Cocktail_party_effect). Here is an answer on Quora to enhance your understanding: https://www.quora.com/What-is-the-difference-between-supervised-and-unsupervised-learning-algorithms
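For the clustering example above, a minimal Python sketch of one common clustering method, k-means (Lloyd's algorithm), is shown here. The notes do not say which clustering algorithm to use, so k-means is an assumption, and so are the two essay features (average sentence length and page count) and all the numbers. No labels are given to the algorithm. It only sees the feature values.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples.

    The initial centers are the first k points, so the result is deterministic.
    """
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda j: sum((a - c) ** 2 for a, c in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical essay features: (average sentence length, page count).
essays = [(12, 3), (30, 10), (14, 4), (28, 12), (13, 3), (31, 11)]

labels, centers = kmeans(essays, k=2)
print(labels)   # group index assigned to each essay
print(centers)  # the average feature values of each group
```

Nobody tells the algorithm which essays belong together. The groups come only from how close the feature values are to each other, which is the "no teacher" property described above.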

1. 3rd 2. classification

 

3. classification
Wrong: what is the answer? I picked 2 and 4.
5. 3rd

 
