Wednesday, February 8, 2012

machine learning lecture 1


I hope you can already tell that machine learning is a highly interdisciplinary topic, in which just the TAs alone apply learning algorithms to problems in computer vision, biology, robotics, and language. And machine learning is one of those things that is having a large impact on many applications.
So just in my own daily work, I actually frequently end up talking to people ranging from helicopter pilots to biologists to people in computer systems or databases to economists, and there's sort of an unending stream of people from industry coming to Stanford interested in applying machine learning methods to their own problems.
So yeah, this is fun. A couple of weeks ago, a student actually forwarded to me an article in "Computer World" about the 12 IT skills that employers can't say no to. So it's about sort of the 12 most desirable skills in all of information technology, and topping the list was actually machine learning. So I think this is a good time to be learning this stuff, and learning algorithms are having a large impact on many segments of science and industry.
Learning algorithms are one of the things that touch many areas of science and industry.

So let's say a few more words about machine learning. I feel that machine learning grew out of early work in AI, early work in artificial intelligence. And over the last, I wanna say, 15 or 20 years or so, it's been viewed as a sort of growing new capability for computers. And in particular, it turns out that there are many programs, many applications, that you can't program by hand.

For example, if you want to get a computer to read handwritten characters, to read sort of handwritten digits, it actually turns out to be amazingly difficult to write a piece of software that takes this input, an image of something that I wrote, and figures out just what it is, that extracts the characters I wrote out in longhand. And other things: One thing that my students and I do is autonomous flight. It turns out to be extremely difficult to sit down and write a program to fly a helicopter.

But in contrast, if you want to do things like get software to fly a helicopter or have software recognize handwritten digits, one very successful approach is to use a learning algorithm and have a computer learn by itself how to, say, recognize your handwriting. And in fact, for handwritten digit recognition, this is pretty much the only approach that works well. Learning algorithms get used in exactly these applications that are hard to program by hand.

Learning algorithms have also made, I guess, significant inroads in what's sometimes called database mining. So, for example, with the growth of IT and computers, increasingly many hospitals are keeping around medical records: which patients they saw, what problems they had, what the prognosis was, what the outcome was. Taking all of these medical records, which started to be digitized only about 15 years ago, and applying learning algorithms to them can turn raw medical records into what I might loosely call medical knowledge, in which we start to detect trends in medical practice and even start to alter medical practice as a result of knowledge derived by applying learning algorithms to the sorts of medical records that hospitals have been building up, in electronic format, over the last 15 or 20 years.

It turns out that most of you probably use learning algorithms — I don't know — half a dozen times a day or maybe a dozen times a day or more, often without knowing it. So, for example, every time you send mail via the US Postal System, there's an algorithm that tries to automatically read the zip code you wrote on your envelope, and that's done by a learning algorithm. So every time you send US mail, you are using a learning algorithm, perhaps without even being aware of it.

Similarly, every time you write a check, I actually don't know the number for this, but a significant fraction of checks that you write are processed by a learning algorithm that's learned to read the digits, the dollar amount, that you wrote down on your check. So every time you write a check, there's another learning algorithm that you're probably using without even being aware of it.

If you use a credit card, or I know at least one phone company was doing this, and lots of companies like eBay as well that do electronic transactions, there's a good chance that there's a learning algorithm in the background trying to figure out if, say, your credit card's been stolen or if someone's engaging in a fraudulent transaction.

If you use a website like Amazon or Netflix that will often recommend books for you to buy or movies for you to rent or whatever, these are other examples of learning algorithms that have learned what sorts of things you like to buy or what sorts of movies you like to watch and can therefore give customized recommendations to you.

Just about a week ago, I had my car serviced, and even there, my car mechanic was trying to explain to me some learning algorithm in the innards of my car that's sort of doing its best to optimize my driving performance for fuel efficiency or something.

So, see, most of us use learning algorithms half a dozen, a dozen, maybe dozens of times a day without even knowing it.

And of course, learning algorithms are also doing things like giving us a growing understanding of the human genome. So if someday we ever find a cure for cancer, I bet learning algorithms will have had a large role in that. That's sort of the thing that Tom works on, yes?

So in teaching this class, I sort of have three goals. One of them is just to, I hope, convey some of my own excitement about machine learning to you.

The second goal is by the end of this class, I hope all of you will be able to apply state-of-the-art machine learning algorithms to whatever problems you're interested in. And if you ever need to build a system for reading zip codes, you'll know how to do that by the end of this class.

And lastly, I realize that only a subset of you are interested in doing research in machine learning, but by the end of this class, I hope that all of you will actually be well qualified to start doing research in machine learning, okay?

So let's say a few words about logistics. The prerequisites of this class are written on one of the handouts and are as follows: In this class, I'm going to assume that all of you have sort of basic knowledge of computer science and of basic computer skills and principles. So I assume all of you know what big-O notation is, that all of you know about data structures like queues, stacks, and binary trees, and that all of you have enough programming skills to, like, write a simple computer program. And it turns out that most of this class will not be very programming intensive, although we will do some programming, mostly in either MATLAB or Octave. I'll say a bit more about that later.

I also assume familiarity with basic probability and statistics. So most undergraduate statistics classes, like Stat 116 taught here at Stanford, will be more than enough. I'm gonna assume all of you know what random variables are, that all of you know what expectation is, what the variance of a random variable is. And in case it's been a while since some of you have seen this material, at some of the discussion sections we'll actually go over some of the prerequisites, sort of as a refresher course. I'll say a bit more about that later as well.
Lastly, I also assume familiarity with basic linear algebra. And again, most undergraduate linear algebra courses are more than enough. So if you've taken courses like Math 51, 103, Math 113 or CS205 at Stanford, that would be more than enough. Basically, I'm gonna assume that all of you know what matrices and vectors are, that you know how to multiply matrices and vectors, and multiply matrices together, and that you know what a matrix inverse is. If you know what an eigenvector of a matrix is, that'd be even better. But if you don't quite know or if you're not quite sure, that's fine, too. We'll go over it in the review sections.

So there are a couple more logistical things I should deal with in this class. One is that, as most of you know, CS229 is a televised class. And in fact, I guess many of you are probably watching this at home on TV, so I'm gonna say hi to our home viewers.
So earlier this year, I approached SCPD, which televises these classes, about trying to make a small number of Stanford classes publicly available by posting the videos on the web. And so this year, Stanford is actually starting a small pilot program in which we'll post videos of a small number of classes online, on the Internet, in a way that makes them publicly accessible to everyone. I'm very excited about that, because machine learning is cool, so let's get the word out there.

One of the consequences of this is that — let's see — videos or pictures of the students in this classroom will not be posted online, so don't worry about seeing your own face appear on YouTube one day. But the microphones may pick up your voices, so I guess the consequence of that is: no matter how irritated you are at me, don't yell out swear words in the middle of class. But because there won't be video of you, you can safely sit there and make faces at me, and that won't show, okay?

Let's see. I also handed out two handouts, which I hope most of you have, including the course information handout. So let me just say a few words about parts of these. On the third page, there's a section that says Online Resources.

Oh, okay. Louder? Actually, could you turn up the volume? Testing. Is this better? Testing, testing. Okay, cool. Thanks.

So all right, online resources. The class has a home page; it's on the handouts, so I won't write it on the chalkboard — http://cs229.stanford.edu. And when there are homework assignments or things like that, in the interest of saving trees, we will usually not give out many handouts in class. So homework assignments and homework solutions will be posted online at the course home page.

For this class, I've also written, and I guess revise every year, a set of fairly detailed lecture notes that cover the technical content of this class. And so if you visit the course homepage, you'll also find the detailed lecture notes that go over in detail all the math and equations and so on that I'll be doing in class.

There's also a newsgroup, su.class.cs229, also written on the handout. This is a newsgroup that's sort of a forum for people in the class to get to know each other and have whatever discussions you want to have amongst yourselves. The class newsgroup will not be monitored by the TAs and me; it's a place for you to form study groups or find project partners or discuss homework problems and so on. So feel free to talk trash about this class there.

If you want to contact the teaching staff, please use the email address written down here, cs229-qa@cs.stanford.edu. This goes to an account that's read by all the TAs and me. So rather than sending us email individually, if you send email to this account, it will actually let us get back to you maximally quickly with answers to your questions.

If you're asking questions about homework problems, please say in the subject line which assignment and which question the email refers to, since that will help us route your question to the appropriate TA or to me and get the response back to you quickly.

Let's see. Skipping ahead — let's see — there are four homeworks, one midterm, and one open-ended term project. And there's a note on the honor code. So one thing that I think will help you to succeed and do well in this class, and even help you to enjoy this class more, is if you form a study group.

So start looking around where you're sitting now or at the end of class today, mingle a little bit and get to know your classmates. I strongly encourage you to form study groups and sort of have a group of people to study with and have a group of your fellow students to talk over these concepts with. You can also post on the class newsgroup if you want to use that to try to form a study group.

But some of the problem sets in this class are reasonably difficult. People that have taken the class before may tell you they were very difficult. And I bet it would be more fun for you, and you'd probably have a better learning experience, if you form a study group of people to work with. So I definitely encourage you to do that.

And just to say a word on the honor code: I definitely encourage you to form a study group and work together and discuss homework problems together. But if you discuss homework problems with other students, then I'll ask you to sort of go home and write down your own solutions independently, without referring to notes that were taken in any of your joint study sessions.

So in other words, when you turn in a homework problem, what you turn in should be something that was reconstructed independently by yourself and without referring to notes that you took during your study sessions with other people, okay? And obviously, showing your solutions to others or copying other solutions directly is right out.

We occasionally also reuse problem set questions from previous years so that the problems are a bit more debugged and work more smoothly. And as a result of that, I also ask you not to look at solutions from previous years, and this includes both sort of official solutions that we've given out to previous generations of this class and previous solutions that people that have taken this class in previous years may have written out by themselves, okay?

Sadly, in previous years, there have often been a few honor code violations in this class. And last year, I think I prosecuted five honor code violations, which I think is a ridiculously large number. So just don't copy or look at others' solutions, and hopefully there'll be zero honor code violations this year. I'd love for that to happen.

There's a section here on the late homework policy; if you ever want to hand in a homework late, I'll leave you to read that yourself.

We also have a midterm, which is scheduled for Thursday, 8th of November at 6:00 p.m., so please keep that evening free.

And let's see. One more administrative thing I wanted to mention is the class project. So part of the goal of this class is to leave you well equipped to apply machine learning algorithms to a problem or to do research in machine learning. And so as part of this class, I'll ask you to execute a small research project, sort of as a term project.

And what most students do for this is either apply machine learning to a problem that they find interesting or investigate some aspect of machine learning. So to those of you that are already doing research, or to those of you who are in industry and taking this from a company, one fantastic way to do a class project would be to apply machine learning algorithms to a problem that you're interested in, a problem that you're already working on, whether it's a science research problem or a problem in industry where you're trying to get a system to work using a learning algorithm.

To those of you that are not currently doing research, one great way to do a project would be to just pick a problem that you care about, a problem that you find interesting, and apply learning algorithms to it and play with the ideas and see what happens.

And let's see. Oh, and the goal of the project should really be for you to do a publishable piece of research in machine learning, okay?

And if you go to the course website, you'll actually find a list of the projects that students had done last year. And so I'm holding the list in my hand. You can go home later and take a look at it online.

But reading down this list, I see that last year, there were students that applied learning algorithms to control a snake robot. There were a few projects on improving learning algorithms. There was a project on flying autonomous aircraft. There was a project actually done by our TA Paul on improving computer vision algorithms using machine learning.

There were a couple of projects on Netflix rankings using learning algorithms; a few on medical robots; one on segmenting pieces of the body using learning algorithms; one on musical instrument detection; another on RNA sequence alignment; and quite a few projects on neuroscience, on understanding the brain, including a couple on understanding fMRI data, on brain scans, and so on; another project on market making, on financial trading. There was an interesting project on trying to use learning algorithms to decide what it is that makes a person's face physically attractive. There was a project on learning algorithms for optical illusions, and so on.


So let's start by talking about what machine learning is. What is machine learning? Actually, can you read the text out there? Raise your hand if the text on the small screens is legible. Oh, okay, cool, mostly legible. Okay. So I'll just read it out.

So what is machine learning? Way back in about 1959, Arthur Samuel defined machine learning informally as the field of study that gives computers the ability to learn without being explicitly programmed. So Arthur Samuel, way back in the history of machine learning, actually did something very cool, which was he wrote a checkers program which would play games of checkers against itself.

And so because a computer can play thousands of games against itself relatively quickly, Arthur Samuel had his program play thousands of games against itself, and over time it would start to learn to recognize patterns which led to wins and patterns which led to losses. So over time it learned things like that, "Gee, if I get a lot of pieces taken by the opponent, then I'm more likely to lose than win," or, "Gee, if I get my pieces into a certain position, then I'm especially likely to win rather than lose."

And so over time, Arthur Samuel had a checkers program that would actually learn to play checkers by learning what are the sort of board positions that tend to be associated with wins and what are the board positions that tend to be associated with losses. And way back around 1959, the amazing thing about this was that his program actually learned to play checkers much better than Arthur Samuel himself could.

So even today, there are some people that say, well, computers can't do anything that they're not explicitly programmed to. And Arthur Samuel's checkers program was maybe the first I think really convincing refutation of this claim. Namely, Arthur Samuel managed to write a checkers program that could play checkers much better than he personally could, and this is an instance of maybe computers learning to do things that they were not programmed explicitly to do.

Here's a more recent, more modern, more formal definition of machine learning, due to Tom Mitchell, who defines a well-posed learning problem as follows: He says that a computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. Okay. So not only is it a definition, it even rhymes.

So, for example, in the case of checkers, the experience E that a program has would be the experience of playing lots of games of checkers against itself, say. The task T is the task of playing checkers, and the performance measure P will be something like the fraction of games it wins against a certain set of human opponents. And by this definition, we'll say that Arthur Samuel's checkers program has learned to play checkers, okay?

So as an overview of what we're going to do in this class, this class is sort of organized into four major sections. We're gonna talk about four major topics in this class, the first of which is supervised learning. So let me give you an example of that.

So suppose you collect a data set of housing prices. One of the TAs, Dan Ramage, actually collected a data set for me last week to use in an example later. But suppose that you go and collect statistics about how much houses cost in a certain geographic area. And Dan, the TA, collected data on housing prices in Portland, Oregon. So what you can do is, let's say, plot the square footage of each house against its list price, right? So you collect data on a bunch of houses, and let's say you get a data set like this, with houses of different sizes that are listed for different amounts of money.

Now, let's say that I'm trying to sell a house in the same area of Portland, Oregon that the data comes from. Let's say I have a house of a certain size in square footage, and I want an algorithm to tell me about how much I should expect my house to sell for. So there are lots of ways to do this, and some of you may have seen elements of what I'm about to say before.

So one thing you could do is look at this data and maybe fit a straight line to it. And then if this is my house, you may then look at the straight line and predict that my house is gonna go for about that much money, right? There are other decisions that we can make, which we'll talk about later, which is, well, what if I don't wanna fit a straight line? Maybe I should fit a quadratic function to it. Maybe that fits the data a little bit better. You notice that if you do that, the predicted price of my house goes up a bit, so that'd be nice.
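Since we'll be doing most of our programming in MATLAB or Octave anyway, here's a minimal sketch of exactly that idea. The numbers below are made up for illustration; they are not Dan's actual Portland data set.

    % Hypothetical housing data: square footage vs. list price (in $1000s).
    x = [1000 1500 1800 2100 2500 3000];   % square feet
    y = [ 200  280  320  360  420  500];   % price, $1000s

    p1 = polyfit(x, y, 1);   % fit a straight line
    p2 = polyfit(x, y, 2);   % fit a quadratic

    % Predict the price of a hypothetical 2200-square-foot house.
    price_linear    = polyval(p1, 2200)
    price_quadratic = polyval(p2, 2200)

The only decision the two fits differ on is the shape of the curve; both are learned from the same data, which is the point of the straight-line-versus-quadratic discussion above.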

And this sort of learning problem, of learning to predict housing prices, is an example of what's called a supervised learning problem. And the reason that it's called supervised learning is because we're providing the algorithm a data set with a bunch of square footages, a bunch of housing sizes, as well as sort of the right answer, the actual prices, for a number of houses, right?

So we call this supervised learning because we're supervising the algorithm or, in other words, we're giving the algorithm the, quote, right answer for a number of houses. And then we want the algorithm to learn the association between the inputs and the outputs and to sort of give us more of the right answers, okay?

It turns out this specific example that I drew here is an example of something called a regression problem. And the term regression sort of refers to the fact that the variable you're trying to predict, the price, is a continuous value.

There's another class of supervised learning problems which we'll talk about, which are classification problems. In a classification problem, the variable you're trying to predict is discrete rather than continuous. As one specific example (there's actually a standard data set you can download online that lots of machine learning people have played with), let's say you collect a data set on breast cancer tumors, and you want an algorithm to learn to predict whether or not a certain tumor is malignant. Malignant is the opposite of benign, right, so a malignancy is a sort of harmful, bad tumor. So we collect some number of features, some number of properties of these tumors, and for the sake of having a simple explanation, let's just say that we're going to look at the size of the tumor, and depending on the size of the tumor, we'll try to figure out whether or not the tumor is malignant or benign.

So the tumor is either malignant or benign, and so the variable on the Y axis is either zero or 1, and so your data set may look something like that, right? And that's 1 and that's zero, okay? And so this is an example of a classification problem, where the variable you're trying to predict is a discrete value: it's either zero or 1.

And in fact, more generally, there will be many learning problems where we'll have more than one input variable, more than one input feature, and we'll use more than one variable to try to predict, say, whether a tumor is malignant or benign. So, for example, continuing with this, you may instead have a data set that looks like this. I'm gonna plot this data set in a slightly different way now. And I'm making this data set look much cleaner than it really is in reality, for illustration, okay?

For example, maybe the crosses indicate malignant tumors and the "O"s may indicate benign tumors. And so you may have a data set comprising patients of different ages and who have different tumor sizes and where a cross indicates a malignant tumor, and an "O" indicates a benign tumor. And you may want an algorithm to learn to predict, given a new patient, whether their tumor is malignant or benign.

So, for example, what a learning algorithm may do is come in and decide that a straight line like that separates the two classes of tumors really well, and so if you have a new patient whose age and tumor size fall over there, then the algorithm may predict that the tumor is benign rather than malignant, okay? So this is just another example of a supervised learning problem, and another classification problem.
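One standard way to learn that sort of straight-line class boundary is logistic regression, which we'll cover properly later in the quarter. Here's a minimal Octave/MATLAB sketch; the patient data, labels, and learning rate are all invented for illustration.

    % Features: [age in decades, tumor size in cm]; labels: 1 = malignant, 0 = benign.
    X = [4.5 1.2; 6.2 3.5; 3.8 0.8; 7.0 4.1; 5.5 2.9; 3.0 0.5];
    y = [0; 1; 0; 1; 1; 0];

    X = [ones(size(X, 1), 1) X];     % add an intercept term
    theta = zeros(3, 1);
    alpha = 0.1;                     % learning rate

    for iter = 1:5000
        h = 1 ./ (1 + exp(-X * theta));          % sigmoid hypothesis
        theta = theta + alpha * X' * (y - h);    % gradient ascent on the log-likelihood
    end

    % The learned boundary is the straight line where theta' * [1; age; size] = 0.
    % Predicted probability of malignancy for a new patient (age 50, 2 cm tumor):
    p = 1 / (1 + exp(-[1 5.0 2.0] * theta))

The decision boundary this learns is exactly a straight line in the age/tumor-size plane, which is the picture being described above.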

And so it turns out that one of the issues we'll talk about later in this class is that in this specific example, we're trying to predict whether a tumor is malignant or benign based on two features, or two inputs, namely the age of the patient and the tumor size. It turns out that when you look at a real data set, you find that learning algorithms often use other sets of features. In the breast cancer data example, you'd also use properties of the tumors like clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion and so on, so various other medical properties.

And one of the most interesting things we'll talk about later this quarter is: what if your data doesn't lie in a two-dimensional or three-dimensional or even a finite-dimensional space? What if your data actually lies in an infinite-dimensional space? Our plots here are in two-dimensional space; I can't plot you an infinite-dimensional space, right? And so it turns out that one of the most successful classes of machine learning algorithms, called support vector machines, actually takes data and maps it to an infinite-dimensional space, and then does classification using not two features like I've done here, but an infinite number of features.

And that will actually be one of the most fascinating things we talk about when we go deeply into classification algorithms. And it's actually an interesting question, right: think about how you would even represent an infinite-dimensional vector in computer memory. You don't have an infinite amount of computer memory. How do you even represent a point that lies in an infinite-dimensional space? We'll talk about that when we get to support vector machines, okay?
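To give away a little of the punch line: you never store the infinite-dimensional feature vector phi(x) itself. The algorithm only ever needs inner products between mapped points, and certain kernel functions compute those inner products directly. As a small sketch (the Gaussian kernel below is one standard choice that corresponds to an infinite-dimensional feature mapping; the numbers are arbitrary):

    % Two points in ordinary 2-D space, and a kernel bandwidth.
    x = [1; 2];
    z = [1.5; 1.0];
    sigma = 1.0;

    % Gaussian (RBF) kernel: equals the inner product <phi(x), phi(z)>
    % for an infinite-dimensional feature mapping phi, computed without
    % ever writing phi(x) down.
    K = exp(-sum((x - z).^2) / (2 * sigma^2))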

So let's see. So that was supervised learning. The second of the four major topics of this class will be learning theory. So I have a friend who teaches math at a different university, not at Stanford, and when you talk to him about his work and what he's really out to do, this friend of mine will — he's a math professor, right? — this friend of mine will sort of get the look of wonder in his eyes, and he'll tell you about how in his mathematical work, he feels like he's discovering truth and beauty in the universe. And he says it in sort of a really touching, sincere way, and then he has this — you can see it in his eyes — he has this deep appreciation of the truth and beauty in the universe as revealed to him by the math he does.

In this class, I'm not gonna do any truth and beauty. In this class, I'm gonna talk about learning theory to try to convey to you an understanding of how and why learning algorithms work so that we can apply these learning algorithms as effectively as possible.

So, for example, it turns out you can prove surprisingly deep theorems on when you can guarantee that a learning algorithm will work, all right? So think about a learning algorithm for reading zip codes. When can you prove a theorem guaranteeing that a learning algorithm will be at least 99.9 percent accurate on reading zip codes? This is actually somewhat surprising. We actually prove theorems showing when you can expect that to hold.

We'll also sort of delve into learning theory to try to understand which algorithms can approximate different functions well, and also to try to understand things like how much training data you need. So how many examples of houses do I need in order for your learning algorithm to recognize the pattern between the square footage of a house and its housing price? And this will help us answer questions like: if you're trying to design a learning algorithm, should you be spending more time collecting more data, or is it the case that you already have enough data and it would be a waste of time to try to collect more? Okay?
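Just to give a flavor of what those theorems look like (this particular bound is a standard Hoeffding-style result for choosing among a finite set of k hypotheses, stated here as an illustration rather than exactly as it will appear in lecture): if you want every hypothesis's training error to be within epsilon of its true generalization error with probability at least 1 - delta, it suffices to have

    m \ge \frac{1}{2\epsilon^2} \log \frac{2k}{\delta}

training examples. Notice that m grows only logarithmically in the number of hypotheses k, which is part of why learning from a limited amount of data is feasible at all.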

So I think learning algorithms are a very powerful tool, and as I walk around sort of industry in Silicon Valley, or as I work with various businesses in CS and outside CS, I find that there's often a huge difference between how well someone who really understands this stuff can apply a learning algorithm versus someone who sort of gets it but sort of doesn't.

The analogy I like to think of is imagine you were going to a carpentry school instead of a machine learning class, right? If you go to a carpentry school, they can give you the tools of carpentry. They'll give you a hammer, a bunch of nails, a screwdriver or whatever. But a master carpenter will be able to use those tools far better than most of us in this room. I know a carpenter can do things with a hammer and nail that I couldn't possibly. And it's actually a little bit like that in machine learning, too. One thing that's sadly not taught in many courses on machine learning is how to take the tools of machine learning and really, really apply them well.

So in the same way, the tools of machine learning are, I wanna say, quite a bit more advanced than the tools of carpentry. Maybe a carpenter will disagree. But a large part of this class will be just giving you the raw tools of machine learning, just the algorithms and so on. But what I plan to do throughout this entire quarter, not just in the segment on learning theory, but actually as a theme running through everything I do this quarter, is to try to convey to you the skills to really take the learning algorithm ideas and get them to work on a problem.

It's sort of hard for me to stand here and say how big a deal that is, but when I walk around companies in Silicon Valley, it's completely not uncommon for me to see someone using some machine learning algorithm and then explaining to me what they've been doing for the last six months, and I go, oh, gee, it should have been obvious from the start that you've been wasting your time for the last six months, right?

And so my goal in this class, running through the entire quarter, not just on learning theory, is actually not only to give you the tools of machine learning, but to teach you how to use them well. And I've noticed this is something that really not many other classes teach. And this is something I'm really convinced is a huge deal, and so by the end of this class, I hope all of you will be master carpenters. I hope all of you will be really good at applying these learning algorithms and getting them to work amazingly well in many problems. Okay?

Let's see. So [inaudible] the board. After learning theory, there's another class of learning algorithms that I want to teach you about, and that's unsupervised learning. So you recall, right, a little earlier I drew an example like this, where you have a couple of features, a couple of input variables, and sort of malignant tumors and benign tumors or whatever. And that was an example of a supervised learning problem, because the data you have gives you the right answer for each of your patients: the data tells you this patient has a malignant tumor, this patient has a benign tumor. So it had the right answers, and you wanted the algorithm to just produce more of the same.

In contrast, in an unsupervised learning problem, this is the sort of data you get, okay? Where speaking loosely, you're given a data set, and I'm not gonna tell you what the right answer is on any of your data. I'm just gonna give you a data set and I'm gonna say, "Would you please find interesting structure in this data set?" So that's the unsupervised learning problem where you're sort of not given the right answer for everything.

So, for example, an algorithm may find structure in the data in the form of the data being partitioned into two clusters; clustering is sort of one example of an unsupervised learning problem.
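As a concrete sketch of what a clustering algorithm can look like, here's a minimal k-means loop in Octave/MATLAB. K-means is one standard clustering algorithm, not necessarily the one used in the examples that follow, and the two-blob data here is synthetic.

    % Synthetic data: two blobs of 50 points each in 2-D.
    X = [randn(50, 2); randn(50, 2) + 4];
    k = 2;
    mu = X(randperm(size(X, 1), k), :);   % initialize centroids at random data points
    c = zeros(size(X, 1), 1);

    for iter = 1:100
        % Step 1: assign each point to its nearest centroid.
        for i = 1:size(X, 1)
            [~, c(i)] = min(sum((mu - X(i, :)).^2, 2));
        end
        % Step 2: move each centroid to the mean of the points assigned to it.
        for j = 1:k
            mu(j, :) = mean(X(c == j, :), 1);
        end
    end

Notice there are no labels anywhere: the algorithm is never told the right answer, only asked to find structure, which is exactly the unsupervised setting described above.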

So I hope you can see this. It turns out that these sorts of unsupervised learning algorithms are used in many problems. This is a picture I got from Su-In Lee, who's a PhD student here, who is applying unsupervised learning algorithms to try to understand gene data, trying to look at genes and group them into clusters based on properties of how the genes respond to different experiments.

Another interesting application of these sorts of clustering algorithms is actually image processing; this picture I got from Steve Gould, who's another PhD student. It turns out that if you give this sort of data, say an image, to certain unsupervised learning algorithms, they will then learn to group pixels together and say, gee, these pixels seem to belong together, and those pixels seem to belong together.

And so the images you see on the bottom — I guess you can just barely see them on there — are what the algorithm has done to group certain pixels together. On a small display, it might be easier to just look at the image on the right. The two images on the bottom are two sort of identical visualizations of the same grouping of the pixels into regions.

And so it turns out that this sort of clustering algorithm, this sort of unsupervised learning algorithm which learns to group pixels together, turns out to be useful for many applications in computer vision and image processing.

I'll just show you one example, and this is a rather cool one that two students, Ashutosh Saxena and Min Sun here, did, which is: given an image like this (this is actually a picture taken of the Stanford campus), you can apply that sort of clustering algorithm and group the picture into regions. Let me actually blow that up so that you can see it more clearly. Okay. So in the middle, you see the lines sort of grouping the image into regions.

And what Ashutosh and Min did was they then applied a learning algorithm to ask: can we take this clustering and use it to build a 3D model of the world? And so using the clustering, they then had a learning algorithm try to learn what the 3D structure of the world looks like, so that they could come up with a 3D model that you can sort of fly through, okay? Many people used to think it's not possible to take a single image and build a 3D model, but using a learning algorithm, with that sort of clustering algorithm as the first step, they were able to.

I'll just show you one more example. I like this because it's a picture of Stanford, with our beautiful Stanford campus. So again, using the same sort of clustering algorithm, the same sort of unsupervised learning algorithm, you can group the pixels into different regions. And using that as a pre-processing step, they eventually built this sort of 3D model of the Stanford campus from a single picture. You can sort of walk into the scene, look around the campus. Okay? This actually turned out to be a mix of supervised and unsupervised learning, but the unsupervised learning, this sort of clustering, was the first step.

So it turns out these sorts of unsupervised clustering algorithms are routinely used for many different problems: things like organizing computing clusters; social network analysis; market segmentation, so if you're a marketer and you want to divide your market into different segments or different groups of people to market to separately; even astronomical data analysis and understanding how galaxies are formed. These are just a small sample of the applications of unsupervised learning algorithms and clustering algorithms that we'll talk about later in this class.

There's just one particularly cool example of an unsupervised learning algorithm that I want to tell you about. And to motivate that, I'm gonna tell you about what's called the cocktail party problem. Imagine that you're at some cocktail party with lots of people standing all around. And you know how it is, right: if you're at a large party where everyone's talking, it can sometimes be very hard to hear even the person in front of you. So imagine a large cocktail party with lots of people. The problem is: with all of these people talking, can you separate out the voice of just the person you're interested in talking to, from all this loud background noise?

So I'll show you a specific example in a second, but here's a cocktail party that's, I guess, rather sparsely attended, by just two people. What we're gonna do is put two microphones in the room, okay? And because the microphones are at slightly different distances from the two people, and the two people may speak at slightly different volumes, each microphone will pick up an overlapping combination of these two people's voices, slightly different overlapping voices. So Speaker 1's voice may be louder on Microphone 1, and Speaker 2's voice may be louder on Microphone 2, whatever.
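Stated a little more precisely (this is the standard formulation of the source separation problem, written out here for clarity rather than quoted from the lecture): if s_1(t) and s_2(t) are the two speakers' voices, each microphone records a linear mixture

    x_i(t) = a_{i1} s_1(t) + a_{i2} s_2(t), \qquad i = 1, 2,

or x(t) = A s(t) in matrix form, where the mixing matrix A depends on the distances and volumes. The unsupervised learning problem is to recover an unmixing matrix W that approximates the inverse of A, so that W x(t) gives back the separated voices, without ever being told A or the original sources.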

But the question is: given these microphone recordings, can you separate out the original speakers' voices? So I'm gonna play some audio clips that were collected by Te-Won Lee at UCSD. I'm gonna actually play for you the original raw microphone recordings from this cocktail party. So this is Microphone 1:



So it turns out that beyond solving the cocktail party problem, this specific class of unsupervised learning algorithms is also applied to a bunch of other problems, like text processing or understanding functional brain imaging data, like magnetoencephalogram (MEG) or EEG data. We'll talk about that more when we go on to describe ICA, or independent component analysis, algorithms, which is what you just saw.

And as an aside: this algorithm I just showed you seems like it must be a pretty complicated algorithm, right, to take these overlapping audio streams and separate them out? It sounds like a pretty complicated thing to do. So you might ask: how complicated is it, really, to implement an algorithm like this? It turns out that if you do it in MATLAB, you can do it in one line of code.

So I got this from Sam Roweis at the University of Toronto, and the example I showed you actually used a more complicated ICA algorithm than this. But nonetheless, I guess this is why, for this class, I'm going to ask you to do most of your programming in MATLAB and Octave: because if you try to implement the same algorithm in C or Java or something, I can tell you from personal, painful experience, you end up writing pages and pages of code rather than relatively few lines of code. I'll also mention that it did take researchers many, many years to come up with that one line of code, so this is not easy.
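For the curious, the one-liner in question is usually quoted in roughly the following form. I'm reconstructing it from memory of the slide, so treat it as illustrative rather than verbatim; here x is an n-by-m matrix holding n mixed microphone signals of m samples each.

    % One line of MATLAB that performs a simple form of ICA:
    [W, s, v] = svd((repmat(sum(x .* x, 1), size(x, 1), 1) .* x) * x');

The unmixing directions come out of the SVD of that reweighted, covariance-like matrix. The point isn't the details; it's that a few characters of MATLAB can stand in for pages of C.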

So that was unsupervised learning. The last of the four major topics I wanna tell you about is reinforcement learning. And this refers to problems where you don't just do one-shot decision-making. So, for example, in the supervised learning cancer prediction problem, you have a patient come in, you predict whether the cancer is malignant or benign, and then based on your prediction, maybe the patient lives or dies, and then that's it, right? You make a decision and then there's a consequence; you either got it right or wrong. In reinforcement learning problems, you are usually asked to make a sequence of decisions over time.

So, for example, this is something that my students and I work on. If I give you the keys to an autonomous helicopter — we actually have this helicopter here at Stanford — how do you write a program to make it fly, right? You notice that if you make a wrong decision on a helicopter, the consequence of crashing may not happen until much later. And in fact, usually you need to make a whole sequence of bad decisions to crash a helicopter. But conversely, you also need to make a whole sequence of good decisions in order to fly a helicopter really well.

So I'm gonna show you some fun videos of learning algorithms flying helicopters. This is a video of our helicopter at Stanford flying using a controller that was learned using a reinforcement learning algorithm. So this was done on the Stanford football field, and we'll zoom out the camera in a second. You'll sort of see the trees planted in the sky. So maybe this is one of the most difficult aerobatic maneuvers flown on any helicopter under computer control. And this controller, which is very, very hard for a human to sit down and write out, was learned using one of these reinforcement learning algorithms.

Just a word about that: The basic idea behind a reinforcement learning algorithm is this idea of what's called a reward function. The way to think about it is to imagine you're trying to train a dog. Every time your dog does something good, you say, "Good dog," and you reward the dog. Every time your dog does something bad, you go, "Bad dog," right? And hopefully, over time, your dog will learn to do the right things to get more of the positive rewards, to get more of the "Good dogs" and fewer of the "Bad dogs."

So the way we teach a helicopter to fly or any of these robots is sort of the same thing. Every time the helicopter crashes, we go, "Bad helicopter," and every time it does the right thing, we go, "Good helicopter," and over time it learns how to control itself so as to get more of these positive rewards.

So reinforcement learning is, I think of it as, a way for you to specify what you want done: you have to specify what is "good dog" and what is "bad dog" behavior. And then it's up to the learning algorithm to figure out how to maximize the "good dog" reward signals and minimize the "bad dog" punishments.
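To make the reward-function idea concrete, here's a toy sketch of one standard reinforcement learning algorithm, Q-learning, on a made-up five-state world. This is not the algorithm used for the helicopter; it's just about the smallest illustration of learning from rewards that fits in a few lines.

    % Toy world: states 1..5 on a line; actions 1 = step left, 2 = step right.
    % Reaching state 5 is "good dog" (+1 reward); everything else gives 0.
    nS = 5; nA = 2;
    Q = zeros(nS, nA);                 % estimated value of each state-action pair
    alpha = 0.1; gamma = 0.9; epsilon = 0.1;

    for episode = 1:500
        s = 1;
        while s ~= nS
            if rand < epsilon || ~any(Q(s, :))
                a = randi(nA);         % explore: random action (always, before learning)
            else
                [~, a] = max(Q(s, :)); % exploit: take the best-looking action
            end
            s2 = max(1, min(nS, s + 2*a - 3));  % a = 1 moves left, a = 2 moves right
            r = double(s2 == nS);               % "good dog" only at the goal
            Q(s, a) = Q(s, a) + alpha * (r + gamma * max(Q(s2, :)) - Q(s, a));
            s = s2;
        end
    end
    % After training, the greedy policy max(Q, [], 2) heads right toward the reward.

Nothing ever tells the agent which action is correct in each state; it only ever sees the reward signal, and the sequence of good decisions emerges from trying to maximize it.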

So it turns out reinforcement learning is applied to other problems in robotics. It's applied to things in web crawling and so on. But it's just cool to show videos, so let me just show a bunch of them. This learning algorithm was actually implemented by our head TA, Zico, for programming a four-legged dog. I guess Sam Shriver in this class also worked on the project, and Peter Renfrew and Mike and a few others. But I guess this really is a good dog/bad dog, since it's a robot dog.

The second video, on the right, shows some of the students, I guess Peter, Zico, and Tonca, working on a robotic snake, again using learning algorithms to teach a snake robot to climb over obstacles.

Below that, this is kind of a fun example. Ashutosh Saxena and Jeff Michaels used learning algorithms to teach a car how to drive at reasonably high speeds off roads avoiding obstacles.

And on the lower right, that's a robot programmed by PhD student Eva Roshen to teach a sort of somewhat strangely configured robot how to get on top of an obstacle, how to get over an obstacle. Sorry. I know the video's kind of small. I hope you can sort of see it. Okay?

So all of these are robots that I think are very difficult to hand-code a controller for, but by using these sorts of learning algorithms, you can in relatively short order get a robot to do often pretty amazing things.

Okay. So that was most of what I wanted to say today. Just a couple more last things, but let me just check what questions you have right now. So if there are no questions, I'll just close with two reminders: after class today, or as you start to talk with other people in this class, I encourage you again to try to find project partners to do your project with. And also, this is a good time to start forming study groups, so either talk to your friends or post in the newsgroup, but we just encourage you to try to start doing both of those today.