Placeholder Image

字幕列表 影片播放

  • Welcome back to Mining of Massive Datasets.

  • Today's topic is Recommender Systems.

  • We're going to start with an overview of recommendation systems, and

  • why they are necessary.

  • Today, we are going to look at the two most common types of Recommender Systems.

  • Content-Based Systems and Collaborative Filtering.

  • And finally we're going to look at how to evaluate Recommender Systems,

  • to make sure they're doing a good job.

  • Let's start with an overview.

  • Imagine any situation where a user interacts with

  • a really large catalog of items.

  • Now these items could be products at Amazon.

  • They could be movies at Netflix.

  • They could be music from Pandora's catalog.

  • Or they could be the news items on Google News.

  • What really matters, is that there are tens of thousands, or

  • hundreds of thousands, or millions of items.

  • A really large catalog.

  • And the user is interacting with this catalog.

  • There's two ways in which a user can interact with a large catalog of items.

  • The first is such, user knows what they're looking for, and they go and

  • they search the catalog for the precise item that they're looking for.

  • Now when you have a very large catalog of items,

  • very often the user doesn't know exactly what they're looking for.

  • And this is where recommendations come in.

  • The system recommends to the user certain items that they think the user will be

  • interested in, based on what they know about the user.

  • Now why do we really need such recommendations?

  • The key that made recommendations so important and

  • why a recommendation system developed so much in the last ten or 20 years.

  • If that he moved from an area scarcity to an area of abundance.

  • What do I mean by this?

  • Imagine that you were out shopping 20 years ago,

  • and you'd go to a local retailer, and

  • you'll find a certain number of products on the shelves of the local retailer.

  • Now, even in the really large retailer like, like a Wal Mart, for instance.

  • Shelf space is a key, is a scarce commodity.

  • It limits the number of items that a retailer can carry.

  • Shelf space is expensive, because it involves real estate costs.

  • And, therefore, a retailer can carry only a certain number of products.

  • Now, a similar situation applies in the case of, for example, TV networks.

  • A TV network can carry only so many shows, because there's only so

  • many hours in a day.

  • And there are only so

  • many movie theaters, so they can only ser, screen a certain number of movies.

  • Now once the internet was developed, things changed.

  • The web enables zero-cost dissemination of information about products.

  • And what this means, is that we can have many more products than ever before.

  • There is no shelf space limitation on the number of products.

  • That's why the number of products on Amazon is much,

  • much more than the number of products available at any physical retailer.

  • The number of you know, movies available on Netflix is more than the number of

  • movies that have been available, available at Blockbuster and so on.

  • This near-zero-cost dissemination of information gives rise to

  • a phenomenon that's called the long tail phenomenon.

  • Let's examine what this is.

  • Now imagine a graph, where on X-axis we've taken the items in the catalog.

  • Remember, items might be books, or music, or video, or news articles.

  • And we've ranked these items by popularity.

  • So the most popular items are on the left, and

  • as they move towards the right, the items become less and less popular.

  • What do I mean by popular?

  • Well, I mean the number of times the item is purchased in a week.

  • Or the number of times a movie is viewed in a week, or a month, or

  • some, some fixed time period.

  • Now, on the Y axis, you have the actual popularity, which in this case I've

  • shown as the number of purchases per week, it could be number of views per week, or

  • it could be number of, you know, plays per month, for some music, and so on.

  • So in general you have items ranked by popularity along the X axis,

  • and the popularity itself along the Y axis.

  • Now when you take items you know, in a large catalog.

  • And you rank them,

  • and you plot them on this curve you get a curve that looks like this.

  • You can see that the score, you know, has a very steep fall initially.

  • the, the, you know, you have a really, really, a few really,

  • really popular items.

  • And then as you move towards the right as the,

  • you know, as the item rank becomes greater the popularity falls off very steeply.

  • But at a certain point, you can see that this popularity stops, you know,

  • falls off less and less deeply.

  • And, you know, it quite reaches the X axis.

  • The interesting thing here, is that there is a cut off point.

  • The you know, items that are less popular than this cut off point.

  • You know, might be purchased perhaps just once a week.

  • Or maybe once a month.

  • If you're a physical retailer like a Wal-Mart, it's not economic to

  • stock this item, because the rent cost of stocking the item is more than you make,

  • when you sell the item.

  • And therefore a retailer,

  • any right thinking retailer doesn't stock items that are unpopular.

  • The, you know, they only stock the, the head of the distribution.

  • So there's this cutoff point that I show on this graph here and items that are more

  • popular then this, the, the more popular items are available at a retail store.

  • But the less popular items, the items that are to the right of the cut off point,

  • are not available at any retail store.

  • They're only available online.

  • Now this phenomenon applies to books, to music, to movie,

  • to videos to news articles for example, there are only so

  • many news articles in newspaper, but when you go online you can see the rest of

  • the news articles are less popular, news articles that are off to the right.

  • The piece of the curve, that is to the the piece of the curve here that is to

  • the right of this dividing line, is called the long tail.

  • These are the items that are available only online.

  • The interesting thing is, the, is this area under the curve here.

  • And you can see the area under the curve here is quite significant.

  • In fact, in some cases the area under the curve on the right is about as large, or

  • could be even larger than the area of the curve,

  • under the curve on the, on the left.

  • So you have all these items that could never be found in a physical store, but

  • that can be only found online.

  • But there are so many of them.

  • That it's very hard for any user to find all these items.

  • Right, so when you have the seed of abundance and

  • you have so many items and many of them are really found online.

  • How, you know, how do you introduce a user to all these new

  • items they may have not otherwise find?

  • When you have more choice like this, when you have these millions and

  • millions of items that are only available online, you need a better way for

  • the user to find all these items.

  • The user doesn't even know where to start looking, and

  • that's where recommendation engines come in.

  • So recommendation engines work in the case of many,

  • many kinds of items books, music, movies, news articles.

  • Interestingly, they even work in the case of people.

  • For example, when you go to Facebook, or LinkedIn, or

  • Twitter, there are so many people that you don't know who to follow or who to friend.

  • And so Facebook, or LinkedIn, or

  • Twitter make recommendations to you, on the people you could follow or friend.

  • I like this point with interesting anecdote that shows you

  • the power of a recommendation engine.

  • Several years ago a book was published called Touching the Void.

  • It's a book about mountaineering.

  • It's very, very good book.

  • The book came out, it didn't make much of a ripple.

  • You know, few people bought the book.

  • It got some decent reviews, but it never became a bestseller.

  • And then a few years after Touching the Void,

  • a new book was published on mountaineering called Into Thin Air.

  • Now Into Thin Air picked up traction,

  • and lots of people started buying Into Thin Air.

  • Amazon noticed that a few of people who bought Into Thin Air,

  • had also bought Touching the Void.

  • So they started recommending Touching the Void, to people who bought Into Thin Air.

  • And low and behold, those people started buying Touching the Void as well.

  • The interesting point is, this made Touching the Void a bestseller.

  • In fact, it became a bigger bestseller even than Into Thin Air,

  • even though a few years ago, it had sank without a trace.

  • So this example should show you the power of recommendation systems.

  • There are these items, these sort of gems like Touching the Void you know,

  • that people don't know because they don't know to look for them.

  • But a good recommendation system can expose people to these hidden gems,

  • that they wouldn't have known about otherwise.

  • So let's look at types of recommendation systems.

  • The simplest and the oldest kind of recommendation is editorial or

  • hand curated.

  • You might find a list of favorites for example when you go into your

  • favorite neighborhood book store, you might find staff picks.

  • Certain marked off as staff picks, right?

  • And these are editorial triangulated, and on certain websites you'll

  • see a list of staff favorites or a list of essential items.

  • These are essentially built by hand.

  • And another place where you'll see these editorial recommendations is often on

  • the homepages of websites.

  • For example if you go to the the homepage of most popular

  • websites including product websites you'll see editorial picks.

  • These are products that have been picked by the editorial staff to feature

  • on the home page.

  • The drawback with editorial on hand curated recommendation,

  • is that the, it's done entirely by you know, by the staff of the website, and

  • there's no input from the users of the site.

  • So, when you go beyond editorial recommendations,

  • the next simple thing that you can do is simple aggregates.

  • On many websites, you'll see lists of top ten, or

  • most popular, or most recent for example,

  • if you go to YouTube you can see the most popular videos, for instance, right.

  • So these are simple aggregates which sort of

  • take into account user activity to make recommendations to other users.

  • But these recommendations don't depend on the user they only depend on,

  • you know, the, the aggregate activity of a lot of other users.

  • The third and most interesting kind of recommendation to us

  • is recommendations that are tailored to individual users.

  • Right for example, book recommendations tailored to your taste, or

  • movie recommendations based on the movies that you watched previously.

  • Or music recommendation based on your music interests.

  • And this is our focus here recommendations that are tailored to individual users.

  • So lets look at a formal model.

  • Let C be a set of customers and S a set of items.

  • We will create a function called the utility function or a utility matrix.

  • The utility function is a function that looks at every pair of

  • customer and item, and maps it to a rating.

  • Okay.

  • R in this case is a set of ratings.

  • And for example R could be a star rating from one star to five star.

  • R could be a number between zero and ten.

  • In general, R is a totally ordered set so that,

  • you know, a lower value indicates that a user liked the product less, and

  • a higher value indicates that a user liked the product more.

  • Let's look at an example of a utility matrix.

  • On the top we have four movies here.

  • Avatar, Lord of the Rings, Matrix, and Pirates of the Caribbean.

  • And down here we have four users Alice, Bob, Carol, and David.

  • And the utility matrix gives you rating for certain movies and certain users.

  • For example Alice has rated Avatar and

  • Matrix, but not Lord of the Rings or Pirates of the Caribbean.

  • Whereas Carol has rated you know, has rated the same two movies.

  • Bob has rated Lord of the Rings and

  • Pirates but hasn't rated Avatar or, or Matrix.

  • Now it could be that these users have not scene these movies or

  • it could be that they've seen the movies, but not bother to rate them.

  • So in general, usually a matrix like this is going to be sparse.

  • You know, most of the users haven't seen most of the movies.

  • And there are going to be values in some of the you know, some of the locations.

  • The problem in recommendation systems is to figure out these unknown values.

  • For example you've seen that Alice is rated Avatar and

  • Matrix, but hasn't rated Lord Of The Rings.

  • So the question is can we figure out what Alice's rating for

  • Lord Of The Rings will be, based on her other ratings?

  • Can you figure out whether she likes Pirates or not?

  • Right. So this is the key problem for

  • recommended systems.

  • Once we find out for each user certain movies that they would have rated highly,

  • or the system thinks they might have rated highly,

  • then we can recommend those movies to those users.

  • So there are three key problems in the space of recommended systems.

  • The first is gathering the known ratings in the ratings in the matrix.

  • Now in the previous slide when you look at the utility matrix it was

  • already filled in with certain values.

  • But how do you get, gather those values in the first place?

  • So that's the key that's the first problem that you need to tackle.

  • The second problem is to extrapolate unknown ratings from known ratings.

  • But we're mainly interested in the high unknown ratings.

  • We are interested in those ratings where a user would've given

  • a high rating to a movie.

  • We are not interested in the average, or the low ratings because they're never

  • going to recommend those movies to the user.

  • And finally, the third key problem is evaluating extrapolation methods.

  • Once you have a recommendation system that can extrapolate unknown ratings from

  • known ratings, how do you know the recommended system is doing well?

  • This is where evaluation methodologies come in.

  • Let's start with the first problem, that of gathering ratings.

  • The first and

  • simplest way of gathering ratings is, is what I call explicit patterns.

  • Simply ask people to rate items.

  • Now this method is good because the,

  • you know you're asking people to directly rate items.

  • And you're going to get, you know, for ex,

  • and you can decide on what scale people are going to rate items.

  • For example, you can say and ask for ratings on a one to five star scale.

  • Or you can ask people to rate on a scale from zero to ten.

  • or, or we can just ask people to say whether they like an item or

  • did not like it.

  • So the exclusive method had the advantage of simplicity and

  • of getting direct responses from users.

  • The problem, though, is that it doesn't scale.

  • Only a small fraction of users who viewed a movie or listened to a, you know, piece

  • of music or bought a product are actually bother to leave a rating or review.

  • Most users don't actually leave ratings or reviews.

  • So while the data that's explicitly gathered's excellent data.

  • It it's not sufficient in most cases for recommendations,

  • because only a small fraction of users actually leave ratings and reviews.

  • Since explicit rating don't scale, a lot of sites use implicit ratings.

  • Now the idea behind implicit ratings is to learn ratings from other user action.

  • For example an online shopping website might

  • have a rule that a purchase implies a high rating.

  • Now the nice thing about implicit ratings,

  • is that they're much more scalable than explicit ratings.

  • Because the user doesn't have to explicitly rate an item, and

  • there are way more other actions than there are ratings.

  • The problem though is that it's very hard using implicit ratings to

  • learn low ratings.

  • It's quite easy to learn high ratings because you,

  • you might have it all that points imply the high rating.

  • But you, you can never learn a rating that a user disliked a product implicitly.

  • Impactors.

  • Most recommender systems and

  • most web sites, use a combination of explicit and implicit ratings.

  • Where explicitly ratings are available, they use them.

  • But they supplement them with implicit ratings when needed.

  • Let's move on now to the central problem of Extrapolating Utilities.

  • Or extrapolating unknown utilities from known utility values.

  • The key of central problem, that we have this amount to extrapolate utilities,

  • is that the matrix U, the utility matrix is very, very sparse.

  • Most people have not rated most items.

  • And this introduces a slew of problems that we'll come across shortly.

  • The second problem we have is a cold start problem.

  • When you have a new item or

  • a new user, the new item doesn't have any ratings, and new users have no history.

  • So, this is known as the cold start problem, and

  • how to tackle this problem as for in due course.

  • There are three approaches to building recommender systems.

  • The first is Content-based recommendations.

  • The second is Collaborative filtering.

  • And the third is Latent factor based modeling.

  • Let's start with Content-based approaches.

Welcome back to Mining of Massive Datasets.

字幕與單字

單字即點即查 點擊單字可以查詢單字解釋

B1 中級

5 1 推薦系統概述 16 51 (5 1 Overview of Recommender Systems 16 51)

  • 12 2
    HaoLang Chen 發佈於 2021 年 01 月 14 日
影片單字