Placeholder Image

字幕列表 影片播放

  • Let's turn to the task of actually implementing collaborative filtering and

  • look at the complexity of the implementation.

  • The most expensive step in implementing collaborative filtering is the,

  • the step of finding k most similar users.

  • Or the k most similar items.

  • The, the neighborhood of an item or a user.

  • Remember what we had to do to find the k most similar item was to take the item and

  • compute its similarity with respect to every other item.

  • And this required us to compute, for example, a cosine or

  • a centered cosine distance, between the item of interest and every other item.

  • If we actually did this at every step with how to actually look at every entry in

  • the utility matrix.

  • And so the,

  • the complexity of doing that turns out to be the size of the utility matrix U.

  • Now, since a utility matrix can be very, very large because there are many,

  • many users, millions of users and millions of items.

  • Clearly impractical to do this at one time.

  • So what we really have to do is pre compute.

  • Pre compute for every item with set up similar items.

  • Or for every user a similar set of users.

  • But doing naive pre computations can take a really long time.

  • Supposed to have n n items and we're doing item to item collaborative filtering.

  • For each item, we'd have to compute its similarity to the other item.

  • And so the complexity of doing that is a product of the number of the items and

  • the size of the utility matrix.

  • [INAUDIBLE] Now, since a utility matrix can be very large, and

  • the number of items can be very large.

  • The product of these two is clearly a very large number.

  • And so even a naive pre-computation, can, can be too expensive to do, in practice.

  • So how do you deal with this?

  • Now.

  • In previous lectures, we looked at a way of doing this, and that technique,

  • those are the techniques for near-neighbor search in high dimensions.

  • For example, locality of sensitive, locality-sensitive hashing.

  • We can use something like LSH, to, to take an item, and

  • quickly find, a set of, near neighbors to that item, or

  • take a user, and find a set of near neighbors to that user.

  • And do that in advance.

  • In practice, you might do something like this you know,

  • once in a day for, for example or once every few hours, we can also use

  • clustering to to group users and items into,

  • into smaller clusters and thereby speed up the process by restricting the search,

  • to within a cluster, as opposed to the entire setup items as users.

  • And finally, we could use a set of techniques called dimensionality reduction

  • methods which will cover in an upcoming lecture.

  • What are the advantages and disadvantages of collaborative filtering?

  • The biggest advantage of collaborative filtering is that it works for

  • any kind of item.

  • It doesn't matter whether the item is books or

  • music or videos or news articles or people.

  • Collaborative filtering works on any kind of item without requiring and

  • feature selection which is actually the biggest advantage of

  • collaborative filtering because it turns out to be tough problem to find the right

  • sort of features so something as complicated as movie.

  • Or a piece of news and

  • this it sort of also explains the huge popularity of collaborative filtering.

  • Because it sort of obviates this need for

  • feature selection for complex things such as images and movies and music and so on.

  • That said, collaborative filtering has a number of disadvantages.

  • The first of these, is cold start.

  • For Collaborative Filtering to work you need enough users in the system,

  • who have rated enough items.

  • Remember we need to, given an item you're to find a set of other similar items.

  • Or given a user, you're to find the set of other similar users.

  • But if there are not enough users in the system, it's hard to find a match.

  • The second problem is, is one of sparsity.

  • Even when we do have enough users in the.

  • In the in the system.

  • Because there are so many items.

  • Typically the millions of users and millions and millions of items.

  • Most users have not taken most items.

  • And therefore the user ratings matrix is very, very sparse indeed.

  • And it's hard to find, you know,

  • a user, better users have actually rated the same set of items, right?

  • So remember, supposed they wanted to predict the, rating for user, x and

  • an item i, I need to find other users who also rated item i.

  • Really hard to find such set of users because of sparsity.

  • The next problem, is the first rater problem.

  • Suppose we add a new item to the catalog, right?

  • we, you cannot make recommend this new item to anybody because the new

  • item doesn't have any ratings.

  • And Esoteric item, for instance, that are actually you know,

  • there may be a few people who really love the item, but

  • the number of people who like the item is really small.

  • Think you have a smaller number of ratings,

  • that's how to make recommendations for those items, too.

  • And the final problem of collaborative filtering is the popularity bias.

  • Let's take a really popular movie or, or book like Harry Potter.

  • Now, lots of people tend to give high ratings to such a popular,

  • popular book or movie.

  • And Collaborative Filtering will therefore tend to recommend a popular book or movie.

  • to, to almost everyone in the system.

  • Regardless of whether they'd actually like it or not.

  • Just because the neighborhood of any item will include popular items.

  • At Amazon we used to call this problem the Harry Potter effect.

  • And the Harry Potter effect need not be a bad thing.

  • Recommending popular items generally works out well.

  • But it's you know, the problem is if every item that's recommended is

  • a popular item the popular items can completely cloud out

  • the unique recommendations that can be made for a specific user.

  • And you are to take special steps to avoid the popularity bias from washing out

  • the specific recommendations.

  • Now that you've seen some of the difficulties with collaborative filtering,

  • we can design hybrid methods to work on those difficulties.

  • For example, we can add content based methods to collaborative filtering.

  • We can add item profiles to deal with the new item problem and

  • make recommendations of new items to users.

  • Or we might you know, take new users and use demographic information about new

  • users to build synthetic profiles for them and that deal with the new user problem.

  • Another approach is to implement two or more different recommender systems and

  • combine their predictions perhaps using the linear model.

  • For example, you could use a global baseline recommender and

  • combine that with collaborative filtering.

  • And let's see how to do this.

  • So here's a problem.

  • We take the estimate of Joe's rating for the movie The Sixth Sense.

  • The problem is that Joe has not rated any movies similar to The Sixth Sense

  • according to our measure of similarity.

  • And so, using, the collaborative filtering formula that saw in the previous lights,

  • we can make no prediction for Joe's rating for the movie, The Sixth Sense.

  • This is a problem that arises from the sparsity of the rating matrix.

  • So here's an approach called the Global Baseline approach, and it's very,

  • very simple.

  • Suppose in our, in our system, the mean movie rating is 3.7 stars.

  • 3.7 is the average rating for, across all movies and all users.

  • And we observe that The Sixth Sense, the ra, the average rating for

  • Sixth Sense is 0.5 stars above the mean movie rating.

  • People like The Sixth Sense, on average data, you know,

  • more than the average movie.

  • You also notice that Joe rates 0.2 stars below the average rating for,

  • for other users.

  • Right?

  • This is not, remember this is not Joe's rating for The Sixth Sense or

  • for any specific movie.

  • But if we take the average of Joe's rating, Joe Turns will be a tough rater.

  • And his average rating is 3.5 stars as opposed to the mean rating of 3.7 stars.

  • And so Joe's rating i, i, Joe's rating in general are 0.2 stars below average.

  • Now we can use these, three numbers to come up with the baseline estimate for,

  • the user Joe and the movie Sixth Sense.

  • We started a mean rating with just 3.7.

  • Notice that The Sixth Sense is 0.5 stars above average.

  • And then we subtract the 0.2

  • because Joe's rating is on average 0.2 stars below the average rating.

  • And you predict that Joe will give the Sixth Sense four stars.

  • Notice that in this case we have not used any

  • specific information about the movies Joe has rated.

  • We don't need Joe to have rated any movies similar to

  • The Sixth Sense in order to make this prediction.

  • All we need is to have enough users who have actually rated The Sixth Sense.

  • And so, so that we can actually compute an average rating for for the you know,

  • for The Sixth Sense.

  • Now what we'd like to do is actually combine this global baseline rating

  • with collaborative filtering.

  • So let, let's look at an example.

  • So the Global Baseline estimated that Joe will give The Sixth Sense 4 stars.

  • Now suppose we use a collaborative filtering on nearest neighbour approach

  • and we actually find out that Joe didn't like the movie Signs and

  • Signs actually happens to be a movie that's is very similar to The Sixth Sense,

  • exactly by the same director.

  • And so and lets say the similarity between Signs and

  • Sixth Sense is is one point of we find out Joe didn't like Signs and

  • in fact Joe rated Signs one star below his average rating for all movies.

  • So now we can combine the Global Baseline estimate and

  • the collaborative filtering refinement and come up with the final estimate.

  • So, since the Global Baseline estimates that Joe will rate The Sixth Sense four

  • stars whereas the collaborative filtering gives it

  • a negative one below his average rating.

  • So we can, we can just add those two ratings.

  • And predict that Joe will grade The Sixth Sense three stars.

  • So notice that this approach actually takes a linear combination of,

  • two independent classifiers,

  • a global baseline classifier, and a collaborative filtering classifier, and

  • takes the sum of those, you know, takes the sum of those predictions.

  • And if you wanted we could rated these predictions in different ways as well.

  • In this case, we rated both of those equally.

  • So finally, here is the the formula that we can use to implement the,

  • the combination in the global base line and collaborative filtering.

  • Let's define SIJ to be the similarity of items I and J.

  • And given and item I, we're going to find it's K nearest neighbors and we'll

  • only going to find the K nearest neighbors who have also been rated by user X.

  • And our goal is to estimate the rating for user X, and item i.

  • And, about here, and, given the, the simple formula that we had, which was just

  • a, a weighted average dating, for all the items in the neighborhood N i x.

  • Now we're going to add in the global baseline idea.

  • And here's what the new formula looks like.

  • We're going to take estimate the rating r x i

  • to be the sum of a baseline rating and the collaborative filtering refinement.

  • b b x i here is the baseline estimate.

  • And the baseline estimate for user x and

  • item i is itself the sum of three components.

  • mu, which is the overall mean movie rating in the system.

  • b x, which is the rating deviation of user x, which is just the average rating

  • that user x gives across all movies that he has rated, minus mu.

  • And b i is similarly the rating deviation of movie i.

  • And so if you add those three up, you get bxi which is a baseline rating for

  • user x and item i.

  • And then we're going to add in the collaborative filtering piece,

  • which is the same as the formula we had before, with the weighted average.

  • Rated by similarity of item i and item j in the neighborhood.

  • however, and from using rxj, which is the rating of user x for item J.

  • We're just going to subtract out the baseline piece and when you look at

  • the the, the, the deviation of the rating for the, for the item from the baseline.

  • Since you've already added the baseline piece we don't want to double count it.

  • In the second piece there as well.

  • So we subtracted out the the, the,

  • the baseline ratings from the collaborative filtering piece.

  • And this gives us the final formula that combines collaborative filtering and

  • the baseline approach.

Let's turn to the task of actually implementing collaborative filtering and

字幕與單字

單字即點即查 點擊單字可以查詢單字解釋

B1 中級

5 4 實施協同過濾 13 46 高級 (5 4 Implementing Collaborative Filtering 13 46 Advanced)

  • 9 2
    HaoLang Chen 發佈於 2021 年 01 月 14 日
影片單字