
  • So basically, let me quickly summarize what we have learned so far.

  • Right, so if we want to apply spectral graph partitioning to find two clusters in a given graph, there are three steps we have to go through.

  • In the first step, we have to do what is called pre-processing, where we construct a matrix representation of the graph.

  • In the second step, we compute the eigenvalue decomposition of this matrix, identifying its eigenvalues and eigenvectors.

  • In particular, we are interested in the second smallest eigenvalue lambda 2 and the corresponding eigenvector x.

  • And then, once we have the vector x, all we have to do is the grouping, where we basically look at the components of x and determine which nodes belong to the set A and which nodes belong to the set B, right?

  • Kind of: who belongs to the left partition, who belongs to the right partition.
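
To make these three steps concrete, here is a minimal sketch in Python with NumPy. The 6-node edge list is made up for illustration (0-indexed, unlike the 1-to-6 labels used later in the lecture), and the sign-based split implements the naive grouping rule discussed below.

```python
import numpy as np

# Step 1: pre-processing -- build the Laplacian matrix L = D - A
# (a made-up 6-node example graph with two natural clusters)
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0            # undirected, unweighted
L = np.diag(A.sum(axis=1)) - A         # degree matrix minus adjacency matrix

# Step 2: eigenvalue decomposition; eigh returns eigenvalues in ascending
# order for a symmetric matrix, so index 1 is the second smallest
eigvals, eigvecs = np.linalg.eigh(L)
lambda2 = eigvals[1]                   # second smallest eigenvalue
x = eigvecs[:, 1]                      # the corresponding eigenvector

# Step 3: grouping -- assign nodes to A or B by the sign of their entry in x
cluster_A = np.where(x >= 0)[0]
cluster_B = np.where(x < 0)[0]
print(lambda2, cluster_A, cluster_B)
```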

  • So, let me now give you an example.

  • Right, so in the pre-processing step, I take our graph G and compute the Laplacian matrix L.

  • In the second step, we then do the eigenvalue decomposition, where basically we take our L and find a set of eigenvalues and a set of eigenvectors.

  • From this matrix L, we take the second smallest eigenvalue and the corresponding eigenvector.

  • Here it is.

  • Right? We take this one out.

  • So here are now all the nodes, 1 to 6.

  • And these are the corresponding entries of the eigenvector.

  • And what we have to do now is group these nodes, right?

  • We would like to group these nodes into clusters corresponding to the embedding, or the labels, of the nodes that we have in the corresponding eigenvector.

  • So how do we do this grouping?

  • Doing the grouping is very simple: we basically sort the components of the vector x and identify clusters by splitting the vector in two.

  • A naive approach in this case would be to ask which of the corresponding values are negative and which are positive, right?

  • So if this is our vector x, we would split it here between nodes 3 and 4.

  • These are all the nodes on the right-hand side of 0, and these are all the nodes on the left-hand side of 0.

  • Notice that the sum of the node values equals 0, exactly as we said it should be.

  • Also notice that the sum of the squares of the values equals 1, which is, again, exactly as it should be.
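
Both properties are easy to check numerically. A quick continuation of the sketch above (x is orthogonal to the constant first eigenvector, so its entries sum to 0, and NumPy's eigh returns unit-norm eigenvectors):

```python
eigvals, eigvecs = np.linalg.eigh(L)    # L from the sketch above
x = eigvecs[:, 1]
print(np.isclose(x.sum(), 0.0))         # True: the entries sum to 0
print(np.isclose((x ** 2).sum(), 1.0))  # True: the squared entries sum to 1
```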

  • So what this basically means is that by splitting this vector into the positive part and the negative part, we have identified the two clusters.

  • Here is the cluster A, which is composed of nodes 1 to 3, and the cluster B, which is composed of nodes 4, 5, and 6.

  • So I have an A and I have a B, right?

  • So basically, the components of the second smallest eigenvector embedded the nodes on a line and assigned them positive and negative values.

  • We take all the nodes with positive values and put them into cluster A, and all the nodes with negative values and put them into cluster B.

  • So now, to give you a bit more interesting example, here on the left I have a graph G that, as you can see, has two clusters.

  • I computed the graph Laplacian.

  • I computed lambda 2 and the corresponding eigenvector x2.

  • And all I did here is sort the vector x2, the eigenvector corresponding to the second smallest eigenvalue, by its entries.

  • And what you see very nicely is that there is a set of components with negative values and a set of components with positive values.

  • So basically, the entries of the eigenvector with a positive value correspond to one cluster, and the remaining ones correspond to the second cluster.
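
As a sketch of that experiment, the snippet below plants two clusters in a random graph (the sizes and edge probabilities are made up) and sorts the entries of x2; the sorted values come out as a negative block followed by a positive block.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40                                  # two planted clusters of 20 nodes each
A = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        same = (i < 20) == (j < 20)
        p = 0.5 if same else 0.05       # dense within clusters, sparse across
        if rng.random() < p:
            A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

x2 = np.linalg.eigh(L)[1][:, 1]         # eigenvector of the 2nd smallest eigenvalue
print(np.round(np.sort(x2), 3))         # a negative block, then a positive block
```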

  • Of course, you can now start asking whether there is anything interesting about the eigenvectors that correspond to, let's say, the third smallest eigenvalue, the fourth smallest eigenvalue, and so on.

  • Just to show you what happens, the first case I'll show you is a graph that actually contains four small clusters.

  • If the graph contains four small clusters and I still compute lambda 2 and the corresponding eigenvector x2, what I show you here are the components of this eigenvector, and you see how the entries are first split into two clusters, right?

  • Kind of above zero and below zero.

  • But then you actually find that here we have these different steps.

  • Right? So basically we first have the two clusters, and then each of the two clusters has two smaller clusters embedded in it.

  • And we see this very nicely from the structure of the components of the second eigenvector.

  • So the question is: if I have multiple clusters, how would I go and identify them?

  • So let me just show you some examples of how to do this.

  • For example, if this is my graph G, the same as I had before, I have my Laplacian matrix L, on which I do the eigendecomposition.

  • For example, here are the components of vector x1, and they all have exactly the same value, exactly as we said and as we have proven.

  • But, for example, the components of the third smallest eigenvector are the following.

  • And now what you see here, for example, is that this is the bottom cluster, this is the second cluster, then this would be the third cluster, and these components correspond to the fourth cluster.
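
A small sketch of this, with a made-up four-cluster planted graph: the first eigenvector comes out constant, and the sorted third eigenvector shows a step structure that separates the finer clusters.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
block = np.repeat(np.arange(4), 10)      # four planted clusters of 10 nodes each
A = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        p = 0.6 if block[i] == block[j] else 0.05
        if rng.random() < p:
            A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

eigvals, eigvecs = np.linalg.eigh(L)
print(np.allclose(eigvecs[:, 0], eigvecs[0, 0]))  # x1: all entries identical
print(np.round(np.sort(eigvecs[:, 2]), 2))        # x3: steps marking the clusters
```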

  • So basically, what does this mean?

  • This gives us an idea of how to proceed if we want to partition a graph not into two clusters but into K of them.

  • There are two possible approaches.

  • One approach is to do recursive bipartitioning.

  • We basically take the full graph, split it in two, and then for each of the two pieces we again try to split it in two; we keep applying this recursively until we get smaller and smaller pieces.
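
Here is a minimal sketch of recursive bipartitioning under two simplifying assumptions: each piece stays connected, and we always split the currently largest piece until we have k of them. The function names are my own, not from the lecture.

```python
import numpy as np

def fiedler_split(nodes, A):
    """Split one set of nodes in two by the sign of the Fiedler vector
    (second smallest eigenvector) of the Laplacian of its subgraph."""
    sub = A[np.ix_(nodes, nodes)]
    L = np.diag(sub.sum(axis=1)) - sub
    x = np.linalg.eigh(L)[1][:, 1]
    return nodes[x >= 0], nodes[x < 0]

def recursive_bipartition(A, k):
    """Keep splitting the largest remaining piece until there are k pieces."""
    pieces = [np.arange(A.shape[0])]
    while len(pieces) < k:
        pieces.sort(key=len, reverse=True)
        a, b = fiedler_split(pieces.pop(0), A)
        pieces += [a, b]
    return pieces
```

On the four-cluster graph from the previous sketch, recursive_bipartition(A, 4) should recover the four planted node sets.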

  • And another idea is to basically cluster using multiple eigenvalues and multiple eigenvectors.

  • So basically, the idea is that we take our Laplacian matrix, compute the eigenvalue decomposition, and then take the second eigenvector, the third eigenvector, the fourth eigenvector, and so on, which basically means that every node of the graph is now described by a small vector of values.

  • So basically, the idea would be that we take the graph and compute the Laplacian matrix.

  • From the Laplacian matrix, what we will do is, for every node i, come up with its corresponding coordinates, in some sense.

  • Right? So we will take coordinate i of the second smallest eigenvector, coordinate i of the third smallest eigenvector, and so on and so forth.

  • And now that every node is described by a set of coordinates, we can run k-means, which we have already discussed, to identify K clusters.

  • And it turns out that, in practice, this method works really well.
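
A compact sketch of this embedding-plus-k-means approach, assuming scikit-learn is available for the k-means step (the helper name is mine):

```python
import numpy as np
from sklearn.cluster import KMeans      # assumes scikit-learn is installed

def spectral_kmeans(A, k):
    """Describe node i by entries i of eigenvectors 2..k of the Laplacian,
    then run k-means on those coordinates to get k clusters."""
    L = np.diag(A.sum(axis=1)) - A
    eigvecs = np.linalg.eigh(L)[1]
    coords = eigvecs[:, 1:k]            # row i = coordinates of node i
    return KMeans(n_clusters=k, n_init=10).fit_predict(coords)
```

With k = 4, each node is described by three coordinates (its entries in the second, third, and fourth smallest eigenvectors), matching the construction described above.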

  • So with this, we have finished the discussion of spectral clustering: basically, given a graph, how to find the individual clusters of nodes in it.
