November 22, 2016
Title: Graphons, mergeons, and so on!
Speaker: Justin Eldridge
A fundamental problem in the theory of graph clustering is that of defining a cluster. Given a graph, what is the correct clustering? There is no single answer to this seemingly simple question. However, in a statistical setting in which the graphs to be clustered come from some underlying probability distribution, it is natural to define the correct clusters in terms of the distribution itself. The goal of clustering, then, is to identify the appropriate cluster structure of the distribution, and to recover that structure from a finite sample.
In this talk, I will develop a statistical theory of graph clustering for the setting in which graphs are drawn from a so-called "graphon"; a very general and powerful random graph model of much recent interest. First, I will define the clusters
of a graphon. The natural definition results in a graphon having a hierarchy of clusters, which we call the "graphon cluster tree". Next, I develop a notion of statistical consistency for estimators of the graphon cluster tree, in which a clustering method is said to be consistent if its output converges to the graphon cluster tree in a suitable metric. Finally, I will identify a specific, practical algorithm which consistently recovers the cluster tree.