Fast Community Detection in Large Sparse Networks


Add to Cal
  • Speaker: Liza Levina
  • Host Department: Center for the Study of Complex Systems
  • Date: 10/29/2013
  • Time: 12:00PM - 01:00PM

  • Location: PLEASE NOTE NEW ROOM AND BUILDING: 4448 East Hall

  • Description:

    Abstract: Community detection is one of the fundamental problems in
    network analysis, with many diverse applications, and a lot of work has
    been done on models and algorithms that find communities.   Perhaps the
    most commonly used probabilistic model for a network with communities is
    the stochastic block model, and many algorithms for fitting it have been
    proposed.   Since finding communities involves optimizing over all possible
    assignments of discrete labels, most existing algorithms do not scale well
    to large networks, and many fail on sparse networks.   In this talk, we
    propose a pseudo-likelihood approach  for fitting the stochastic block
    model to address these shortcomings.  Pseudo-likelihood is a general
    statistical principle that involves trading off some of the model
    complexity against computational efficiency.   We also derive a variant
    that allows for arbitrary degree distributions in the network, making it
    suitable for fitting the more flexible degree-corrected stochastic block
    model.  The pseudo-likelihood algorithm scales easily to networks with
    millions of nodes, performs well empirically under a range of settings,
    including on very sparse networks, and is asymptotically consistent under
    reasonable conditions.   If times allows, I will also discuss spectral
    clustering with perturbations, a new method of independent interest we use
    to initialize pseudo-likelihood, which works well on sparse networks where
    regular spectral clustering fails.

    Joint work with Arash  Amini, Aiyou Chen, and Peter Bickel.