Jul 21, 2011

  • Speaker: Hao Zhou
  • Host Department: Statistics
  • Date: 07/21/2011
  • Time: 1:00 PM

  • Location: 438 West Hall

  • Description:

    Title: Methods and Tools for Visual Analytics
    Chair: Professor George Michailidis
    Committee Members: Associate Professor Kerby Shedden, Associate Professor Ji Zhu, Professor H.V. Jagadish (EECS)

    Abstract: Technological advances have led to a proliferation of data characterized by a complex structure; namely, high-dimensional attribute information complemented by relationships between the objects or even the attributes. Classical data mining techniques usually explore the attribute space, while network analytic techniques focus on the relationships, usually expressed in the form of a graph. However, visualization techniques offer the possibility to gain useful insight through appropriate graphical displays coupled with data mining and network analytic techniques. In this thesis, we study various topics of the visual analysis process. Specifically, in chapter 2, we propose a visual analytic algebra geared towards attributed graphs. The algebra defines a universal language for graph data manipulation during the visual analytic process and allows documentation and reproducibility. We also extend the algebra framework to address the uncertain querying problem. The algebra's operators are illustrated on a number of synthetic and real data sets, implemented in an existing visualization system (Cytoscape), and validated through a small user study. In chapter 3, we introduce dimension reduction techniques that incorporate network information, either on the objects or the attributes, through a regularization framework. The techniques are illustrated on a number of real data sets. Finally, in the last part of the thesis, we present a multi-task generalized linear model that improves the learning of a single task (problem) by utilizing information from connected/similar tasks through a shared representation. We present an algorithm for estimating the parameters of the problem efficiently and illustrate it on a movie ratings data set.
