MINING SOCIAL DOCUMENTS AND NETWORKS
Open Access
- Author:
- Zhou, Ding
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- December 10, 2007
- Committee Members:
- C Lee Giles, Committee Chair/Co-Chair
Hongyuan Zha, Committee Chair/Co-Chair
Jia Li, Committee Member
Wang Chien Lee, Committee Member
David Jonathan Miller, Committee Member
- Keywords:
- social network
machine learning
information retrieval
user modeling
- Abstract:
- The Web has connected millions of users through various communication tools for social purposes. Every day, huge amounts of social data are created, driven by a variety of social actions that involve a wide range of user-produced content (e.g., emails or collaborative documentation). Often a part of contemporary Web applications, user social networks are gaining increasing attention from both industry and academia, as they appear to be a promising vehicle for delivering a better user experience. Accordingly, computational social network analysis has become an important topic in user data mining. Despite a long history of structural social network analysis and recent interest in user behavior analysis, little research has addressed social content in social networks and heterogeneous networks in user behavior. In fact, social content and heterogeneity are two key elements of contemporary online social networks, and both offer great benefits: social content provides more semantic information, while heterogeneous social networks allow a diversified perception of users. Motivated by these considerations, this dissertation seeks to improve traditional computational social network analysis by covering analysis of not only (1) social networks of users, but also (2) social content composed by users and (3) social actions among users. A series of new methods is presented for knowledge discovery in social documents and social networks, with a special focus on modeling social content and machine learning of heterogeneous networks. In particular, this study first proposes new probabilistic content models for user-generated social documents and annotations and investigates the connection between social content and social actions. It concludes with a set of new techniques for computational analysis of heterogeneous social networks constructed from various social actions.
The methods proposed in this dissertation have been applied to a wide range of applications, including ranking, community discovery, information retrieval, and document recommendation. On large-scale real-world data sets, this research shows significant experimental improvements over currently applied methods.