CLUSTERING AND CONNECTIVITY PROBLEMS IN COMPLEX NETWORKS

Open Access
Author:
Raghavan, Ushanandini
Graduate Program:
Industrial Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
April 23, 2008
Committee Members:
  • Soundar Kumara, Committee Chair
  • Reka Z Albert, Committee Chair
  • Dennis Kon Jin Lin, Committee Member
  • Tao Yao, Committee Member
  • Arvind Rangaswamy, Committee Member
Keywords:
  • networks
  • social networks
  • community detection
  • wireless sensor networks
  • clustering
  • connectivity
Abstract:
Networked systems pervade this world. From social relationships between people to chemical interactions between bio-molecules to wired or wireless communications between technological devices, networks play a role. In recent years there has been a wide spread interest in the study of the 'structural organization' in such complex networks. The focus here is on the organization of links (who is connected to whom) in the network and its effects on dynamics and behavior of the networked system. Traditionally complex networks were modeled as random graphs that were introduced by Paul Erdos and Alfred Renyi (E-R random graphs) [1]. Here pairs of nodes in a network are linked with a uniform probability p. However in the recent years empirical observations from a wide range of large-scale networks such as the movie actor collaboration network, scientific co-authorship networks, protein-protein interaction maps of cells and organisms, neural networks, the World Wide Web (WWW) and others revealed properties that deviate from the properties of E-R random graphs. For example, many real-world networks have a power-law degree distribution, high degree of local order, assortative or dis-assortative mixing patterns and tightly interlinked groups called clusters as opposed to E-R random graphs that have a Poisson degree distribution, no particular local order or mixing patterns or clustered structures. This implies that in many social, biological and technological networks the organization of links (who is connected to whom) follow specific principles (which is not random) leading to the observed network properties. A lot of research has been done in recent years to understand various structural properties of complex networks and in identifying the organizing principles defining them. In this thesis we focus on two problems of structural organization in networks, namely clustering and connectivity. The presence of clusters has been observed in many real-world networks (for example, the presence of groups or communities in social networks, functional modules in biological networks and webpages on similar topics in the WWW) There has been a lot of effort recently in defining, identifying and extracting these clusters from complex networks to better understand their structural organization. To identify clustered structures in networks we develop an algorithm that uses a consensus formation mechanism to extract densely interlinked groups. To our knowledge such a mechanism has not been previously used to identify clusters in networks and thereby contributing to another way of defining them. One of the main advantages of this algorithm is its ability to run in a near-linear time and hence its applicability to very large networks. Also unlike most of the existing algorithms we do not require any a priori information such as the number and sizes of the clusters present in the network. By applying the algorithm on actor collaboration, scientific co-authorship, protein-protein interaction and World Wide Web networks we highlight that there exists multiple significant clustered structures revealing the richness of the orders present in them. Further, this result supports the presence of overlapping clusters in social and biological networks [2]. A connected network is one in which it is possible to travel from any node to any other node via the links. For the connectivity problem we consider a class of networks that initially contain no links. Our goal is to create and organize links in the network, which is limited by certain constraints, to obtain network connectivity. We solve the connectivity problem in the domain of Wireless Sensor Networks (WSNs). That is, the constraints that curb the creation of links are given by the physical limits of tiny wireless sensor devices. In WSNs the nodes are sensor devices and the links are the wireless communications between these devices. The constraint here is that the degree of any node (number of communications of a node) should be strictly bounded and cannot increase with an increasing network size ($N$). Unlike previous results that insisted that the smallest degree required, to obtain connectivity, is a function of $N$ and increases with $N$, we show that it is possible to put a bound on the degree. In specific, we show that if we can tolerate certain nodes being disconnected then every node is required to communicate only to its five closest neighbors (organization principle) to obtain a giant connected component in a network of any size $N$.