Article ID: | iaor20115288 |
Volume: | 51 |
Issue: | 3 |
Start Page Number: | 506 |
End Page Number: | 518 |
Publication Date: | Jun 2011 |
Journal: | Decision Support Systems |
Authors: | Bai Xue, Airoldi Edoardo M, Carley Kathleen M |
Keywords: | data mining, performance, optimization |
Methods for generating a random sample of networks with desired properties are important tools for the analysis of social, biological, and information networks. Algorithm-based approaches to sampling networks have received a great deal of attention in recent literature. Most of these algorithms are based on simple intuitions that associate the full features of connectivity patterns with specific values of only one or two network metrics. Substantive conclusions are crucially dependent on this association holding true. However, the extent to which this simple intuition holds true is not yet known. In this paper, we examine the association between the connectivity patterns that a network sampling algorithm aims to generate and the connectivity patterns of the generated networks, measured by an existing set of popular network metrics. We find that different network sampling algorithms can yield networks with similar connectivity patterns. We also find that alternative algorithms for the same connectivity pattern can yield networks with different connectivity patterns. We argue that conclusions based on simulated network studies must focus on the full features of the connectivity patterns of a network instead of on the limited set of network metrics for a specific network type. This fact has important implications for network data analysis: for instance, implications related to the way significance is currently assessed.
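The kind of comparison the abstract describes can be illustrated with a minimal sketch. The two generators below (the Erdős–Rényi G(n, p) and G(n, m) models) and the two metrics (density and average clustering) are assumptions chosen for illustration; the record does not say which algorithms or metrics the paper evaluates, and the function names are hypothetical.

```python
import random
from itertools import combinations


def gnp_graph(n, p, rng):
    """G(n, p): include each possible edge independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def gnm_graph(n, m, rng):
    """G(n, m): choose exactly m distinct edges uniformly at random."""
    adj = {v: set() for v in range(n)}
    for u, v in rng.sample(list(combinations(range(n), 2)), m):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(len(nb) for nb in adj.values()) / 2
    return edges / (n * (n - 1) / 2)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        # Count edges among the neighbours of this node.
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


# Two different sampling algorithms tuned to the same target density.
rng = random.Random(0)
n, p = 60, 0.1
m = round(p * n * (n - 1) / 2)
samples = {"G(n, p)": gnp_graph(n, p, rng), "G(n, m)": gnm_graph(n, m, rng)}
for name, g in samples.items():
    print(name, "density:", density(g), "clustering:", average_clustering(g))
```

Matching on one metric (here, expected density) says nothing by itself about the other metrics, which is why a comparison across a broader set of metrics, as the abstract argues, is needed before treating two generators as interchangeable.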