A maximum entropy approach to separating noise from signal in bimodal affiliation networks

Abstract

In practice, many empirical networks, including co-authorship and collocation networks, are unimodal projections of a bipartite data structure in which one layer represents entities, the second layer consists of sets representing affiliations, attributes, groups, etc., and an inter-layer link indicates membership of an entity in a set. The edge weight in the unimodal projection, which we refer to as a co-occurrence network, counts the number of sets to which both end-nodes are linked. Interpreting such dense networks requires statistical analysis that takes into account the bipartite structure of the underlying data. Here we develop a statistical significance metric for such networks based on a maximum entropy null model that preserves both the frequency sequence of the individuals/entities and the size sequence of the sets. Solving the maximum entropy problem reduces to solving a system of nonlinear equations for which fast algorithms exist, thus eliminating the need for expensive Monte Carlo sampling techniques. We use this metric to prune and visualize a number of empirical networks.
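The co-occurrence weights and the two sequences preserved by the null model can be illustrated with a minimal sketch. It only shows how the quantities defined in the abstract are computed from bipartite membership data; it is not the paper's significance metric or its maximum entropy solver. The function name `co_occurrence_weights` and the toy co-authorship data are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_weights(memberships):
    """Project bipartite membership data onto the entity layer.

    memberships: dict mapping a set label to an iterable of entities.
    Returns a Counter mapping each sorted entity pair to the number of
    sets that contain both entities (the co-occurrence edge weight).
    """
    weights = Counter()
    for members in memberships.values():
        # Deduplicate so an entity listed twice in one set counts once.
        for u, v in combinations(sorted(set(members)), 2):
            weights[(u, v)] += 1
    return weights


# Hypothetical toy data: three papers (sets) and their authors (entities).
papers = {
    "p1": ["ana", "bo", "cy"],
    "p2": ["ana", "bo"],
    "p3": ["bo", "cy", "di"],
}

weights = co_occurrence_weights(papers)

# Frequency sequence: number of sets each entity belongs to.
frequency = Counter(a for authors in papers.values() for a in set(authors))

# Size sequence: number of entities in each set.
sizes = {p: len(set(authors)) for p, authors in papers.items()}

print(dict(weights))
print(dict(frequency))
print(sizes)
```

In this toy example, "ana" and "bo" share two papers, so their edge has weight 2. The frequency and size sequences are the marginals of the bipartite membership data that, according to the abstract, the null model preserves.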
