Goodness of fit for log-linear ERGMs

Abstract

Many popular models from the networks literature can be viewed through a common lens of contingency tables on network dyads, resulting in log-linear ERGMs: exponential family models for random graphs whose sufficient statistics are linear on the dyads. We propose a new model in this family, the p1-SBM, which combines node and group effects common in network formation mechanisms. In particular, it generalizes several well-known ERGMs, including the stochastic blockmodel for undirected graphs with known block assignment, its degree-corrected version, and the directed p1 model without group structure. We frame the problem of testing model fit for the log-linear ERGM class through an exact conditional test whose p-value can be approximated efficiently in networks of both small and moderately large sizes. The sampling methods we build rely on a dynamic adaptation of Markov bases. We use quick estimation algorithms adapted from the contingency table literature and effective sampling methods rooted in graph theory and algebraic statistics. The performance and scalability of the method are demonstrated on two data sets from biology: the connectome of C. elegans and the interactome of Arabidopsis thaliana. These two networks, a neuronal network and a protein-protein interaction network, have been popular examples in the network science literature. Our work provides a model-based approach to studying them.
