Structural and Temporal Hallmarks of Genealogical Networks
Abstract
The rapid growth of the genealogical sector, spanning platforms with billions of records and millions of users, has produced some of the largest and most complex networks available for analysis. Despite substantial advances in genealogical network research, it remains unclear whether human kinship networks exhibit universal structural properties. We address this by developing an integrated approach to genealogical network analysis that combines network-theoretic structure with an inferred notion of time. Using over one hundred datasets from the Kinsources repository, we reinterpret standard network measures in genealogical terms and introduce pseudogenerations, a method for extracting temporal structure directly from network topology. Within this framework, we identify common features shared across datasets. We find that genealogical networks exhibit scale-free–like degree and component-size distributions, multiscale family organization, and small-world behavior with respect to genetic and union-based distances. We show that 2-components provide a natural unit of genealogical structure, observe consistent disassortative mixing, and find that recorded unions are strongly biased toward short genetic distances relative to potential pairings. We also document temporal and demographic patterns, including shifts in recorded parental and child information, as well as correlations among recorded unions, parents, and children. These results suggest that diverse genealogical datasets share a common set of structural and temporal characteristics, providing evidence for universal features of human kinship networks and establishing a general framework for their comparative analysis.