Structural and Compositional Complexities of Hierarchical Self-Assembly: a Hypergraph Approach
Abstract
Programmable self-assembly enables the construction of complex molecular, supramolecular, and crystalline architectures from well-designed building blocks. We introduce a hypergraph-based formalism, Blocks & Bonds (B&B), that generalizes classical chemical graph theory by incorporating directed and multicolored interactions, internal symmetries, and hierarchical organization. Within this framework, we develop the Structure Code (SC), a compact and versatile language for describing self-assembled architectures. We define a Kolmogorov-style Structural Complexity as the total information content of SC, obtained through its tokenization and Shannon information assignment. Complementing this encoding-based measure, we introduce a much simpler quantity, the Compositional Complexity, which depends only on the number and cumulative usage of block and bond types in the construction set. A central result of this work is a strong empirical correlation between the token-based Structural Complexity and the Compositional Complexity across all examined systems. Owing to this agreement, the Compositional Complexity emerges as the most practical and broadly applicable measure: it is easy to compute, requires no explicit encoding, and yet closely tracks the actual information content of structurally diverse architectures. Applications to molecular systems (ethylene glycol, glucose), DNA-origami lattices, and crystalline assemblies show that B&B hypergraphs provide a unified, scalable, and information-efficient representation of structural organization, naturally capturing symmetry, modularity, and stereochemistry. This framework establishes a quantitative foundation for complexity-aware classification and inverse design of programmable matter.
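As a rough illustration of what "tokenization and Shannon information assignment" can mean in practice, the sketch below sums the self-information of each token in a sequence, using the token frequencies observed in that same sequence. This is only an assumption about the general idea: the abstract does not specify the SC grammar, the tokenizer, or the probability model, so the function name `shannon_information` and the toy token list are hypothetical and do not reproduce the paper's Structural Complexity.

```python
import math
from collections import Counter


def shannon_information(tokens):
    """Total self-information, in bits, of a token sequence.

    Each token t contributes -log2(p(t)), where p(t) is its empirical
    frequency within the sequence itself (an assumed probability model).
    """
    counts = Counter(tokens)
    total = len(tokens)
    return sum(-math.log2(counts[t] / total) for t in tokens)


# Hypothetical token sequence standing in for a tokenized structure code.
toy_tokens = ["A", "-", "B", "-", "A", "-", "B"]
toy_bits = shannon_information(toy_tokens)
```

Under this model a sequence built from a single repeated token carries zero bits, while a sequence that mixes many rare tokens carries more, which is the qualitative behavior an encoding-based complexity measure is meant to capture.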