Fibers of multi-way contingency tables given conditionals: relation to marginals, cell bounds and Markov bases

Abstract

A reference set, or a fiber, of a contingency table is the space of all realizations of the table under a given set of constraints such as marginal totals. Understanding the geometry of this space is a key problem in algebraic statistics, important for conducting exact conditional inference, calculating cell bounds, imputing missing cell values, and assessing the risk of disclosure of sensitive information. Motivated primarily by disclosure limitation problems where constraints can come from summary statistics other than the margins, in this paper we study the space $\mathcal{F}_T$ of all possible multi-way contingency tables for a given sample size and set of observed conditional frequencies. We show that this space can be decomposed according to different possible marginals, which, in turn, are encoded by the solution set of a linear Diophantine equation. We characterize the difference between two fibers: $\mathcal{F}_T$ and the space of tables for a given set of corresponding marginal totals. In particular, we solve a generalization of an open problem posed by Dobra et al. (2008). Our decomposition of $\mathcal{F}_T$ has two important consequences: (1) we derive new cell bounds, some including connections to directed acyclic graphs, and (2) we describe a structure for the Markov bases for the space $\mathcal{F}_T$ that leads to a simplified calculation of Markov bases in this particular setting.
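To make the decomposition concrete, the following is a minimal sketch for the simplest case only: a two-way table with observed conditionals P(A = i | B = j) given as exact fractions and a known sample size n. It is an illustration under stated assumptions, not the authors' algorithm. The function name `fiber_given_conditionals`, the column-wise table layout, and the requirement that every column total be positive are all assumptions made for this example. In this setting each column total must be a multiple of the least common denominator d_j of its conditionals, so the admissible margins are the positive integer solutions k of the linear Diophantine equation sum_j d_j k_j = n.

```python
from fractions import Fraction
from itertools import product
from math import lcm


def fiber_given_conditionals(cond, n):
    """Enumerate tables with total n whose column conditionals equal `cond`.

    cond[j][i] is P(A = i | B = j) as a Fraction; each cond[j] sums to 1.
    Returned tables are lists of columns: table[j][i] is the count in cell (i, j).
    Assumption for this sketch: every column total is positive, since a
    conditional frequency is only observed for a non-empty column.
    """
    # d[j]: smallest column total for which every cell of column j is an integer.
    d = [lcm(*(p.denominator for p in col)) for col in cond]
    # Column totals have the form d[j] * k[j] with k[j] >= 1.
    ranges = [range(1, n // dj + 1) for dj in d]
    tables = []
    for k in product(*ranges):
        # Linear Diophantine constraint on the margin: sum_j d[j] * k[j] == n.
        if sum(dj * kj for dj, kj in zip(d, k)) == n:
            table = [[int(p * dj * kj) for p in col]
                     for col, dj, kj in zip(cond, d, k)]
            tables.append(table)
    return tables


# Hypothetical data: P(A | B = 0) = (1/3, 2/3), P(A | B = 1) = (1/2, 1/2), n = 17.
cond = [[Fraction(1, 3), Fraction(2, 3)], [Fraction(1, 2), Fraction(1, 2)]]
tables = fiber_given_conditionals(cond, 17)
for t in tables:
    print(t)
```

Here d = (3, 2), so the margin equation is 3 k_1 + 2 k_2 = 17, and each solution corresponds to one possible marginal of B. The brute-force search over k is chosen for readability; it is only practical for small n and few columns.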
