Clustering with Non-adaptive Subset Queries
Abstract
Recovering the underlying clustering of a set $U$ of $n$ points by asking pair-wise same-cluster queries has garnered significant interest in the last decade. Given a query $S \subset U$, $|S|=2$, the oracle returns yes if the points are in the same cluster and no otherwise. For adaptive algorithms with pair-wise queries, the number of required queries is known to be $\Theta(nk)$, where $k$ is the number of clusters. However, non-adaptive schemes require $\Omega(n^2)$ queries, which matches the trivial $O(n^2)$ upper bound attained by querying every pair of points. To break the quadratic barrier for non-adaptive queries, we study a generalization of this problem to subset queries for $|S|>2$, where the oracle returns the number of clusters intersecting $S$. Allowing for subset queries of unbounded size, $O(n)$ queries is possible with an adaptive scheme (Chakrabarty-Liao, 2024). However, the realm of non-adaptive algorithms is completely unknown. In this paper, we give the first non-adaptive algorithms for clustering with subset queries. Our main result is a non-adaptive algorithm making $O(n \log k \cdot (\log k + \log\log n)^2)$ queries, which improves to $O(n \log\log n)$ when $k$ is a constant. We also consider algorithms with a restricted query size of at most $s$. In this setting we prove that $\Omega(\max(n^2/s^2, n))$ queries are necessary and obtain algorithms making $\tilde{O}(n^2 k/s^2)$ queries for any $s \leq \sqrt{n}$ and $\tilde{O}(n^2/s)$ queries for any $s \leq n$. We also consider the natural special case when the clusters are balanced, obtaining non-adaptive algorithms which make $O(n \log k) + \tilde{O}(k)$ and $O(n \log^2 k)$ queries. Finally, allowing two rounds of adaptivity, we give an algorithm making $O(n \log k)$ queries in the general case and $O(n \log\log k)$ queries when the clusters are balanced.
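To make the query model concrete, the following is a minimal sketch of a subset-query oracle together with the trivial non-adaptive baseline that queries every pair of points. It is not the paper's algorithm. The names `make_oracle` and `all_pairs_baseline` are hypothetical and chosen for illustration, and the hidden clustering is assumed to be given as a list of cluster labels.

```python
from itertools import combinations


def make_oracle(labels):
    """Build a subset-query oracle for a hidden clustering.

    labels[i] is the (hidden) cluster id of point i. This representation
    is an assumption made for illustration only.
    """
    def query(subset):
        # The oracle returns the number of clusters intersecting the subset.
        return len({labels[i] for i in subset})
    return query


def all_pairs_baseline(n, query):
    """Trivial non-adaptive baseline: query every pair, then merge.

    All queries are fixed before any answer is seen, so the scheme is
    non-adaptive. It uses n*(n-1)/2 queries, i.e. O(n^2).
    """
    pairs = list(combinations(range(n), 2))   # queries chosen up front
    answers = [query(p) for p in pairs]       # answers collected afterwards

    # Union-find over the points; a pair answering 1 lies in one cluster.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    for (u, v), a in zip(pairs, answers):
        if a == 1:
            parent[find(u)] = find(v)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in clusters.values()), len(pairs)
```

For a pair, an answer of 1 corresponds to "yes" (same cluster) and 2 to "no", so the pair-wise oracle is the special case $|S|=2$ of the subset oracle. The non-adaptive algorithms in the paper replace the quadratic list of pairs with far fewer, larger subsets.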