Ordering-Free Inference from Locally Dependent Data

Abstract

This paper focuses on a data-rich environment in which the data set has a very large cross-sectional dimension and is likely to exhibit local dependence, yet the ordering of that dependence is hard to determine. Such a situation arises, for example, when the data are collected from the Internet by web crawling. The paper proposes randomized subsampling inference: one constructs a test statistic by aggregating many randomized test statistics computed from random draws of subsamples, and bases inference on the conditional distribution of that test statistic given the data. Two versions are explored, one based on an M-type statistic constructed from randomized mean statistics and the other on a U-type statistic constructed from randomized U-statistics. The paper provides conditions on the local dependence, the number of random draws, and the subsample size under which randomized subsampling inference is asymptotically valid. In Monte Carlo simulations, inference based on the U-type statistic performs better than inference based on the M-type statistic.
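The abstract does not state the exact form of the M-type or U-type statistics, so the following is only a rough sketch of the general idea it describes: draw many random subsamples without using any ordering of the observations, compute a mean-based statistic on each, and aggregate. The function names, the studentization, and the squared-average aggregation are all assumptions made for illustration, not the paper's definitions, and the conditional-distribution step used for inference is not reproduced here.

```python
import math
import random
import statistics


def randomized_mean_statistics(data, subsample_size, num_draws, seed=0):
    """Hypothetical helper: one studentized mean per random subsample.

    Subsamples are drawn uniformly without replacement, so no ordering
    of the observations is needed. Assumes each subsample has nonzero
    sample standard deviation.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(num_draws):
        sub = rng.sample(data, subsample_size)
        mean = statistics.fmean(sub)
        sd = statistics.stdev(sub)
        stats.append(math.sqrt(subsample_size) * mean / sd)
    return stats


def aggregate_squared(stats):
    """Hypothetical aggregation: average of squared randomized statistics."""
    return statistics.fmean([s * s for s in stats])


if __name__ == "__main__":
    # Synthetic placeholder data, not from the paper.
    gen = random.Random(1)
    data = [gen.gauss(0.0, 1.0) for _ in range(500)]
    stats = randomized_mean_statistics(data, subsample_size=50, num_draws=100, seed=2)
    print(aggregate_squared(stats))
```

The subsample size and number of draws are free parameters here; the paper's validity conditions constrain both, and the values above are arbitrary.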
