Identification of Latent Group Effects under Conditional Calibration

Abstract

We study identification of a structural group effect when the group indicator G∈{0,1} is unobserved but the analyst observes a calibrated probability score p∈[0,1] satisfying E[G|p,X]=p. Under a constant-coefficient structural mean model, the latent-group coefficient τ is point-identified from the joint law of observables (Y,X,p) by a simple ratio of weighted moments: the covariance of the signed score 2p-1 with the covariate-partialled outcome, divided by twice the residual variance V* of the score after conditioning on covariates. Identification fails if and only if the score is a deterministic function of X; we establish this by constructing an explicit continuum of observationally equivalent models indexed by arbitrary values of τ. The identified coefficient differs from the marginal latent mean gap by a compositional term that is unidentified without further assumptions; we give a necessary and sufficient condition for the two to coincide. The oracle estimator is √n-consistent and asymptotically normal with a closed-form sandwich variance. Under calibration error bounded uniformly by δ, the bias is bounded by |τ| E[|2p-1|] δ (2V*)^{-1}, a bound that is sharp over all calibration error functions of that magnitude. Hard-threshold classification at p=1/2 attenuates the estimated gap by a factor strictly less than one. Monte Carlo experiments confirm the asymptotic theory, trace the divergence of RMSE as V*→0, illustrate the attenuation bias of hard-threshold classification, and verify identification of the variance-weighted estimand under heterogeneous effects.
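The ratio-of-moments formula described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a linear covariate model, so that ordinary least-squares residualization stands in for conditioning on X, and it uses a hypothetical data-generating process in which the score is exactly calibrated by construction. The function names (`residualize`, `ratio_estimator`, `hard_threshold_estimator`) and all parameter values are illustrative assumptions.

```python
import numpy as np

def residualize(v, X):
    """Residual of v after OLS projection on a constant and X (assumed linear model)."""
    Z = np.column_stack([np.ones(len(v)), X])
    coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ coef

def ratio_estimator(y, p, X):
    """Cov(2p-1, partialled outcome) divided by twice the residual score variance."""
    y_res = residualize(y, X)
    s_res = residualize(2.0 * p - 1.0, X)   # signed score, partialled on X
    p_res = residualize(p, X)
    v_star = np.mean(p_res ** 2)            # sample analogue of V*
    return np.mean(s_res * y_res) / (2.0 * v_star)

def hard_threshold_estimator(y, p, X):
    """Coefficient on the hard label 1{p > 1/2} after partialling out X."""
    d_res = residualize((p > 0.5).astype(float), X)
    y_res = residualize(y, X)
    return np.mean(d_res * y_res) / np.mean(d_res ** 2)

# Hypothetical data-generating process (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
n, tau = 200_000, 1.5
x = rng.uniform(-1.0, 1.0, n)
u = rng.uniform(-1.0, 1.0, n)
p = 0.5 + 0.2 * x + 0.25 * u        # score in [0.05, 0.95], not a function of x alone
g = rng.binomial(1, p)              # latent group; E[G | p, X] = p by construction
y = 1.0 + 2.0 * x + tau * g + rng.normal(size=n)

tau_hat = ratio_estimator(y, p, x)              # never uses the latent g
tau_naive = hard_threshold_estimator(y, p, x)
print(f"true tau = {tau}, ratio estimate = {tau_hat:.3f}, hard-threshold = {tau_naive:.3f}")
```

In this particular design the ratio estimate should land close to the true τ, while the hard-threshold coefficient is markedly smaller, in line with the attenuation the abstract describes. If the `0.25 * u` term is shrunk toward zero, the score becomes nearly a function of X, `v_star` approaches zero, and the estimate becomes unstable, which mirrors the stated identification failure.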

