A simulation-free extrapolation method for misspecified models with errors-in-variables in epidemiological studies
Abstract
In epidemiological studies, continuous variables such as calorie and nutrient intake are commonly categorized before analyzing disease risk, because categories are easier to interpret. When the original continuous variable is contaminated with measurement error, ignoring the error and applying standard statistical analysis leads to severely biased point estimates and invalid confidence intervals. Although the errors-in-variables problem is well known and critical in many areas, most existing methods that address measurement error either do not account for model misspecification or make strong parametric assumptions. We introduce SIMFEX, a simulation-free extrapolation method that provides valid and robust statistical inference across a range of models and imposes no distributional assumptions on the observed data. Through extensive numerical studies, we show that SIMFEX provides consistent point estimates and valid confidence intervals under various regression models. Using Food Frequency Questionnaire data from the UK Biobank, we show that ignoring measurement error underestimates the impact of high fat intake on BMI and obesity by at least 30% and 60%, respectively, compared with the results obtained after correcting for measurement error with SIMFEX.
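The bias described above can be seen in a small simulation. The sketch below is not SIMFEX and does not use UK Biobank data. It assumes a hypothetical setup: a standard normal true exposure, additive normal measurement error, a "high" category defined as the top quartile, and an outcome that depends only on true category membership. The function name `naive_vs_true_effect` and all of its parameters are illustrative assumptions.

```python
import numpy as np


def naive_vs_true_effect(n=200_000, error_sd=1.0, beta=1.0, seed=0):
    """Compare the high-vs-rest mean difference using the true exposure
    and using an error-prone measurement of it (hypothetical setup)."""
    rng = np.random.default_rng(seed)

    # True continuous exposure and its error-contaminated measurement.
    x = rng.normal(size=n)
    w = x + rng.normal(scale=error_sd, size=n)

    # "High" category: top quartile of the respective variable.
    high_true = x > np.quantile(x, 0.75)
    high_obs = w > np.quantile(w, 0.75)

    # The outcome depends only on the TRUE category, plus noise.
    y = beta * high_true + rng.normal(size=n)

    # Difference in mean outcome between "high" and the rest.
    effect_true = y[high_true].mean() - y[~high_true].mean()
    effect_naive = y[high_obs].mean() - y[~high_obs].mean()
    return effect_true, effect_naive


if __name__ == "__main__":
    effect_true, effect_naive = naive_vs_true_effect()
    print(f"effect using true exposure:     {effect_true:.3f}")
    print(f"effect using observed exposure: {effect_naive:.3f}")
```

Under these assumptions, categorizing the observed variable misclassifies some individuals, so the naive contrast is pulled toward zero relative to the contrast based on the true exposure. Real dietary data may violate every one of these assumptions, which is the gap a method without distributional assumptions is meant to address.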