A belief-state restless bandit model for treatment adherence: Whittle indexability via partial conservation laws
Abstract
We study clinically motivated capacity-constrained treatment-adherence outreach through a belief-state restless multi-armed bandit model, in which each patient is a partially observed two-state Markov decision process and interventions induce reset-type belief dynamics. For the discounted criterion, partial conservation law (PCL)-based conditions are used to establish single-patient threshold-policy optimality and indexability (threshold-indexability) and yield a closed-form Whittle index, threshold performance metrics, and an explicit optimal threshold map. We also prove a single-patient long-run average analogue on the invariant belief core and obtain an explicit average-criterion Whittle index. For the multi-patient model, the PCL-derived formulas give an analytic Lagrangian relaxation, efficient dual bounds, and computable Lagrangian index benchmark policies, including a forced-capacity variant. We analyze how the Whittle index depends on lapse and spontaneous-recovery probabilities. In large-scale experiments with two-type, three-type, and jittered finite-mixture populations, the Whittle and forced-capacity Lagrangian index policies are the strongest performers, while myopic prioritization can be substantially worse under tight capacity.