Policy Learning with New Treatments

Abstract

I study the problem of a decision maker choosing a policy that allocates treatment to a heterogeneous population on the basis of experimental data that includes only a subset of possible treatment values. The effects of new treatments are partially identified by shape restrictions on treatment response. Policies are compared according to the minimax regret criterion, and I show that the empirical analog of the population decision problem has a tractable linear- and integer-programming formulation. I prove that the maximum regret of the estimated policy converges to the lowest possible maximum regret at a rate equal to the maximum of N^{-1/2} and the rate at which conditional average treatment effects are estimated in the experimental data. In an application to designing targeted subsidies for electrical grid connections in rural Kenya, I find that nearly the entire population should be given a treatment not implemented in the experiment, reducing maximum regret by over 60% compared to the policy that is restricted to the treatments implemented in the experiment.
