Breaking the Dimensional Barrier: Dynamic Portfolio Choice with Parameter Uncertainty via Pontryagin Projection

Abstract

We study continuous-time CRRA portfolio choice in diffusion markets with estimated and hence uncertain coefficients. Nature draws a latent parameter θ ∼ q at time 0 and keeps it fixed; the investor never observes θ and must commit to a single θ-blind policy maximizing an ex-ante objective, treating q as a decision-time input. We propose a simulation-only two-stage solver. Stage 1 (DPO) performs BPTT-based stochastic gradient ascent through an Euler simulator while sampling θ only inside the simulator. Stage 2 (Pontryagin projection) aggregates costate blocks across θ ∼ q and enforces the q-aggregated stationarity condition within the deployable class; the resulting correction can be amortized via interactive distillation. We refer to the full Stage 1 + Stage 2 pipeline as PG-DPO. We prove a uniform conditional BPTT-PMP correspondence and a residual-based policy-gap bound with explicit discretization and Monte Carlo error terms. Experiments on high-dimensional Gaussian drift-uncertainty and factor-driven benchmarks show that projection stabilizes learning and accurately recovers analytic decision-time references, while a model-free PPO baseline remains far from the targets.
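
To make the Stage 1 idea concrete, the following is a minimal sketch of simulation-based gradient ascent on an ex-ante CRRA objective with θ sampled only inside the simulator. It is an illustration under strong simplifying assumptions, not the paper's implementation: one risky asset, a scalar constant risky fraction `pi` in place of a neural policy, a Gaussian prior q = N(M, V) on the drift, and a hand-derived pathwise derivative carried along each Euler path in place of automatic BPTT. All names and parameter values (`R`, `SIGMA`, `GAMMA`, `T`, `M`, `V`, `pathwise_gradient`, `ascend`, `constant_fraction_reference`) are hypothetical. The reference value is the continuous-time optimum within the constant-fraction class under these assumptions, used only as a sanity check; it is not one of the paper's analytic references.

```python
import math
import random

# Illustrative parameters (all assumed for this sketch, not taken from the paper).
R, SIGMA, GAMMA, T = 0.02, 0.2, 3.0, 1.0  # risk-free rate, volatility, CRRA coefficient, horizon
M, V = 0.08, 0.01                         # assumed prior q = N(M, V) over the unknown drift theta


def pathwise_gradient(pi, n_paths=2000, n_steps=20, rng=None):
    """Monte Carlo estimate of dJ/dpi for J(pi) = E[X_T^(1-GAMMA) / (1-GAMMA)], X_0 = 1."""
    rng = rng or random.Random(0)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        theta = rng.gauss(M, math.sqrt(V))  # theta is drawn inside the simulator only
        log_x = 0.0      # log wealth along the path
        dlogx_dpi = 0.0  # derivative of log wealth with respect to pi
        for _ in range(n_steps):
            z = rng.gauss(0.0, 1.0)
            excess = (theta - R) * dt + SIGMA * sqrt_dt * z  # risky excess return over one step
            growth = 1.0 + R * dt + pi * excess               # Euler step: X_{k+1} = X_k * growth
            log_x += math.log(growth)
            dlogx_dpi += excess / growth
        # Chain rule: d/dpi [X^(1-g) / (1-g)] = X^(1-g) * d(log X)/dpi
        total += math.exp((1.0 - GAMMA) * log_x) * dlogx_dpi
    return total / n_paths


def ascend(pi0=0.0, lr=2.0, n_iters=40, seed=0):
    """Plain stochastic gradient ascent on the theta-blind policy parameter pi."""
    rng = random.Random(seed)
    pi = pi0
    for _ in range(n_iters):
        pi += lr * pathwise_gradient(pi, rng=rng)
    return pi


def constant_fraction_reference():
    """Continuous-time optimum over constant fractions under the assumed Gaussian prior."""
    return (M - R) / (GAMMA * SIGMA ** 2 + (GAMMA - 1.0) * V * T)


if __name__ == "__main__":
    print("learned pi:", ascend())
    print("reference :", constant_fraction_reference())
```

The policy never receives `theta` as an input, so the ascent targets the q-averaged objective rather than any single-θ optimum. In this toy setting the parameter uncertainty enters the reference through the extra `(GAMMA - 1) * V * T` term in the denominator, which shrinks the risky fraction relative to the known-drift Merton ratio when `GAMMA > 1`.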
