Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

Zhengyuan Zhou

Abstract

We study contextual dynamic pricing in a semiparametric scalar-index valuation model where the latent value is $v_t = \mu(c_t) + \xi_t$, with an unknown utility map $\mu$ and an unknown additive noise distribution. The key decision object is the one-dimensional oracle price map $u \mapsto p(u)$ induced by the scalar index $u = \mu(c)$ and the noise tail. Under the $\beta$-Hölder smoothness of the tail function for $\beta \ge 2$ and a revenue-geometry condition that gives a unique, stable, interior maximizer, this oracle map is itself $(\beta-1)$-smooth. We exploit such structure through ORBIT, a modular coarse-to-fine policy that takes a scalar pilot index as input, localizes a benchmark price in each active bin, and learns a local polynomial approximation of the oracle map inside a trust region via bandit convex optimization. For the baseline linear utility model $\mu(c) = c^\top \theta$, an adaptive elliptical exploration scheme constructs the required scalar pilot online without distributional assumptions on the contexts. The resulting policy achieves regret $O\big(T^{\frac{2\beta-1}{4\beta-3}} + d\sqrt{T}\big)$. For fixed $d$, we establish a matching lower bound in the horizon dependence, unveiling that the nonparametric oracle-map learning term is minimax sharp. The same scalar-pilot interface also yields extensions to sparse high-dimensional linear utility and nonparametric Hölder utility.
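To make the oracle price map concrete, the following is a minimal sketch, not the paper's ORBIT policy. It assumes a sale occurs exactly when the latent value is at least the posted price, so the expected revenue at index $u$ is $p \cdot S(p - u)$ with $S(z) = P(\xi \ge z)$ the noise tail. The standard logistic noise, the search interval, and the function names `tail`, `revenue`, and `oracle_price` are all hypothetical choices made for illustration. With the tail known, the map $u \mapsto p(u)$ can be evaluated by a one-dimensional search that relies only on unimodality of the revenue curve.

```python
import math


def tail(z: float) -> float:
    """Assumed noise tail S(z) = P(xi >= z) for standard logistic noise (illustrative)."""
    return 1.0 / (1.0 + math.exp(z))


def revenue(p: float, u: float) -> float:
    """Expected revenue of posting price p at scalar index u, assuming a sale iff u + xi >= p."""
    return p * tail(p - u)


def oracle_price(u: float, lo: float = 0.0, hi: float = 20.0, tol: float = 1e-8) -> float:
    """Golden-section search for the maximizer of the unimodal revenue curve on [lo, hi].

    The bracket [lo, hi] is an assumption; it must contain the interior maximizer.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    x1 = b - inv_phi * (b - a)
    x2 = a + inv_phi * (b - a)
    f1, f2 = revenue(x1, u), revenue(x2, u)
    while b - a > tol:
        if f1 < f2:
            # The maximizer lies in [x1, b]; reuse x2 as the new left probe.
            a, x1, f1 = x1, x2, f2
            x2 = a + inv_phi * (b - a)
            f2 = revenue(x2, u)
        else:
            # The maximizer lies in [a, x2]; reuse x1 as the new right probe.
            b, x2, f2 = x2, x1, f1
            x1 = b - inv_phi * (b - a)
            f1 = revenue(x1, u)
    return (a + b) / 2.0


if __name__ == "__main__":
    for u in (0.0, 1.0, 2.0, 5.0):
        print(f"u = {u:.1f}  oracle price = {oracle_price(u):.4f}")
```

In the setting of the abstract neither the index nor the tail is known, so this map cannot be computed directly and has to be learned from purchase feedback; the sketch only illustrates the object being learned and why a unique interior maximizer makes it well defined.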
