A Probabilistic Sample Path Convergence Time Analysis of Drift-Plus-Penalty Algorithm for Stochastic Optimization

Abstract

This paper considers the problem of minimizing the time average of a controlled stochastic process subject to multiple time average constraints on other related processes. The probability distribution of the random events in the system is unknown to the controller. A typical application is time average power minimization subject to network throughput constraints for different users in a network with time varying channel conditions. We show that with probability at least $1-2\delta$, the classical drift-plus-penalty algorithm provides a sample path $\mathcal{O}(\varepsilon)$ approximation to optimality with a convergence time $\mathcal{O}\left(\frac{1}{\varepsilon^2}\max\left\{\log^2\frac{1}{\varepsilon}\log\frac{2}{\delta},~\log^3\frac{2}{\delta}\right\}\right)$, where $\varepsilon>0$ is a parameter related to the algorithm. When there is only one constraint, we further show that the convergence time can be improved to $\mathcal{O}\left(\frac{1}{\varepsilon^2}\log^2\frac{1}{\delta}\right)$.
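
For readers unfamiliar with the drift-plus-penalty algorithm, the following is a minimal illustrative sketch of the power minimization application mentioned above. It is not taken from the paper. The channel gain values, the on/off power levels, the rate targets, and the choice of penalty weight V = 1/eps are all assumptions made for illustration, and the function name `drift_plus_penalty` is hypothetical. Each user keeps a virtual queue that accumulates its throughput shortfall, and in every slot the controller picks the power that minimizes V times the power plus the queue-weighted constraint violation, using only the observed channel gain and no knowledge of its distribution.

```python
import random


def drift_plus_penalty(T, eps, rates, seed=0):
    """Simulate T slots; return (average total power, average rate per user).

    Assumed toy model (not from the paper): each user k sees an i.i.d.
    channel gain in {0.5, 1.0, 2.0}, may transmit with power 0 or 1,
    and must achieve a time average rate of at least rates[k].
    """
    rng = random.Random(seed)
    V = 1.0 / eps                  # assumed penalty weight tied to eps
    K = len(rates)
    Q = [0.0] * K                  # virtual queues, one per constraint
    power_levels = (0.0, 1.0)      # assumed action set
    total_power = 0.0
    total_rate = [0.0] * K

    for _ in range(T):
        # Random event observed by the controller in this slot.
        gains = [rng.choice((0.5, 1.0, 2.0)) for _ in range(K)]
        for k in range(K):
            # Minimize V * penalty + Q[k] * (constraint violation).
            p = min(
                power_levels,
                key=lambda x: V * x + Q[k] * (rates[k] - gains[k] * x),
            )
            served = gains[k] * p
            total_power += p
            total_rate[k] += served
            # Virtual queue update: grows when the rate target is missed.
            Q[k] = max(Q[k] + rates[k] - served, 0.0)

    return total_power / T, [r / T for r in total_rate]


if __name__ == "__main__":
    avg_power, avg_rates = drift_plus_penalty(T=20000, eps=0.05, rates=[0.5, 0.3])
    print("average total power:", avg_power)
    print("average rates:", avg_rates)
```

In this toy model a smaller eps (larger V) weights power more heavily and lets the virtual queues grow larger before the controller transmits, so more slots are needed before the time average constraints are nearly met. That is the kind of tradeoff between accuracy and convergence time that the bounds in the abstract quantify.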
