Reach-avoid semi-Markov decision processes with time-varying obstacles

Abstract

We consider the maximal finite-horizon reach-avoid probability to a target for semi-Markov decision processes with time-varying obstacles. Because the obstacle set varies with time, the model is non-homogeneous. To overcome this difficulty, we construct a related two-dimensional model and prove that the reach-avoid probability of the original model is equivalent to that of the two-dimensional one. For the two-dimensional model, we analyze some special characteristics of the equivalent reach-avoid probability. On this basis, we provide an improved value-type algorithm that computes the equivalent maximal reach-avoid probability and an ε-optimal policy. Finally, at the last step of the algorithm, the equivalence between the two models yields the maximal reach-avoid probability and an ε-optimal policy for the original model.
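To illustrate why a second state coordinate can help, the following is a minimal sketch and not the paper's algorithm. It assumes that the second coordinate tracks elapsed time (the abstract does not specify this), that sojourn times are positive integers, and that obstacles are checked only at decision epochs. Under those assumptions, the value over (state, time) pairs can be computed by an exact backward recursion. All names (`solve_reach_avoid`, `kernel`, `obstacles`, the toy states) are hypothetical.

```python
from functools import lru_cache

def solve_reach_avoid(kernel, actions, target, obstacles, horizon):
    """Toy reach-avoid solver on the time-augmented state (s, t).

    kernel[(s, a)] -> list of (prob, sojourn, next_state); sojourn is an
    integer >= 1 (an assumption of this sketch, so the recursion terminates).
    obstacles(t)   -> set of states forbidden at time t.
    """
    @lru_cache(maxsize=None)
    def value(s, t):
        # Past the horizon or inside an obstacle at time t: failure.
        if t > horizon or s in obstacles(t):
            return 0.0
        # Target reached in time: success.
        if s in target:
            return 1.0
        return max(q_value(s, a, t) for a in actions[s])

    def q_value(s, a, t):
        # Expected value of taking action a in state s at elapsed time t.
        return sum(p * value(s2, t + tau) for p, tau, s2 in kernel[(s, a)])

    def policy(s, t):
        # Greedy action with respect to the computed value.
        return max(actions[s], key=lambda a: q_value(s, a, t))

    return value, policy

# Hypothetical toy model: "m" is blocked only at time 1.
kernel = {
    ("s", "fast"): [(1.0, 1, "m")],
    ("s", "slow"): [(0.8, 2, "m"), (0.2, 2, "s")],
    ("m", "go"): [(1.0, 1, "g")],
}
actions = {"s": ["fast", "slow"], "m": ["go"]}
blocked = lambda t: {"m"} if t == 1 else set()

value, policy = solve_reach_avoid(kernel, actions, {"g"}, blocked, horizon=3)
free_value, free_policy = solve_reach_avoid(
    kernel, actions, {"g"}, lambda t: set(), horizon=3
)
```

In this toy model the best action at the start depends on when the obstacle is active: with "m" blocked at time 1 the slower action is preferred, while without obstacles the faster one is. A policy that looks only at the current state cannot express that dependence, which is the motivation for augmenting the state.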
