Abstract Dynamic Programming, 2nd Edition, ISBN 978-1-886529-46-5
Description
A research monograph providing a synthesis of classical research on the foundations of dynamic programming with the modern theory of approximate dynamic programming and new research on semicontractive models.
It aims at a unified and economical development of the core theory and algorithms of total cost sequential decision problems, based on the strong connections of the subject with fixed point theory. The analysis focuses on the abstract mapping that underlies dynamic programming and defines the mathematical character of the associated problem. The discussion centers on two fundamental properties that this mapping may have: monotonicity and (weighted sup-norm) contraction. It turns out that the nature of the analytical and algorithmic DP theory is determined primarily by the presence or absence of these two properties, and the rest of the problem's structure is largely inconsequential. New research is focused on two areas: (1) the ramifications of these properties in the context of algorithms for approximate dynamic programming, and (2) the new class of semicontractive models, exemplified by stochastic shortest path problems, where some but not all policies are contractive.
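As an informal illustration of the two properties named above, the following minimal Python sketch applies the Bellman mapping of a small discounted Markov decision problem, a standard special case in which the mapping is both monotone and a sup-norm contraction. It is not taken from the book: the two-state problem data, the discount factor ALPHA, and the function names bellman, sup_dist, and value_iteration are all assumptions made up for this example.

```python
# Illustrative sketch only: a hypothetical 2-state, 2-action discounted problem.
# All numbers and names below are assumptions, not data from the book.

ALPHA = 0.9  # discount factor; the mapping below is then a sup-norm contraction of modulus 0.9

# COST[x][u]: one-stage cost of action u in state x
COST = [[1.0, 2.0], [0.5, 3.0]]
# PROB[x][u][y]: probability of moving from state x to state y under action u
PROB = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.9, 0.1]],
]


def bellman(J):
    """Return TJ, where (TJ)(x) = min_u [ g(x,u) + ALPHA * sum_y p(y|x,u) * J(y) ]."""
    return [
        min(
            COST[x][u] + ALPHA * sum(p * J[y] for y, p in enumerate(PROB[x][u]))
            for u in range(len(COST[x]))
        )
        for x in range(len(COST))
    ]


def sup_dist(J1, J2):
    """Sup-norm distance between two cost vectors."""
    return max(abs(a - b) for a, b in zip(J1, J2))


def value_iteration(J, tol=1e-10, max_iter=10_000):
    """Iterate J <- TJ until successive iterates differ by less than tol in sup-norm."""
    for _ in range(max_iter):
        TJ = bellman(J)
        if sup_dist(TJ, J) < tol:
            return TJ
        J = TJ
    return J
```

In this sketch, contraction means that sup_dist(bellman(J1), bellman(J2)) never exceeds ALPHA * sup_dist(J1, J2), and monotonicity means that J1 <= J2 componentwise implies bellman(J1) <= bellman(J2) componentwise. Calling value_iteration([0.0, 0.0]) returns an approximate fixed point of the mapping, that is, an approximate solution of Bellman's equation for this toy problem.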
The 2nd edition aims primarily to amplify the presentation of the semicontractive models of Chapter 3 and Chapter 4 of the first (2013) edition, and to supplement it with a broad spectrum of research results that I obtained and published in journals and reports since the first edition was written (see below). As a result, the size of this material more than doubled, and the size of the book increased by nearly 40%.
Contents:
1. Introduction
1.1. Structure of Dynamic Programming Problems
1.2. Abstract Dynamic Programming Models
1.3. Organization of the Book
1.4. Notes, Sources, and Exercises
2. Contractive Models
2.1. Bellman’s Equation and Optimality Conditions
2.2. Limited Lookahead Policies
2.3. Value Iteration
2.4. Policy Iteration
2.5. Optimistic Policy Iteration and λ-Policy Iteration
2.6. Asynchronous Algorithms
2.7. Notes, Sources, and Exercises
3. Semicontractive Models
3.1. Pathologies of Noncontractive DP Models
3.2. Semicontractive Models and Regular Policies
3.3. Irregular Policies/Infinite Cost Case
3.4. Irregular Policies/Finite Cost Case - A Perturbation
3.5. Applications in Shortest Path and Other Contexts
3.6. Algorithms
3.7. Notes, Sources, and Exercises
4. Noncontractive Models
4.1. Noncontractive Models - Problem Formulation
4.2. Finite Horizon Problems
4.3. Infinite Horizon Problems
4.4. Regularity and Nonstationary Policies
4.5. Stable Policies for Deterministic Optimal Control
4.6. Infinite-Spaces Stochastic Shortest Path Problems
4.7. Notes, Sources, and Exercises
Appendix A: Notation and Mathematical Conventions
Appendix B: Contraction Mappings