Contents
1. Introduction . . . . _ _ p. 1
I.I. Structurc of Dynamic Programming Problems . _ p. 2
1.2. Abstract Dynamic Programming Moclels _ _ p. 5
1.2.1. Problem Formulation . . _ p. 5
1.2.2. Monotonicity and Contraction Assumptions _ p. 7
1.2.3. Some Examples _ _ p. 9
1.2.4. Approximation-Related Mappings _ p. 21
1.3. Organization of the Book _ _ p. 23
1.4. Notes. Sources. and Exercises _ _ p. 25
2. Contractive Models _ _ p. 29
2.1. Fixed Point Equation and Optimality Conditions _ p. 30
2.2. Limited Lookahead Policies _ _ p. 37
2.3. Value Iteration . _ p. 42
2.3.1. Approximate Value Iteration . . . . . . . p. 43
2.4. Policy Iteration . _ p. 46
2.4.1. Approximate Policy Iteration . . . . . . p. 48
2.5. Optimistic Policy Iteration _ _ p. 52
2.5.1. Convergence of Optimistic Policy Iteration _ p. 52
2.5.2. Approximate Optimistic Policy Iteration . _ p. 57
2.6. Asynchronous Algorithms _ - p. 61
2.6.1. Asynchronous Value Iteration . . . . . . p. 61
2.6.2. Asynchronous Policy Iteration . . . . . . p. 67
2.6.3. Policy Iteration with a Uniform Fixed Point _ p. 72
2.7. Notes. Sources£¬ and Excrcises _ _ p. 79
3. Semicontractive Models _ _ p. 85
3.1. Semicontractive Models and Regular Policies _ _ p. 86
3.1.1. Fixed Points, Optimality Conditions, and
Algorithmic Results _ _ p. 90
3.1.2. Illustrative Example: Deterministic Shortest
Path Problems _ p. 97
3.2. Irregular Policies and a Perturbation Approach - p. 100
3.2.1. The Casc Where Irregular Policies Have Infinite
Cost _ p. 100
3.2.2. The Case Where Irregular Policies Have Finite
Cost Perturbations _ _ p. 107
3.3. Algorithms _ - p. 116
3.3.1. Asynchronous Value Iteration . . . . . p. 117
3.3.2. Asynchronous Policy Iteration . . . . . p. 118
3.3.3. Policy Iteration with Perturbations _ p. 124
3.4. Notes. Sources. and Exercises _ p. 125
4. Noncontractive Models _ _ p. 129
4.1. Noncontractive Models . _ p. 130
4.2. Finite Horizon Problems . _ p. 133
4.3. Infinite Horizon Problems _ p. 139
4.3.1. Fixed Point Properties and Optimality Conditions _ p. 143
4.3.2. Value Iteration . . . _ p. 154
4.3.3. Policy Iteration . . _ p. 157
4.4. Semicontractive-Monotone Increasing Models _ p. 163
4.4.1. Value and Policy Iteration Algorithms _ p. 163
4.4.2. Some Applications . - p. 166
4.4.3. Linear-Quadratic Problems _ p. 168
4.5. Affine Monotonic Models _ p. 171
4.5.1. Increasing Affine Monotonic Models _ p. 172
4.5.2. Nonincreasing Affine Monotonic Models _ p. 173
4.5.3. Exponential Cost Stochastic Shortest Path
Problems _ _ p. 175
4.6. An Overview of Semicontractive Models and Results _ p. 179
4.7. Notes. Sources. and Exercises _ p. 179
5. Models with Restricted Policies _ _ p. 187
5.1. A Framework for Restricted Policies _ p. 188
5.1.1. General Assumptions _ _ p. 192
5.2. Finite Horizon Problems . _ p. 196
5.3. Contractive Models _ _ p. 198
5.4. Borel Space Models _ _ p. 200
5.5. Notes. Sources. and Exercises _ p. 201
Appendix A: Notation and Mathematical Conventions . p. 203
Appendix B: Contraction Mappings _ p. 207
Appendix C: Measure Theoretic Issues _ p. 216
Appendix D: Solutions of Exercises _ p. 230
References _ _ _ p. 241
Index _ _ _ p. 247