Continuous-time first-arrival target discounted moment optimal models with countable state and action spaces are investigated. A general formula for the k-th moment of the total discounted return is given. A relation between the continuous-time return and the associated discrete-time quasi-discounted return is established. It is shown that the moment optimality equation has a unique bounded solution under a rather weak condition. Some properties of optimal policies are discussed.