Ergodic risk-sensitive control of Markov processes on countable state space revisited

Biswas, Anup; Pradhan, Somnath

doi:10.1051/cocv/2022018

ESAIM: Control, Optimisation and Calculus of Variations, Volume 28 (2022), article no. 26

Abstract

We consider a large family of discrete- and continuous-time controlled Markov processes and study an ergodic risk-sensitive minimization problem. Under a blanket stability assumption, we provide a complete analysis of this problem. In particular, we establish uniqueness of the value function and a verification result for optimal stationary Markov controls, in addition to existence results. We also revisit this problem under a near-monotonicity condition but without any stability hypothesis. Our results also include policy improvement algorithms in both the discrete- and continuous-time frameworks.
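To make the phrase "policy improvement algorithm" concrete, the following is a minimal sketch for a toy finite-state, discrete-time model. It is not the authors' algorithm, which is set on a countable state space under stability or near-monotonicity assumptions. The sketch assumes strictly positive transition probabilities, so that for a stationary policy v the risk-sensitive ergodic cost equals the logarithm of the principal eigenvalue of the matrix with entries exp(c(i, v(i))) P(j | i, v(i)). All names (`principal_eigenpair`, `twisted_kernel`, `policy_value`, `policy_improvement`, `P_DEMO`, `C_DEMO`) and the numerical data are hypothetical.

```python
import math

def principal_eigenpair(M, iters=10000, tol=1e-12):
    """Power iteration for a square matrix with strictly positive entries.
    Returns (eigenvalue, eigenvector), the eigenvector normalised so psi[0] == 1."""
    n = len(M)
    psi = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        new = [sum(M[i][j] * psi[j] for j in range(n)) for i in range(n)]
        new_lam = new[0]                      # equals (M psi)[0] / psi[0] since psi[0] == 1
        new = [x / new_lam for x in new]
        done = (abs(new_lam - lam) < tol and
                max(abs(a - b) for a, b in zip(new, psi)) < tol)
        psi, lam = new, new_lam
        if done:
            break
    return lam, psi

def twisted_kernel(P, c, policy):
    """Matrix with entries exp(c[i][u]) * P[u][i][j], where u = policy[i]."""
    n = len(policy)
    return [[math.exp(c[i][policy[i]]) * P[policy[i]][i][j] for j in range(n)]
            for i in range(n)]

def policy_value(P, c, policy):
    """Risk-sensitive ergodic cost of a stationary policy (finite, positive kernel)."""
    lam, _ = principal_eigenpair(twisted_kernel(P, c, policy))
    return math.log(lam)

def policy_improvement(P, c, max_rounds=100):
    """Alternate policy evaluation (eigenpair) and greedy improvement."""
    n_actions, n = len(P), len(P[0])
    policy = [0] * n
    for _ in range(max_rounds):
        lam, psi = principal_eigenpair(twisted_kernel(P, c, policy))
        new_policy = []
        for i in range(n):
            vals = [math.exp(c[i][u]) * sum(P[u][i][j] * psi[j] for j in range(n))
                    for u in range(n_actions)]
            best = min(range(n_actions), key=lambda u: vals[u])
            # Keep the current action on (near) ties so the iteration cannot cycle.
            if vals[policy[i]] <= vals[best] + 1e-12:
                best = policy[i]
            new_policy.append(best)
        if new_policy == policy:
            break
        policy = new_policy
    return policy, math.log(lam), psi

# Hypothetical data: P_DEMO[u][i][j] = P(j | i, u), C_DEMO[i][u] = running cost.
P_DEMO = [[[0.9, 0.1], [0.5, 0.5]],
          [[0.2, 0.8], [0.1, 0.9]]]
C_DEMO = [[1.0, 0.5],
          [0.2, 2.0]]
```

Calling `policy_improvement(P_DEMO, C_DEMO)` returns a stationary policy, its risk-sensitive ergodic cost, and the associated eigenvector. In this toy setting the returned cost can be compared against `policy_value` evaluated on each of the four deterministic stationary policies.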


Classification: 90C40, 91B06, 60J10
Keywords: Risk-sensitive control, ergodic cost criterion, stochastic representation, verification result, Markov decision problem, near-monotone cost

@article{COCV_2022__28_1_A26_0,
     author = {Biswas, Anup and Pradhan, Somnath},
     title = {Ergodic risk-sensitive control of {Markov} processes on countable state space revisited},
     journal = {ESAIM: Control, Optimisation and Calculus of Variations},
     year = {2022},
     publisher = {EDP-Sciences},
     volume = {28},
     doi = {10.1051/cocv/2022018},
     mrnumber = {4429406},
     zbl = {1493.90218},
     language = {en},
     url = {https://www.numdam.org/articles/10.1051/cocv/2022018/}
}

TY  - JOUR
AU  - Biswas, Anup
AU  - Pradhan, Somnath
TI  - Ergodic risk-sensitive control of Markov processes on countable state space revisited
JO  - ESAIM: Control, Optimisation and Calculus of Variations
PY  - 2022
VL  - 28
PB  - EDP-Sciences
UR  - https://www.numdam.org/articles/10.1051/cocv/2022018/
DO  - 10.1051/cocv/2022018
LA  - en
ID  - COCV_2022__28_1_A26_0
ER  -

%0 Journal Article
%A Biswas, Anup
%A Pradhan, Somnath
%T Ergodic risk-sensitive control of Markov processes on countable state space revisited
%J ESAIM: Control, Optimisation and Calculus of Variations
%D 2022
%V 28
%I EDP-Sciences
%U https://www.numdam.org/articles/10.1051/cocv/2022018/
%R 10.1051/cocv/2022018
%G en
%F COCV_2022__28_1_A26_0

Biswas, Anup; Pradhan, Somnath. Ergodic risk-sensitive control of Markov processes on countable state space revisited. ESAIM: Control, Optimisation and Calculus of Variations, Volume 28 (2022), article no. 26. doi: 10.1051/cocv/2022018
