A policy iteration method for mean field games

Cacace, Simone; Camilli, Fabio; Goffi, Alessandro

doi:10.1051/cocv/2021081

ESAIM: Control, Optimisation and Calculus of Variations, Volume 27 (2021), article no. 85

Abstract

The policy iteration method is a classical algorithm for solving optimal control problems. In this paper, we introduce a policy iteration method for Mean Field Games systems, and we study the convergence of this procedure to a solution of the problem. We also introduce suitable discretizations to numerically solve both stationary and evolutive problems. We show the convergence of the policy iteration method for the discrete problem and we study the performance of the proposed algorithm on some examples in dimension one and two.
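
For orientation, the classical policy iteration (Howard's algorithm) that the abstract refers to can be sketched on a finite, discounted Markov decision problem. This is a minimal illustration only, not the Mean Field Games scheme of the paper: the three-state toy problem, the discount factor, and every name below (`solve`, `policy_iteration`, `P`, `c`) are assumptions made for the example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def policy_iteration(P, c, gamma, max_iter=100):
    """Classical policy iteration for a finite discounted cost-minimization MDP.

    P[a][i][j]: probability of moving from state i to j under action a.
    c[a][i]:    running cost of action a in state i.
    """
    n_actions, n = len(P), len(c[0])
    policy = [0] * n  # arbitrary initial policy
    v = [0.0] * n
    for _ in range(max_iter):
        # Policy evaluation: solve the linear system (I - gamma * P_pi) v = c_pi.
        A = [[(1.0 if i == j else 0.0) - gamma * P[policy[i]][i][j]
              for j in range(n)] for i in range(n)]
        b = [c[policy[i]][i] for i in range(n)]
        v = solve(A, b)
        # Policy improvement: pick the action minimizing the one-step cost-to-go.
        new_policy = []
        for i in range(n):
            q = [c[a][i] + gamma * sum(P[a][i][j] * v[j] for j in range(n))
                 for a in range(n_actions)]
            new_policy.append(min(range(n_actions), key=lambda a: q[a]))
        if new_policy == policy:  # policy is stable: stop
            break
        policy = new_policy
    return policy, v


# Hypothetical toy problem: states 0, 1, 2 on a line, target state 2.
# Action 0 moves left, action 1 moves right (clipped at the ends).
P = [
    [[1, 0, 0], [1, 0, 0], [0, 1, 0]],  # left
    [[0, 1, 0], [0, 0, 1], [0, 0, 1]],  # right
]
c = [[2, 1, 0], [2, 1, 0]]  # running cost = distance to the target state
policy, v = policy_iteration(P, c, gamma=0.9)
```

On this toy problem the iteration stabilizes on the policy that always moves right, with value approximately [2.9, 1.0, 0.0]. Each sweep alternates a linear solve for the current policy with a pointwise minimization, which is the structure a policy iteration for a coupled system would have to extend.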

Received: 2020-12-21
Accepted: 2021-07-09
First published: 2021-01-28
Published: 2021-07-27

Classification: 49N70, 35Q91, 91A16, 49M15
Keywords: Mean Field Games, policy iteration, convergence, numerical methods

@article{COCV_2021__27_1_A87_0,
     author = {Cacace, Simone and Camilli, Fabio and Goffi, Alessandro},
     title = {A policy iteration method for mean field games},
     journal = {ESAIM: Control, Optimisation and Calculus of Variations},
     year = {2021},
     publisher = {EDP-Sciences},
     volume = {27},
     doi = {10.1051/cocv/2021081},
     language = {en},
     url = {https://www.numdam.org/articles/10.1051/cocv/2021081/}
}

TY  - JOUR
AU  - Cacace, Simone
AU  - Camilli, Fabio
AU  - Goffi, Alessandro
TI  - A policy iteration method for mean field games
JO  - ESAIM: Control, Optimisation and Calculus of Variations
PY  - 2021
VL  - 27
PB  - EDP-Sciences
UR  - https://www.numdam.org/articles/10.1051/cocv/2021081/
DO  - 10.1051/cocv/2021081
LA  - en
ID  - COCV_2021__27_1_A87_0
ER  -

%0 Journal Article
%A Cacace, Simone
%A Camilli, Fabio
%A Goffi, Alessandro
%T A policy iteration method for mean field games
%J ESAIM: Control, Optimisation and Calculus of Variations
%D 2021
%V 27
%I EDP-Sciences
%U https://www.numdam.org/articles/10.1051/cocv/2021081/
%R 10.1051/cocv/2021081
%G en
%F COCV_2021__27_1_A87_0

Cacace, Simone; Camilli, Fabio; Goffi, Alessandro. A policy iteration method for mean field games. ESAIM: Control, Optimisation and Calculus of Variations, Volume 27 (2021), article no. 85. doi: 10.1051/cocv/2021081
