Optimization with learning-informed differential equation constraints and its applications
ESAIM: Control, Optimisation and Calculus of Variations, Volume 28 (2022), article no. 3

Inspired by applications in the optimal control of semilinear elliptic partial differential equations and in physics-integrated imaging, differential-equation-constrained optimization problems with constituents that are accessible only through data-driven techniques are studied. A particular focus is on the analysis of, and on numerical methods for, problems with machine-learned components. For a rather general context, an error analysis is provided, and particular properties resulting from artificial-neural-network-based approximations are addressed. Moreover, for each of the two inspiring applications, analytical details are presented and numerical results are provided.
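To fix ideas, the following is a minimal sketch of the problem class studied, assuming the prototypical semilinear tracking-type model suggested by the title and keywords (the data g, the weight α, and the learned network \mathcal{N} are illustrative notation, not necessarily that of the paper): an unknown nonlinearity f in the state equation, accessible only through data, is replaced by a trained neural network \mathcal{N}, and one solves

% Sketch of a learning-informed PDE-constrained problem (illustrative notation)
\begin{align*}
  \min_{(y,u)}\ & \tfrac{1}{2}\|y - g\|_{L^2(\Omega)}^2 + \tfrac{\alpha}{2}\|u\|_{L^2(\Omega)}^2 \\
  \text{s.t.}\ & -\Delta y + \mathcal{N}(x,y) = u \ \text{in } \Omega, \qquad y = 0 \ \text{on } \partial\Omega.
\end{align*}

The error analysis mentioned in the abstract then quantifies, roughly speaking, how the approximation error between \mathcal{N} and the true constituent f propagates to the optimal states and controls.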

DOI: 10.1051/cocv/2021100
Classification: 49M15, 65J15, 65J20, 65K10, 90C30, 35J61, 68T07
Keywords: Optimal control, semilinear PDEs, integrated physics-based imaging, learning-informed model, artificial neural network, quantitative MRI, semi-smooth Newton SQP algorithm
@article{COCV_2022__28_1_A3_0,
     author = {Dong, Guozhi and Hinterm\"uller, Michael and Papafitsoros, Kostas},
     title = {Optimization with learning-informed differential equation constraints and its applications},
     journal = {ESAIM: Control, Optimisation and Calculus of Variations},
     year = {2022},
     publisher = {EDP-Sciences},
     volume = {28},
     doi = {10.1051/cocv/2021100},
     mrnumber = {4362194},
     zbl = {1481.49028},
     language = {en},
     url = {https://www.numdam.org/articles/10.1051/cocv/2021100/}
}
Dong, Guozhi; Hintermüller, Michael; Papafitsoros, Kostas. Optimization with learning-informed differential equation constraints and its applications. ESAIM: Control, Optimisation and Calculus of Variations, Volume 28 (2022), article no. 3. doi: 10.1051/cocv/2021100

[1] R. A. Adams and J. J. F. Fournier, Sobolev Spaces. Volume 140 of Pure and Applied Mathematics (Amsterdam). Elsevier/Academic Press, Amsterdam (2003), second edition. | MR | Zbl

[2] J. Adler and O. Öktem, Solving ill-posed inverse problems using iterative deep neural networks. Inverse Probl. 33 (2017) 124007. | MR | Zbl | DOI

[3] C. D. Aliprantis and K. C. Border, Infinite Dimensional Analysis: A Hitchhiker's Guide. Springer (2006). | MR | Zbl

[4] S. Arridge, P. Maass, O. Öktem and C. B. Schönlieb, Solving inverse problems using data-driven models. Acta Numer. 28 (2019) 1–174. | MR | Zbl | DOI

[5] H. Attouch, G. Buttazzo and G. Michaille, Variational analysis in Sobolev and BV spaces, volume 17 of MOS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA; Mathematical Optimization Society, Philadelphia, PA (2014), second edition. | MR | Zbl

[6] T. Bachlechner, B. P. Majumder, H. H. Mao, G. W. Cottrell and J. McAuley, ReZero is all you need: fast convergence at large depth. Preprint arXiv (2020).

[7] B. Baker, O. Gupta, N. Naik and R. Raskar, Designing neural network architectures using reinforcement learning. In International Conference on Learning Representations (2017) 1–18. https://openreview.net/pdf?id=S1c2cvqee.

[8] F. Balsiger, A. Shridhar Konar, S. Chikop, V. Chandran, O. Scheidegger, S. Geethanath and M. Reyes, Magnetic resonance fingerprinting reconstruction via spatiotemporal convolutional neural networks. In Machine Learning for Medical Image Reconstruction. MLMIR 2018, edited by D. Rueckert, F. Knoll and A. Maier. Volume 11074 of LNCS. Springer, Cham (2018) 39–46. | DOI

[9] F. Bloch, Nuclear induction. Phys. Rev. 70 (1946) 460–473. | DOI

[10] L. Bottou, F. E. Curtis and J. Nocedal, Optimization methods for large-scale machine learning. SIAM Rev. 60 (2018) 223–311. | MR | Zbl | DOI

[11] A. Braides, Convergence of local minimizers. In Local Minimization, Variational Evolution and Γ-Convergence. Springer (2014) 67–78. | MR | Zbl | DOI

[12] Brainweb: Simulated brain database. http://www.bic.mni.mcgill.ca/brainweb/.

[13] L. Bungert, R. Raab, T. Roith, L. Schwinn and D. Tenbrinck, CLIP: Cheap Lipschitz training of neural networks. Preprint (2021). | arXiv | Zbl | DOI

[14] D. L. Collins, A. P. Zijdenbos, V. Kollokian, J. G. Sled, N. J. Kabani, C. J. Holmes and A. C. Evans, Design and construction of a realistic digital brain phantom. IEEE Trans. Med. Imag. 17 (1998) 463–468. | DOI

[15] W. M. Czarnecki, S. Osindero, M. Jaderberg, G. Swirszcz and R. Pascanu, Sobolev training for neural networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17 (2017) 4281–4290.

[16] G. Dal Maso, Introduction to Γ-convergence. Birkhäuser (1993). | MR | Zbl | DOI

[17] H. Daniels and M. Velikova, Monotone and partially monotone neural networks. IEEE Trans. Neural Netw. 21 (2010) 906–917. | DOI

[18] M. Davies, G. Puy, P. Vandergheynst and Y. Wiaux, A compressed sensing framework for magnetic resonance fingerprinting. SIAM J. Imag. Sci. 7 (2014) 2623–2656. | MR | Zbl | DOI

[19] S. P. Dirkse and M. C. Ferris, The PATH solver: a non-monotone stabilization scheme for mixed complementarity problems. Optim. Methods Softw. 5 (1995) 123–156. | DOI

[20] G. Dong, M. Hintermüller and K. Papafitsoros, Quantitative magnetic resonance imaging: From fingerprinting to integrated physics-based models. SIAM J. Imag. Sci. 12 (2019). | MR | Zbl | DOI

[21] Weinan E, A proposal on machine learning via dynamical systems. Commun. Math. Stat. 5 (2017) 1–11. | MR | Zbl | DOI

[22] H. W. Engl, M. Hanke and A. Neubauer, Regularization of Inverse Problems. Vol. 375 of Mathematics and its Applications. Kluwer Academic Publishers, Dordrecht (1996). https://www.springer.com/gp/book/9780792341574.

[23] L. C. Evans, Partial differential equations. Vol. 19 of Graduate Studies in Mathematics. American Mathematical Society, second edition (2010). | MR | Zbl

[24] H. O. Fattorini, Infinite-dimensional optimization and control theory. Vol. 62 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge (1999). | MR | Zbl

[25] I. Goodfellow, Y. Bengio and A. Courville, Deep Learning. MIT Press (2016). | MR | Zbl

[26] I. Gühring, G. Kutyniok and P. Petersen, Error bounds for approximations with deep ReLU neural networks in W^{s,p} norms. Anal. Appl. 18 (2020) 803–859. | MR | Zbl | DOI

[27] E. Haber and L. Ruthotto, Stable architectures for deep neural networks. Inverse Probl. 34 (2018) 014004. | MR | Zbl | DOI

[28] J. Han, A. Jentzen and Weinan E, Solving high-dimensional partial differential equations using deep learning. Proc. Natl. Acad. Sci. 115 (2018) 8505–8510. | MR | Zbl | DOI

[29] M. Hanke, The regularizing Levenberg-Marquardt scheme is of optimal order. J. Integr. Equ. Appl. 22 (2010) 259–283. | MR | Zbl | DOI

[30] K. He, X. Zhang, S. Ren and J. Sun, Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016). https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf.

[31] M. Hintermüller, Mesh independence and fast local convergence of a primal-dual active-set method for mixed control-state constrained elliptic control problems. ANZIAM J. 49 (2007) 1–38. | MR | Zbl | DOI

[32] M. Hintermüller, K. Ito and K. Kunisch, The primal-dual active set strategy as a semismooth Newton method. SIAM J. Optim. 13 (2002) 865–888. | MR | Zbl | DOI

[33] M. Hintermüller and K. Kunisch, Feasible and noninterior path-following in constrained minimization with low multiplier regularity. SIAM J. Control Optim. 45 (2006) 1198–1221. | MR | Zbl | DOI

[34] M. Hintermüller and M. Ulbrich, A mesh-independence result for semismooth Newton methods. Math. Program. Ser. B 101 (2004) 151–184. | MR | Zbl | DOI

[35] M. Leshno, V. Y. Lin, A. Pinkus and S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Netw. 6 (1993) 861–867. | DOI

[36] J.-L. Lions, Optimal control of systems governed by partial differential equations. Translated from the French by S. K. Mitter. Die Grundlehren der mathematischen Wissenschaften, Band 170. Springer-Verlag, New York-Berlin (1971). | MR | Zbl

[37] X. Liu, X. Han, N. Zhang and Q. Liu, Certified monotonic neural networks. In Volume 33 of Advances in Neural Information Processing Systems, edited by H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (2020) 15427–15438.

[38] Z. Long, Y. Lu, X. Ma and B. Dong, PDE-Net: Learning PDEs from data. Proc. Mach. Learn. Res. 80 (2018) 3208–3216.

[39] S. Lu and J. Flemming, Convergence rate analysis of Tikhonov regularization for nonlinear ill-posed problems with noisy operators. Inverse Probl. 28 (2012) 104003. | MR | Zbl | DOI

[40] D. Ma, V. Gulani, N. Seiberlich, K. Liu, J. Sunshine, J. L. Duerk and M. A. Griswold, Magnetic resonance fingerprinting. Nature 495 (2013) 187–192. | DOI

[41] D. J. C. Mackay, Bayesian interpolation. Neural Comput. 4 (1992) 415–447. | DOI

[42] G. Mazor, L. Weizman, A. Tal and Y. C. Eldar, Low-rank magnetic resonance fingerprinting. Med. Phys. 45 (2018) 4066–4084. | DOI

[43] P. Neittaanmäki, J. Sprekels and D. Tiba, Optimization of Elliptic Systems: Theory and Applications. Springer Monographs in Mathematics. Springer, New York (2006). | MR | Zbl

[44] D. Nguyen and B. Widrow, Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights. 1990 IJCNN International Joint Conference on Neural Networks 3 (1990) 21–26. | DOI

[45] A. Pinkus, Approximation theory of the MLP model in neural networks. Acta Numer. 8 (1999) 143–195. | MR | Zbl | DOI

[46] M. J. D. Powell, A view of unconstrained optimization. In Optimization in Action, edited by L. C. W. Dixon. Academic Press, London and New York (1976) 117–152. | MR

[47] T. Qin, K. Wu and D. Xiu, Data driven governing equations approximation using deep neural networks. J. Comput. Phys. 395 (2019) 620–635. | MR | Zbl | DOI

[48] D. Ralph, Global convergence of damped Newton’s method for nonsmooth equations via the path search. Math. Oper. Res. 19 (1994) 352–389. | MR | Zbl | DOI

[49] K. Scheffler, A pictorial description of steady-states in rapid magnetic resonance imaging. Concepts Magn. Reson. 11 (1999) 291–304. | DOI

[50] J. Sirignano and K. Spiliopoulos, DGM: A deep learning algorithm for solving partial differential equations. J. Comput. Phys. 375 (2018) 1339–1364. | MR | Zbl | DOI

[51] A. Sivaraman, G. Farnadi, T. Millstein and G. Van den Broeck, Counterexample-guided learning of monotonic neural networks. Preprint (2020). | arXiv

[52] G. Teschl, Ordinary Differential Equations and Dynamical Systems. Volume 140 of Graduate Studies in Mathematics. American Mathematical Society, first edition (2012). | MR | Zbl | DOI

[53] F. Tröltzsch, Optimal Control of Partial Differential Equations: Theory, Methods and Applications. Vol. 112 of Graduate Studies in Mathematics. American Mathematical Society (2010). | MR | Zbl | DOI

[54] J. Zowe and S. Kurcyusz, Regularity and stability for the mathematical programming problem in Banach spaces. Appl. Math. Optim. 5 (1979) 49–62. | MR | Zbl | DOI

This work is supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – The Berlin Mathematics Research Center MATH+ (EXC-2046/1, project ID: 390685689). The work of MH is partially supported by the DFG SPP 1962, project-145r. The work of GD is partially supported by an NSFC grant (No. 12001194).

The authors acknowledge the support of Tsinghua–Sanya International Mathematical Forum (TSIMF), as some of the ideas in the paper were discussed there while all the authors attended the workshop on “Efficient Algorithms in Data Science, Learning and Computational Physics” in January 2020.