Iterative Linear Quadratic Optimization for Nonlinear Control: Differentiable Programming Algorithmic Templates
Open Journal of Mathematical Optimization, Tome 5 (2024), article no. 8, 63 p.

Iterative optimization algorithms depend on access to information about the objective function. In a differentiable programming framework, this information, such as gradients, can be automatically derived from the computational graph. We explore how nonlinear control algorithms, often employing linear and/or quadratic approximations, can be effectively cast within this framework. Our approach illuminates shared components and differences between gradient descent, Gauss–Newton, Newton, and differential dynamic programming methods in the context of discrete time nonlinear control. Furthermore, we present line-search strategies and regularized variants of these algorithms, along with a comprehensive analysis of their computational complexities. We study the performance of the aforementioned algorithms on various nonlinear control benchmarks, including autonomous car racing simulations using a simplified car model. All implementations are publicly available in a package coded in a differentiable programming language.

Reçu le :
Révisé le :
Accepté le :
Publié le :
DOI : 10.5802/ojmo.32
Keywords: Nonlinear Discrete Time Control, Differentiable Programming, Newton, Gauss–Newton, Dynamic Differentiable Programming

Roulet, Vincent  1   ; Srinivasa, Siddhartha  2   ; Fazel, Maryam  3   ; Harchaoui, Zaid  4

1 Google Brain, Seattle, USA (Work completed at the University of Washington before joining Google)
2 Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
3 Department of Electrical and Computer Engineering, University of Washington, Seattle, USA
4 Department of Statistics University of Washington, Seattle, USA
Licence : CC-BY 4.0
Droits d'auteur : Les auteurs conservent leurs droits
@article{OJMO_2024__5__A9_0,
     author = {Roulet, Vincent and Srinivasa, Siddhartha and Fazel, Maryam and Harchaoui, Zaid},
     title = {Iterative {Linear} {Quadratic} {Optimization} for {Nonlinear} {Control:}  {Differentiable} {Programming} {Algorithmic} {Templates}},
     journal = {Open Journal of Mathematical Optimization},
     eid = {8},
     pages = {1--63},
     year = {2024},
     publisher = {Universit\'e de Montpellier},
     volume = {5},
     doi = {10.5802/ojmo.32},
     mrnumber = {4822314},
     language = {en},
     url = {https://www.numdam.org/articles/10.5802/ojmo.32/}
}
TY  - JOUR
AU  - Roulet, Vincent
AU  - Srinivasa, Siddhartha
AU  - Fazel, Maryam
AU  - Harchaoui, Zaid
TI  - Iterative Linear Quadratic Optimization for Nonlinear Control:  Differentiable Programming Algorithmic Templates
JO  - Open Journal of Mathematical Optimization
PY  - 2024
SP  - 1
EP  - 63
VL  - 5
PB  - Université de Montpellier
UR  - https://www.numdam.org/articles/10.5802/ojmo.32/
DO  - 10.5802/ojmo.32
LA  - en
ID  - OJMO_2024__5__A9_0
ER  - 
%0 Journal Article
%A Roulet, Vincent
%A Srinivasa, Siddhartha
%A Fazel, Maryam
%A Harchaoui, Zaid
%T Iterative Linear Quadratic Optimization for Nonlinear Control:  Differentiable Programming Algorithmic Templates
%J Open Journal of Mathematical Optimization
%] 8
%D 2024
%P 1-63
%V 5
%I Université de Montpellier
%U https://www.numdam.org/articles/10.5802/ojmo.32/
%R 10.5802/ojmo.32
%G en
%F OJMO_2024__5__A9_0
Roulet, Vincent; Srinivasa, Siddhartha; Fazel, Maryam; Harchaoui, Zaid. Iterative Linear Quadratic Optimization for Nonlinear Control:  Differentiable Programming Algorithmic Templates. Open Journal of Mathematical Optimization, Tome 5 (2024), article  no. 8, 63 p.. doi: 10.5802/ojmo.32

[1] Abadi, Martín; Agarwal, Ashish; Barham, Paul; Brevdo, Eugene; Chen, Zhifeng; Citro, Craig; Corrado, Greg S.; Davis, Andy; Dean, Jeffrey; Devin, Matthieu; Ghemawat, Sanjay; Goodfellow, Ian; Harp, Andrew; Irving, Geoffrey; Isard, Michael; Jia, Yangqing; Jozefowicz, Rafal; Kaiser, Lukasz; Kudlur, Manjunath; Levenberg, Josh; Mané, Dan; Monga, Rajat; Moore, Sherry; Murray, Derek; Olah, Chris; Schuster, Mike; Shlens, Jonathon; Steiner, Benoit; Sutskever, Ilya; Talwar, Kunal; Tucker, Paul; Vanhoucke, Vincent; Vasudevan, Vijay; Viégas, Fernanda; Vinyals, Oriol; Warden, Pete; Wattenberg, Martin; Wicke, Martin; Yu, Yuan; Zheng, Xiaoqiang TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, 2015 (https://tensorflow.org/)

[2] Andersson, Joel A. E.; Gillis, Joris; Horn, Greg; Rawlings, James; Diehl, Moritz CasADi – A software framework for nonlinear optimization and optimal control, Math. Program. Comput., Volume 11 (2019), pp. 1-36 | MR | DOI | Zbl

[3] Bambade, Antoine; El-Kazdadi, Sarah; Taylor, Adrien; Carpentier, Justin PROX-QP: Yet another quadratic programming solver for robotics and beyond (2022) (RSS 2022-Robotics: Science and Systems)

[4] Baur, Walter; Strassen, Volker The complexity of partial derivatives, Theor. Comput. Sci., Volume 22 (1983) no. 3, pp. 317-330 | MR | DOI | Zbl

[5] Baydin, Atilim Gunes; Pearlmutter, Barak; Radul, Alexey Andreyevich; Siskind, Jeffrey Mark Automatic differentiation in machine learning: a survey, J. Mach. Learn. Res., Volume 18 (2018), 153, 43 pages | MR | Zbl

[6] Bellman, Richard Introduction to the mathematical theory of control processes. Vol. II: Nonlinear processes, Mathematics in Science and Engineering, 40-II, Academic Press Inc., 1971 | MR

[7] Betts, John Practical methods for optimal control and estimation using nonlinear programming, Society for Industrial and Applied Mathematics, 2010 | MR | DOI

[8] Bock, Hans Georg; Plitt, Karl-Josef A multiple shooting algorithm for direct solution of optimal control problems, IFAC Proceedings Volumes, Volume 17 (1984) no. 2, pp. 1603-1608 | DOI

[9] Bolte, Jerome; Pauwels, Edouard A mathematical model for automatic differentiation in machine learning, Proceedings of the 34rd International Conference on Neural Information Processing Systems, Curran Associates, Inc. (2020)

[10] Boyd, Stephen; Vandenberghe, Lieven Semidefinite programming relaxations of non-convex problems in control and combinatorial optimization, Communications, Computation, Control, and Signal Processing, Springer, 1997, pp. 279-287 | MR | DOI

[11] Boyd, Stephen; Vandenberghe, Lieven Convex optimization, Cambridge University Press, 2004 | MR | DOI

[12] Bradbury, James; Frostig, Roy; Hawkins, Peter; Johnson, Matthew James; Leary, Chris; Maclaurin, Dougal; Necula, George; Paszke, Adam; VanderPlas, Jake; Wanderman-Milne, Skye; Zhang, Qiao JAX: composable transformations of Python+NumPy programs, 2018 (version 0.3.13, https://github.com/google/jax)

[13] Bynum, Michael L.; Hackebeil, Gabriel A.; Hart, William E.; Laird, Carl D.; Nicholson, Bethany L.; Siirola, John D.; Watson, Jean-Paul; Woodruff, David L. Pyomo–optimization modeling in python, Springer Optimization and Its Applications, 67, Springer, 2021 | MR | Zbl

[14] Diehl, Moritz; Bock, Hans Georg; Diedam, Holger; Wieber, P.-B. Fast direct multiple shooting algorithms for optimal robot control, Fast motions in biomechanics and robotics: optimization and feedback control (Lecture Notes in Control and Information Sciences), Volume 340, Springer, 2006, pp. 65-93 | DOI | Zbl

[15] Diehl, Moritz; Ferreau, Hans Joachim; Haverbeke, Niels Efficient numerical methods for nonlinear MPC and moving horizon estimation, Nonlinear model predictive control: towards new challenging applications (Lecture Notes in Control and Information Sciences), Volume 384, Springer, 2009, pp. 391-417 | Zbl | DOI

[16] Dunn, Joseph; Bertsekas, Dimitri Efficient dynamic programming implementations of Newton’s method for unconstrained optimal control problems, J. Optim. Theory Appl., Volume 63 (1989) no. 1, pp. 23-38 | MR | DOI | Zbl

[17] Dunning, Iain; Huchette, Joey; Lubin, Miles JuMP: A modeling language for mathematical optimization, SIAM Rev., Volume 59 (2017) no. 2, pp. 295-320 | MR | DOI | Zbl

[18] Farshidian, Farbod; Neunert, Michael; Winkler, Alexander W.; Rey, Gonzalo; Buchli, Jonas An efficient optimal planning and control framework for quadrupedal locomotion, 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE (2017), pp. 93-100 | DOI

[19] Frasch, Janick V.; Sager, Sebastian; Diehl, Moritz A parallel quadratic programming method for dynamic optimization problems, Math. Program. Comput., Volume 7 (2015) no. 3, pp. 289-329 | MR | DOI | Zbl

[20] Giftthaler, Markus; Neunert, Michael; Stäuble, Markus; Buchli, Jonas; Diehl, Moritz A family of iterative Gauss-Newton shooting methods for nonlinear optimal control, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE (2018), pp. 1-9

[21] Gilbert, Jean Charles Automatic differentiation and iterative processes, Optim. Methods Softw., Volume 1 (1992) no. 1, pp. 13-21 | DOI

[22] Gill, Philip E.; Murray, Walter; Saunders, Michael A. SNOPT: An SQP algorithm for large-scale constrained optimization, SIAM Rev., Volume 47 (2005) no. 1, pp. 99-131 | MR | DOI | Zbl

[23] Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron Deep learning, MIT Press, 2016 | MR

[24] Griewank, Andreas; Walther, Andrea Evaluating derivatives: principles and techniques of algorithmic differentiation, Society for Industrial and Applied Mathematics, 2008 | MR | DOI

[25] Houska, Boris; Diehl, Moritz A quadratically convergent inexact SQP method for optimal control of differential algebraic equations, Optim. Control Appl. Methods, Volume 34 (2013) no. 4, pp. 396-414 | DOI | Zbl

[26] Jacobson, David; Mayne, David Differential Dynamic Programming, Elsevier, 1970 | MR

[27] Jallet, Wilson; Bambade, Antoine; Arlaud, Etienne; El-Kazdadi, Sarah; Mansard, Nicolas; Carpentier, Justin Proxddp: Proximal constrained trajectory optimization (2023)

[28] Kakade, Sham; Krishnamurthy, Akshay; Lowrey, Kendall; Ohnishi, Motoya; Sun, Wen Information theoretic regret bounds for online nonlinear control, Adv. Neural Inf. Process. Syst., Volume 33 (2020), pp. 15312-15325

[29] LeCun, Yann A theoretical framework for back-propagation, 1988 Connectionist Models Summer School, CMU, Pittsburg, PA (1988)

[30] Li, Weiwei; Todorov, Emanuel Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic system, Int. J. Control, Volume 80 (2007) no. 9, pp. 1439-1453 | MR | DOI | Zbl

[31] Liao, Li-Zhi; Shoemaker, Christine Convergence in unconstrained discrete-time differential dynamic programming, IEEE Trans. Autom. Control, Volume 36 (1991) no. 6, pp. 692-706 | MR | DOI | Zbl

[32] Liao, Li-Zhi; Shoemaker, Christine Advantages of differential dynamic programming over Newton’s method for discrete-time optimal control problems (1992) (Technical report)

[33] Liniger, Alexander; Domahidi, Alexander; Morari, Manfred Optimization-based autonomous racing of 1: 43 scale RC cars, Optim. Control Appl. Methods, Volume 36 (2015) no. 5, pp. 628-647 | MR | DOI | Zbl

[34] Lions, Pierre-Louis Generalized Solutions of Hamilton-Jacobi Equations, Pitman, 1982 | MR

[35] Magdy, Mohamed; El Marhomy, Abdallah; Attia, Mahmoud A. Modeling of inverted pendulum system with gravitational search algorithm optimized controller, Ain Shams Engineering Journal, Volume 10 (2019) no. 1, pp. 129-149 | DOI

[36] Mayne, David; Polak, Elijah First-order strong variation algorithms for optimal control, J. Optim. Theory Appl., Volume 16 (1975) no. 3, pp. 277-301 | MR | DOI

[37] Messerer, Florian; Baumgärtner, Katrin; Diehl, Moritz Survey of sequential convex programming and generalized Gauss–Newton methods, ESAIM, Proc. Surv., Volume 71 (2021), pp. 64-88 | MR | DOI | Zbl

[38] Murray, Donald; Yakowitz, Sidney Differential dynamic programming and Newton’s method for discrete optimal control problems, J. Optim. Theory Appl., Volume 43 (1984) no. 3, pp. 395-414 | MR | DOI | Zbl

[39] Nesterov, Yurii Lectures on convex optimization, Springer, 2018 | MR | DOI

[40] Nganga, John; Wensing, Patrick Accelerating Second-Order Differential Dynamic Programming for Rigid-Body Systems, IEEE Robotics and Automation Letters, Volume 6 (2021) no. 4, pp. 7659-7666 | DOI

[41] Nocedal, Jorge; Wright, Stephen Numerical optimization, Springer, 2006 | MR

[42] Pantoja, J. Differential dynamic programming and Newton’s method, Int. J. Control, Volume 47 (1988) no. 5, pp. 1539-1553 | MR | DOI | Zbl

[43] Paszke, Adam; Gross, Sam; Massa, Francisco; Lerer, Adam; Bradbury, James; Chanan, Gregory; Killeen, Trevor; Lin, Zeming; Gimelshein, Natalia; Antiga, Luca; Desmaison, Alban; Kopf, Andreas; Yang, Edward; DeVito, Zachary; Raison, Martin; Tejani, Alykhan; Chilamkurthy, Sasank; Steiner, Benoit; Fang, Lu; Bai, Junjie; Chintala, Soumith PyTorch: An Imperative Style, High-Performance Deep Learning Library, Proceedings of the 33rd International Conference on Neural Information Processing Systems, Curran Associates, Inc. (2019), pp. 8026-8037

[44] Polak, Elijah Computational methods in optimization: a unified approach, Mathematics in Science and Engineering, 77, Academic Press Inc., 1971 | MR

[45] Rao, Christopher; Wright, Stephen; Rawlings, James Application of interior-point methods to model predictive control, J. Optim. Theory Appl., Volume 99 (1998) no. 3, pp. 723-757 | MR | Zbl

[46] Recht, Benjamin A tour of reinforcement learning: The view from continuous control, Annual Review of Control, Robotics, and Autonomous Systems, Volume 2 (2019), pp. 253-279 | DOI

[47] Roulet, Vincent; Srinivasa, Siddhartha; Drusvyatskiy, Dmitriy; Harchaoui, Zaid Iterative linearized control: stable algorithms and complexity guarantees, Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research), Volume 97, JMLR (2019), pp. 5518-5527

[48] Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. Learning representations by back-propagating errors, Nature, Volume 323 (1986) no. 6088, pp. 533-536 | DOI | Zbl

[49] Schmidhuber, Jürgen Making the world differentiable: on using self supervised fully recurrent neural networks for dynamic reinforcement learning and planning in non-stationary environments (1990) (Technical report)

[50] Sideris, Athanasios; Bobrow, James An efficient sequential linear quadratic algorithm for solving nonlinear optimal control problems, Proceedings of the 2005 American Control Conference, IEEE (2005), pp. 2275-2280 | DOI

[51] Srinivasan, Akshay; Todorov, Emanuel Graphical Newton (2015) | arXiv

[52] Tassa, Yuval; Erez, Tom; Smart, William Receding horizon differential dynamic programming, Adv. Neural Inf. Process. Syst., Volume 20 (2007)

[53] Tassa, Yuval; Erez, Tom; Todorov, Emanuel Synthesis and stabilization of complex behaviors through online trajectory optimization, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE (2012), pp. 4906-4913

[54] Tassa, Yuval; Mansard, Nicolas; Todorov, Emanuel Control-limited differential dynamic programming, 2014 IEEE International Conference on Robotics and Automation (ICRA), IEEE (2014), pp. 1168-1175 | DOI

[55] Todorov, Emanuel; Erez, Tom; Tassa, Yuval MuJoCo: A physics engine for model-based control, International Conference on Intelligent Robots and Systems (IROS), IEEE (2012), pp. 5026-5033

[56] Verschueren, Robin; Frison, Gianluca; Kouzoupis, Dimitris; Frey, Jonathan; Duijkeren, Niels van; Zanelli, Andrea; Novoselnik, Branimir; Albin, Thivaharan; Quirynen, Rien; Diehl, Moritz acados — a modular open-source framework for fast embedded optimal control, Math. Program. Comput., Volume 14 (2022), pp. 147-183 | MR | DOI | Zbl

[57] Verschueren, Robin; van Duijkeren, Niels; Quirynen, Rien; Diehl, Moritz Exploiting convexity in direct optimal control: a sequential convex quadratic programming method, 2016 IEEE 55th Conference on Decision and Control (CDC), IEEE (2016), pp. 1099-1104 | DOI

[58] Von Stryk, Oskar Numerical solution of optimal control problems by direct collocation, Springer, 1993

[59] Wächter, Andreas; Biegler, Lorenz T. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming, Math. Program., Volume 106 (2006), pp. 25-57 | MR | DOI | Zbl

[60] Werbos, Paul The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting, Wiley-Interscience, 1994

[61] Wright, Stephen Solution of discrete-time optimal control problems on parallel computers, Parallel Comput., Volume 16 (1990) no. 2-3, pp. 221-237 | MR | DOI | Zbl

[62] Wright, Stephen Partitioned dynamic programming for optimal control, SIAM J. Optim., Volume 1 (1991) no. 4, pp. 620-642 | MR | Zbl | DOI

[63] Wright, Stephen Structured interior point methods for optimal control, Proceedings of the 30th IEEE Conference on Decision and Control, IEEE (1991), pp. 1711-1716 | DOI

[64] Wright, Stephen Interior point methods for optimal control of discrete time systems, J. Optim. Theory Appl., Volume 77 (1993) no. 1, pp. 161-187 | MR | DOI | Zbl

[65] Zhang, Aston; Lipton, Zachary C.; Li, Mu; Smola, Alexander J. Dive into Deep Learning, Cambridge University Press, 2023

Cité par Sources :