MINES ParisTech CAS - Centre automatique et systèmes

Learning MPC: a data efficient model-based reinforcement learning strategy for iterative tasks

Thursday 10th June 2021, 4pm – 5pm (Paris time).

Ugo Rosolia, California Institute of Technology (Caltech), USA (https://scholar.google.com/citations?hl=fr&user=s4mZnz8AAAAJ)

Leveraging historical data to iteratively improve the performance of predictive controllers has been an active research theme over the past few decades. The key idea is to use recorded state-input pairs to compute at least one of the following three components: i) a model that describes the evolution of the system, ii) a safe set of states (and an associated control policy) from which the control task can be safely executed, and iii) a value function that represents the cumulative closed-loop cost from a given state of the safe set.
In this talk, I will first provide an overview of the theory of Learning Model Predictive Control that I developed during my PhD. In particular, I will show how historical data can be used in the control design to guarantee safety, exploration and performance improvement. In the second part of the talk, I will show the effectiveness of the proposed methodology on an autonomous racing example and a manipulator task example.
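To make components ii) and iii) of the abstract concrete, here is a minimal Python sketch of one way to build a safe set and a value function from recorded state-input pairs. It is an illustration under stated assumptions, not the speaker's implementation: the function names, the toy 1-D system, and the unit stage cost are all hypothetical, and the safe set is taken to be simply the set of states visited in earlier successful runs.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[float, ...]
Input = Tuple[float, ...]
Trajectory = List[Tuple[State, Input]]  # recorded state-input pairs of one run


def cost_to_go(trajectory: Trajectory,
               stage_cost: Callable[[State, Input], float]) -> List[float]:
    """Cumulative closed-loop cost from each recorded step to the end of the run."""
    costs = [0.0] * len(trajectory)
    running = 0.0
    # Sweep backwards so each entry is the sum of the stage costs that follow it.
    for t in range(len(trajectory) - 1, -1, -1):
        state, control = trajectory[t]
        running += stage_cost(state, control)
        costs[t] = running
    return costs


def build_value_function(trajectories: List[Trajectory],
                         stage_cost: Callable[[State, Input], float]
                         ) -> Dict[State, float]:
    """Map every recorded state to the lowest cost-to-go observed from it.

    The keys of the returned dictionary play the role of the safe set
    (assumption: every recorded run completed the task).
    """
    value: Dict[State, float] = {}
    for trajectory in trajectories:
        run_costs = cost_to_go(trajectory, stage_cost)
        for (state, _), cost in zip(trajectory, run_costs):
            value[state] = min(cost, value.get(state, float("inf")))
    return value


def stage_cost(state: State, control: Input) -> float:
    """Hypothetical cost: one unit per step spent away from the origin."""
    return 0.0 if state == (0.0,) else 1.0


# Two hypothetical runs of a toy system x_next = x + u, both ending at the origin.
first_run: Trajectory = [((4.0,), (-1.0,)), ((3.0,), (-1.0,)), ((2.0,), (-1.0,)),
                         ((1.0,), (-1.0,)), ((0.0,), (0.0,))]
second_run: Trajectory = [((4.0,), (-2.0,)), ((2.0,), (-2.0,)), ((0.0,), (0.0,))]

value = build_value_function([first_run, second_run], stage_cost)
safe_set = set(value)
```

In this toy data the second run reaches the origin in fewer steps, so the stored cost at the shared states drops after it is added. That is the sense in which additional recorded runs can improve the value function.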

Ugo Rosolia received the B.S. and M.S. degrees cum laude in mechanical engineering from the Politecnico di Milano in 2012 and 2014, respectively, and the Ph.D. degree in mechanical engineering from the University of California at Berkeley in 2019. He is currently a postdoctoral scholar at the California Institute of Technology.
He was a Visiting Scholar at Tongji University in Shanghai for the Double Degree Program PoliTong (Fall 2010 - Spring 2011) and at the University of Illinois at Urbana-Champaign (Fall 2013 - Spring 2014), sponsored by a Global E3 Scholarship. He was a Research Engineer with Siemens PLM Software in Belgium (Spring and Summer 2015). His current research interests include approximate dynamic programming, system identification, decision making in mixed observable Markov decision processes, and predictive control.

see slides

look at source code