Chapter 10 Linear Quadratic Regulators

While solving the dynamic programming problem for continuous systems is hard in general, there are a few important special cases where the solutions are readily accessible.

Most of these involve variants on the case of linear dynamics and quadratic cost.

The simplest case, called the linear quadratic regulator (LQR), is formulated as stabilizing a time-invariant linear system to the origin.

The linear quadratic regulator is likely the most important and influential result in optimal control theory to date. In this chapter we will derive the basic algorithm and a variety of useful extensions.

10.1 Basic Derivation

Consider a linear time-invariant system in state-space form:

$$\dot{\textbf{x}} = \textbf{Ax} + \textbf{Bu} $$

with the infinite-horizon cost function given by

$$ J = \int_0^\infty \big[ \textbf{x}^T\textbf{Qx} + \textbf{u}^T\textbf{Ru} \big] dt $$

$$ \textbf{Q} = \textbf{Q}^T \geq 0 $$

$$ \textbf{R} = \textbf{R}^T > 0 $$
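To make the cost function concrete, the sketch below evaluates $J$ numerically for one fixed linear feedback law $\textbf{u} = -\textbf{Kx}$. Everything specific in it is an assumption for illustration and does not come from the text: the double-integrator matrices `A` and `B`, the weights `Q` and `R`, the hand-picked (not optimal) gain `K`, the helper names, and the finite horizon used to approximate the infinite integral. It assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical example: double integrator (position, velocity) with a force input.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)               # Q = Q^T >= 0
R = np.array([[1.0]])       # R = R^T > 0
K = np.array([[1.0, 2.0]])  # hand-picked stabilizing gain, NOT the optimal one


def closed_loop_with_cost(t, z):
    """Augmented dynamics: z = [x, accumulated cost so far]."""
    x = z[:2]
    u = -K @ x
    xdot = A @ x + B @ u
    running_cost = x @ Q @ x + u @ R @ u
    return np.concatenate([xdot, [running_cost]])


def evaluate_cost(x0, horizon=40.0):
    """Approximate J by integrating the running cost over a long finite horizon."""
    z0 = np.concatenate([x0, [0.0]])
    sol = solve_ivp(closed_loop_with_cost, (0.0, horizon), z0,
                    rtol=1e-9, atol=1e-12)
    return sol.y[2, -1]


J = evaluate_cost(np.array([1.0, 0.0]))
print(J)
```

For this particular gain and initial state the result can be checked by hand: the cost of a stabilizing linear feedback is $\textbf{x}_0^T\textbf{P}\textbf{x}_0$, where $\textbf{P}$ solves the Lyapunov equation $(\textbf{A}-\textbf{BK})^T\textbf{P} + \textbf{P}(\textbf{A}-\textbf{BK}) + \textbf{Q} + \textbf{K}^T\textbf{RK} = 0$, which gives $J = 1.75$ here. The optimal controller derived in this section attains the smallest such cost over all control laws.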

Our goal is to find the optimal cost-to-go function $J^*(\textbf{x})$ which satisfies the HJB:

$$\forall \textbf{x}, \quad 0 = \min_{\textbf{u}} \left[ \textbf{x}^T\textbf{Qx} + \textbf{u}^T\textbf{Ru} + \frac{\partial J^*}{\partial \textbf{x}} ( \textbf{Ax} + \textbf{Bu} ) \right] $$

There is one magic step here: it is well known that for this problem the optimal cost-to-go function is quadratic. This is easy to verify. Let us choose the form:
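A standard choice, sketched here under the assumption that the symbol $\textbf{S}$ denotes a constant symmetric positive semidefinite matrix and that $\textbf{R}$ is invertible, is the quadratic candidate

$$ J^*(\textbf{x}) = \textbf{x}^T\textbf{Sx}, \quad \textbf{S} = \textbf{S}^T \geq 0 $$

whose gradient is

$$ \frac{\partial J^*}{\partial \textbf{x}} = 2\textbf{x}^T\textbf{S} $$

Substituting into the HJB, the bracketed expression is quadratic and convex in $\textbf{u}$, so the minimum is found by setting its gradient with respect to $\textbf{u}$ to zero:

$$ 2\textbf{u}^T\textbf{R} + 2\textbf{x}^T\textbf{SB} = 0 \quad \Rightarrow \quad \textbf{u}^* = -\textbf{R}^{-1}\textbf{B}^T\textbf{Sx} $$

Inserting $\textbf{u}^*$ back into the HJB and using $2\textbf{x}^T\textbf{SAx} = \textbf{x}^T(\textbf{SA} + \textbf{A}^T\textbf{S})\textbf{x}$ gives

$$ 0 = \textbf{x}^T \big[ \textbf{Q} - \textbf{SBR}^{-1}\textbf{B}^T\textbf{S} + \textbf{SA} + \textbf{A}^T\textbf{S} \big] \textbf{x} $$

Since this must hold for all $\textbf{x}$, the bracketed symmetric matrix must vanish, which is the algebraic Riccati equation:

$$ 0 = \textbf{SA} + \textbf{A}^T\textbf{S} - \textbf{SBR}^{-1}\textbf{B}^T\textbf{S} + \textbf{Q} $$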

10.2.2 Time-varying LQR

The derivation above holds even if the dynamics are given by

$$\dot{\textbf{x}} = \textbf{A}(t)\textbf{x} + \textbf{B}(t)\textbf{u} $$

Similarly, the cost matrices $\textbf{Q}$ and $\textbf{R}$ can also be time-varying. This is somewhat surprising, since time-varying linear systems form a very general class of systems. The derivation requires essentially no assumptions on how the time dependence enters, except that if $\textbf{A}$ or $\textbf{B}$ is discontinuous in time, one must use an integration scheme that handles the discontinuities in order to integrate the differential equation accurately. As we will see in the next chapter, one of the most powerful applications involves linearizing around a nominal trajectory of a nonlinear system and using LQR to provide a trajectory controller.
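As a rough illustration of what integrating the differential equation involves, the sketch below integrates the standard finite-horizon Riccati differential equation $-\dot{\textbf{S}} = \textbf{Q} + \textbf{SA}(t) + \textbf{A}(t)^T\textbf{S} - \textbf{SB}(t)\textbf{R}^{-1}\textbf{B}(t)^T\textbf{S}$ backward in time from a terminal condition $\textbf{S}(t_f) = \textbf{Q}_f$. The function name `tvlqr`, the terminal weight `Qf`, the horizon `tf`, and the example oscillator with time-varying stiffness are all assumptions made for this sketch; $\textbf{Q}$ and $\textbf{R}$ are held constant only to keep it short. It assumes smooth $\textbf{A}(t)$ and $\textbf{B}(t)$, so no special handling of discontinuities is included.

```python
import numpy as np
from scipy.integrate import solve_ivp


def tvlqr(A_fn, B_fn, Q, R, Qf, tf):
    """Integrate the Riccati differential equation backward from tf to 0.

    A_fn(t) and B_fn(t) return the system matrices at time t.
    Returns callables S_of_t(t) and K_of_t(t), with u = -K_of_t(t) @ x.
    """
    n = Q.shape[0]
    Rinv = np.linalg.inv(R)

    def riccati_rhs(t, s_flat):
        S = s_flat.reshape(n, n)
        A, B = A_fn(t), B_fn(t)
        Sdot = -(Q + S @ A + A.T @ S - S @ B @ Rinv @ B.T @ S)
        return Sdot.ravel()

    # A decreasing time span (tf -> 0) makes solve_ivp integrate backward.
    sol = solve_ivp(riccati_rhs, (tf, 0.0), Qf.ravel(),
                    dense_output=True, rtol=1e-8, atol=1e-10)

    def S_of_t(t):
        S = sol.sol(t).reshape(n, n)
        return 0.5 * (S + S.T)  # remove small numerical asymmetry

    def K_of_t(t):
        return Rinv @ B_fn(t).T @ S_of_t(t)

    return S_of_t, K_of_t


# Hypothetical example: oscillator whose stiffness varies sinusoidally in time.
A_fn = lambda t: np.array([[0.0, 1.0], [-(1.0 + 0.5 * np.sin(t)), 0.0]])
B_fn = lambda t: np.array([[0.0], [1.0]])

S_of_t, K_of_t = tvlqr(A_fn, B_fn, Q=np.eye(2), R=np.array([[1.0]]),
                       Qf=np.eye(2), tf=5.0)
K0 = K_of_t(0.0)  # feedback gain at the initial time
print(K0)
```

With constant `A_fn` and `B_fn` and a long horizon, `S_of_t(0.0)` should approach the solution of the time-invariant problem, which gives a simple way to sanity-check the integration.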

10.2.3 Linear Quadratic Optimal Tracking

Chapter 11 Lyapunov Analysis

Chapter 12 Trajectory Optimization