27 May 2023

LQR: motivation and derivation

Motivation

We want to derive a controller (or regulator) that minimizes a quadratic cost subject to the system dynamics:

$$\min_{u_0, \dots, u_{N-1}} \; \frac{1}{2} \sum_{t=0}^{N-1} \left( x_t^\top Q x_t + u_t^\top R u_t \right) + \frac{1}{2} x_N^\top Q_N x_N$$

where $x_{t+1} = A x_t + B u_t$ is the discrete dynamics of the system, $Q, Q_N \succeq 0$ are the state cost matrices, and $R \succ 0$ is the control cost matrix.

The goal is to drive the system to zero in both the state and control spaces.

Derivation using backward substitution

An equivalent problem is the following: stacking the variables into $z = (u_0, x_1, u_1, \dots, u_{N-1}, x_N)$, the problem becomes an equality-constrained QP,

$$\min_z \; \frac{1}{2} z^\top H z \quad \text{s.t.} \quad C z = d,$$

where $H = \mathrm{blkdiag}(R, Q, R, \dots, R, Q_N)$ and $C z = d$ collects the dynamics constraints $x_{t+1} - A x_t - B u_t = 0$ (with the fixed initial state $x_0$ carried in $d$).

This is a convex problem with the following Lagrangian:

$$\mathcal{L}(z, \lambda) = \frac{1}{2} z^\top H z + \lambda^\top (C z - d)$$

The KKT conditions are:

$$\begin{bmatrix} H & C^\top \\ C & 0 \end{bmatrix} \begin{bmatrix} z \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ d \end{bmatrix}$$

Solving this sparse system can be done efficiently using backward substitution, leading to the Riccati recursion.

From this we get the following equations for the last stage (stationarity with respect to $x_N$ and $u_{N-1}$, with $\lambda_N$ the multiplier of the final dynamics constraint):

$$Q_N x_N + \lambda_N = 0, \qquad R u_{N-1} - B^\top \lambda_N = 0,$$

which imply, substituting $\lambda_N = -Q_N x_N$ and $x_N = A x_{N-1} + B u_{N-1}$,

$$(R + B^\top Q_N B)\, u_{N-1} = -B^\top Q_N A\, x_{N-1},$$

giving a feedback policy:

$$u_{N-1} = -K_{N-1} x_{N-1}, \qquad K_{N-1} = (R + B^\top Q_N B)^{-1} B^\top Q_N A.$$

Now we can use backward induction to define

$$P_N = Q_N, \qquad K_t = (R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A, \qquad P_t = Q + A^\top P_{t+1} (A - B K_t),$$

which provides the general formula for the feedback policy:

$$u_t = -K_t x_t.$$
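As a sanity check, the backward Riccati recursion and the resulting feedback policy can be sketched in a few lines of NumPy. The system below is a hypothetical discretized double integrator; the matrices, horizon, and initial state are assumed values for illustration, not taken from the text.

```python
import numpy as np

# Hypothetical discretized double integrator (assumed values for illustration).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])   # discrete dynamics x_{t+1} = A x_t + B u_t
B = np.array([[0.0], [dt]])
Q = np.eye(2)                            # stage state cost
R = np.array([[1.0]])                    # stage control cost
Q_N = 10.0 * np.eye(2)                   # terminal cost
N = 50                                   # horizon

def riccati_gains(A, B, Q, R, Q_N, N):
    """Backward Riccati recursion: returns gains K_0..K_{N-1} and P_0..P_N."""
    P = Q_N
    Ks, Ps = [], [P]
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # K_t from P_{t+1}
        P = Q + A.T @ P @ (A - B @ K)                       # P_t
        Ks.append(K)
        Ps.append(P)
    Ks.reverse(); Ps.reverse()
    return Ks, Ps

Ks, Ps = riccati_gains(A, B, Q, R, Q_N, N)

# Closed-loop simulation with u_t = -K_t x_t drives the state toward zero.
x = np.array([[1.0], [0.0]])
for K in Ks:
    x = A @ x - B @ (K @ x)
print(np.linalg.norm(x))   # small residual norm after N steps
```

The gains are computed backward in time (terminal cost first), then applied forward in time during the rollout.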

Derivation using the cost-to-go function

The state-value function, or cost-to-go, is defined as the cost incurred from a given state onward, assuming an optimal policy is followed. By Bellman's principle of optimality, the total cost can be minimized one step at a time, working backward from the end of the horizon. It means that when the state is fixed and the control is free to choose, the optimal control minimizes the sum of the current stage cost and the optimal future cost.

In equations, the final cost-to-go is

$$V_N(x) = \frac{1}{2} x^\top Q_N x,$$

and the previous ones are defined recursively:

$$V_t(x) = \min_u \left[ \frac{1}{2} x^\top Q x + \frac{1}{2} u^\top R u + V_{t+1}(A x + B u) \right].$$

Each $V_t$ remains quadratic, $V_t(x) = \frac{1}{2} x^\top P_t x$. Carrying out the minimization defines our optimal controls:

$$u_t^\star = -K_t x_t, \qquad K_t = (R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A.$$

We can then follow the optimal trajectory, with the corresponding cost-to-go function:

$$V_t(x) = \frac{1}{2} x^\top P_t x, \qquad P_t = Q + A^\top P_{t+1} (A - B K_t), \qquad P_N = Q_N.$$
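This identity can be checked numerically: the predicted cost-to-go $\frac{1}{2} x_0^\top P_0 x_0$ should exactly match the cost accumulated along the closed-loop optimal trajectory. The matrices, horizon, and initial state below are assumed toy values.

```python
import numpy as np

# Toy problem (assumed values): check that 1/2 x0^T P_0 x0 equals the realized cost.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q, R, Q_N, N = np.eye(2), np.array([[1.0]]), 10.0 * np.eye(2), 50

# Backward pass: Riccati recursion for the gains K_t and the value matrices P_t.
P = Q_N
Ks = []
for _ in range(N):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)
    Ks.append(K)
Ks.reverse()
P0 = P  # P_0 after the full recursion

# Forward pass: accumulate stage costs under the optimal policy u_t = -K_t x_t.
x = np.array([[2.0], [-1.0]])
predicted = 0.5 * float(x.T @ P0 @ x)   # V_0(x_0)
total = 0.0
for K in Ks:
    u = -K @ x
    total += 0.5 * float(x.T @ Q @ x + u.T @ R @ u)
    x = A @ x + B @ u
total += 0.5 * float(x.T @ Q_N @ x)     # terminal cost

print(abs(predicted - total))  # ~0: the cost-to-go predicts the realized cost
```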

This specific method applies to many other optimization problems and generalizes as dynamic programming.

Applications

The essential thing to keep in mind is that these two methods give the same feedback policy: $u_t = -K_t x_t$.

This drives the system to zero, independently of the initial conditions. A direct application is the hover of a quadrotor: the LQR controller stabilizes the system in the air, and a finite number of feedback gain matrices (or even a single steady-state gain) is sufficient.
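The single steady-state gain can be obtained by iterating the Riccati recursion to a fixed point. The sketch below uses a simplified 1D altitude model of hover (state = height error and vertical velocity, control = thrust deviation from the hover thrust); all matrices are assumed values, not a real quadrotor model.

```python
import numpy as np

# Simplified hover-altitude model (assumed values): x = [height error, velocity],
# u = thrust deviation from the nominal hover thrust.
dt = 0.02
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

# Iterate the Riccati recursion until P stops changing: the fixed point gives
# a single steady-state gain K that stabilizes the system indefinitely.
P = Q
for _ in range(1000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P_next = Q + A.T @ P @ (A - B @ K)
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Stability check: the closed-loop matrix A - B K has spectral radius < 1.
print(max(abs(np.linalg.eigvals(A - B @ K))))
```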

For trajectory tracking, we can use the following controls:

$$u_t = \bar{u}_t - K_t (x_t - \bar{x}_t),$$

where $(\bar{x}_t, \bar{u}_t)$ is the reference trajectory.

This can be thought of as shifting the origin. Given nonlinear dynamics $x_{t+1} = f(x_t, u_t)$, the linearization about the shifted origins is done as follows:

$$\delta x_{t+1} \approx A_t\, \delta x_t + B_t\, \delta u_t, \qquad A_t = \frac{\partial f}{\partial x}(\bar{x}_t, \bar{u}_t), \quad B_t = \frac{\partial f}{\partial u}(\bar{x}_t, \bar{u}_t),$$

with $\delta x_t = x_t - \bar{x}_t$ and $\delta u_t = u_t - \bar{u}_t$.
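The shifted-origin control law can be sketched as follows: the closed loop applies $u_t = \bar{u}_t - K (x_t - \bar{x}_t)$ around a dynamically feasible reference. The double-integrator matrices and the constant-velocity reference below are assumed values for illustration.

```python
import numpy as np

# Assumed double-integrator model and a constant-velocity reference trajectory.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q, R = np.eye(2), np.array([[0.1]])

# Steady-state gain from iterating the Riccati recursion.
P = Q
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)

# Reference: xbar_t = (0.5 * t * dt, 0.5) with ubar_t = 0, which satisfies
# xbar_{t+1} = A xbar_t + B ubar_t for this model (feasible reference).
x = np.array([[1.0], [0.0]])           # start off the reference
for t in range(200):
    xbar = np.array([[0.5 * t * dt], [0.5]])
    ubar = np.zeros((1, 1))
    u = ubar - K @ (x - xbar)          # shifted-origin feedback
    x = A @ x + B @ u

xbar_final = np.array([[0.5 * 200 * dt], [0.5]])
print(np.linalg.norm(x - xbar_final))  # tracking error shrinks toward zero
```

Because the reference is feasible, the tracking error obeys $e_{t+1} = (A - B K)\, e_t$ and decays at the closed-loop rate.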

LQR controllers are simple yet powerful, which makes them a good choice for many control problems.