Designing the Optimal Controller for an Inverted Pendulum (LQR control)

This is the part 5 in a series of blog posts devoted to the classic control problem of an inverted pendulum on a driven cart. This series is still in production (Nov. 2020) but when finished they will be structured as follows.

  1. Derivation of Simple Pendulum (Python Simulation) - link
  2. Building a Physical Inverted Pendulum
  3. Controlling Inverted Pendulum with PID controller
  4. State Space Model of Inverted Pendulum - link
  5. Designing the Optimal Controller for an Inverted Pendulum (LQR control) - link
  6. Simulating an Inverted Pendulum with an LQR Controller (Python) - link
  7. Deploying an LQR Controller on a Physical Inverted Pendulum

The associated code can be found on GitHub and a video series on YouTube (not currently released as of Nov. 2020). While this project aims to be encompassing and self contained, it does assume familiarity with classical mechanics, linear algebra, differential equations, some control theory, and programming (Python).


This blog post builds directly from my previous blog post on the state space model of an inverted pendulum. In this post I will derive the optimal controller for the inverted system using control theory, while I aim to cover all the relevant steps this is not designed to be a text book as such I will be presenting equations and theorems without mathematical proofs. For good online control theory courses I would recommend MIT OpenCourseWare and Steve Brunton's  YouTube series on control theory.

To recap from the previous blog post we found that the state space model is:

$$\begin{bmatrix} \dot{x} \\ \dot{\theta} \\ \ddot{x} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & \frac{m_2^2gl^2}{m_1m_2l^2  + I(m_1 + m_2)} & \frac{- \beta(m_2 l^2+I)}{m_1m_2l^2  + I(m_1 + m_2)} & 0 \\ 0 & \frac{m_2gl(m_1+m_2)}{m_1m_2l^2  + I(m_1 + m_2)} & \frac{-m_2l \beta}{m_1m_2l^2  + I(m_1 + m_2)} & 0 \end{bmatrix} \begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} + \begin{bmatrix} 0\\0\\ \frac{m_2l^2+I}{m_1m_2l^2  + I(m_1 + m_2)} \\ \frac{m_2l}{m_1m_2l^2  + I(m_1 + m_2)} \end{bmatrix} u$$

$$y=\begin{bmatrix} 1 & 0&0&0\\0&1&0&0\end{bmatrix}\begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} +\begin{bmatrix} 0 \\ 0\end{bmatrix}u$$

Which, more succinctly, can be expressed in the general form:

$$\dot{\vec{x}} = A\vec{x}+Bu$$

$$\vec{y} = C\vec{x}+Du$$

Test for Controllability

To ensure our efforts are not futile it is a good first step to ensure the system we wish to control is controllable to begin with. A system is controllable if for all $\vec{x}_0$, $\vec{x}_1$, and $T>0$ there exists a control input, $u$, such that $\vec{x}(t=0)=\vec{x}_0$ and  $\vec{x}(t=T)=\vec{x}_1$. The test for controllability states that if the rank of the matrix $\begin{bmatrix} B & AB & ... & A^{n-1}B\end{bmatrix}$ is $n$ (where A is an $n \times m$ matrix ) then the system is controllable (proof omitted).

The following is controllability matrix evaluated for the following quantities: $m_1=1kg, m_2=0.25kg, g=9.81m \cdot s^{-2}, l=0.5m, \beta = 1 kg \cdot s^{-1}$.

$$\begin{bmatrix} B & AB & ... & A^{n-1}B\end{bmatrix} = \begin{bmatrix} 0 & 0.94 & -0.89 & 3.28 \\ 0 & 1.41 & -1.33 & 25.69 \\ 0.94 & -0.89 & 3.28 & -5.39 \\ 1.41 & -1.33 & 25.69 & -27.63\end{bmatrix}$$

I used the Python command control.ctrb(A, B) to calculate this matrix and the numpy.linalg.matrix_rank command to confirm that the rank of the matrix is 4. Thus the system is controllable.

Test for Stability

While we can intuitively tell that the inverted pendulum is an unstable system, we should still test for it mathematically, in some systems that are not as trivial as the inverted pendulum it is not immediately obvious from inspection if the system is stable or not. There are several tests for stability but I will use the eigenvalue method here. The proof is omitted from this blog post but a system is internally stable if the all the real parts of the matrix A are negative. That is to say the system is stable iff $Re(\lambda)<0, \quad \forall \lambda$.

Using the Python command, scipy.linalg.eig(A), the eigenvalues are found to be:

$$\begin{bmatrix} 0+0i \\ 4.1 + 0i \\ -4.2+0i \\ -0.8 +0i \end{bmatrix}$$

Thus since some of the eigenvalues have non-negative real parts the system is unstable.

LQR Control

Now that it has been established that the system is controllable but unstable, we can find a suitable control input, $u$, to stabilize the system. Lets define the control input in terms of the state vector, $\vec{x}$:

$$u=-K(\vec{x}(t) - \vec{x}_{setpoint})$$

Where $\vec{x}_{setpoint} = \begin{bmatrix}0& \pi & 0 &0 \end{bmatrix}^T$ is our desired final state and $K$ is some yet to be defined matrix. Notice that our state space can be rewritten as follows:

$$\dot{\vec{x}} = A\vec{x}+Bu = (A-BK)\vec{x}+K\vec{x}_{setpoint}$$

We now have the freedom to choose the matrix K such that the eigenvalues of $(A-BK)$ lead to stability, as shown in the previous section. Let me choose (somewhat) randomly the matrix values $K = \begin{bmatrix} -100 & 800 & -150 & 200\end{bmatrix}$. If we repeat the stability test for $(A-BK)$ we get the following eigenvalues:

$$\begin{bmatrix} -134.7+0i \\ -4.6 + 0i \\ -1.4+0.5i \\ -1.4 -0.5i \end{bmatrix}$$

Notice that now all the eigenvalues have negative real parts, thus the system is stable. This is great! We now have a control input, $u = -K(x-\vec{x}_{setpoint})$, that if applied to the inverted pendulum system would stabilized. However we can do better, we want to find the optimal control matrix K for the most efficient control.

The procedure to do this is a little involved and to be honest I don't want to write an entire control theory book. As such I will present the relevant equations and an overview of the steps required to find the optimal matrix K. If you want a deeper understanding of what these equations mean, how they work, and the associated mathematical proofs I advise consulting online control theory courses like the one available through MIT OpenCourseWare.

We can define a cost function $J$ as follows. It will be our task to minimize this cost function, in doing so we will find the optimal control.

$$J(\vec{x}, u) = \int_{0}^{\infty}(x^TQx+u^TRu)dt$$

Where $Q$ is a positive semi-definite matrix and $R$ is positive-definite, both are symmetric. $Q$ is effectively used to express the cost we wish to place on physical system: i.e. how important is it to get the correct final position, the speed and aggression at which to move? $R$ is effectively the cost we wish to apply to the input, in this case the motor, maybe we wish to conserve power or perhaps we have a powerful motor and don't care how hard we force it to move. A common starting point is to define them as:

$$Q=C^TC=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 &1 & 0 \\ 0& 0& 0& 1 \end{bmatrix} \quad R =\begin{bmatrix} 1 \end{bmatrix} $$

If we wanted to prioritize the final angle, $\theta$, and had a powerful motor to use we could adjust these to:

$$Q=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 &1 & 0 \\ 0& 0& 0& 1 \end{bmatrix} \quad R =\begin{bmatrix} 0.001 \end{bmatrix} $$

Typically these variables are changed by factors of 10. In the next blog post (simulating an inverted pendulum in Python) we will be able to experiment with various $Q$ and $R$ matrices to see the different effects they can have.

By the minimization of the quadratic cost function, $J(\vec{x}, u)$, the optimal controller, $K$, can be expressed as:

$$K = R^{-1}B ^TP$$

Where P is a symmetric positive-definite matrix found from the algebraic Ricatti equation (derivation is omitted):

$$A^TP+PA-PBR ^{-1}B^TP +Q =0$$

To calculate the matrix K this in Python the following commands can be used:

P = np.matrix(linalg.solve_continuous_are(A, B, Q, R))

K = np.matrix(linalg.inv(R)*(B.T*P))

Thus we have found the optimal control input:

$$u = -K(\vec{x}-\vec{x}_{setpoint}) = -R^{-1}B ^TP(\vec{x}-\vec{x}_{setpoint})$$

Finally we can once again check the eigenvalues of $(A-BK)$ and confirm that their real parts are negative:

$$\begin{bmatrix} -53.8+0i \\ -1.1 + 0i \\ -2.8+0.2i \\ -2.8 -0.2i \end{bmatrix}$$

But what is u?

For the last two blog posts I have been referring to the magical mathematical entity $u$, but what is it in the real world? Recall in the second section of the previous blog post I chose to redefine the force applied to the cart, $\vec{F}$,  as the control input, $u$. This was predominately to adhere to the standard control theory conventions, if we now go back to the force, we can express the force applied to the cart as the torque, $\tau$, from the motor and gearbox:

$$F=\frac{\tau}{Gr} \quad where \enspace G=\frac{N_2}{N_1}$$

Where $r$ is the armature radius, $G$ is the gear ratio, and $N_1, N_2$ are the number of teeth on the first and second gears, respectively (assuming a 2 piece gear train).

Making the substitution the control input can be expressed in terms of the motor torque, something that we can control via hardware and software by regulating the current applied to the motor.

Show Comments