Inverted Pendulum Stablization Mega Guide

This large, 7 part guide aims to create a comprehensive resource covering the theory, mathematics, and physical build of the classic control theory problem known as an inverted pendulum on a cart.

Sections:

Derivation of Simple Pendulum (Python Simulation)
Building a Physical Inverted Pendulum
Controlling Inverted Pendulum with PID controller
State Space Model of Inverted Pendulum
Designing the Optimal Controller for an Inverted Pendulum (LQR control)
Simulating an Inverted Pendulum with an LQR Controller (Python)
Deploying an LQR Controller on a Physical Inverted Pendulum

The associated code can be found on GitHub and a video series on YouTube (not currently released as of Nov. 2021). While this project aims to be encompassing and self-contained, it does assume familiarity with classical mechanics, linear algebra, differential equations, some control theory, and programming (Python).

Part 1: Derivation of Simple Pendulum (Python Simulation)

This blog is supplemental documentation pertaining to the derivation for the equation of motion used in the python program, simple-pendulum.py

While the program uses the scipy.integrate.solve_ivp() function, I will also touch on the small angle approximation for solving the system analytically for small initial angles $\theta_0 << 1$. There are a few different ways to derive the equations of motion, I will be using the torque derivation, and mention in passing the Lagrangian approach.

Firstly, since this is a rod-pendulum with uniform mass (unlike the slightly more common, point mass attached to a massless rod pendulum), the moment of inertia $I$ must be found.

$$I_{rod,CoM} = \int_{0}^M r^2dm = \int_{-\frac{L}{2}}^{\frac{L}{2}}\frac{r^2m}{L}dr = \frac{1}{12}mL^2 $$

Using the parallel axis theorem, we can shift the moment of inertia from the center of mass (CoM) to one of the ends:

$$I_{rod,end} = I_{rod,CoM} + md^2 = \frac{1}{3}mL^2$$

We now have the information required to find the equation of motion via the torque:

$$ \tau = I\alpha $$$$ -mg\sin(\theta)(\frac{L}{2}) = I_{r,e}\frac{d^2 \theta}{d t^2}$$$$\frac{d^2 \theta}{d t^2} = - \omega^2 \sin(\theta)$$

where $\omega = \sqrt{\frac{3g}{2L}}$ is the natural frequency of the pendulum. At this point if using a small initial angle $\theta_0 << 1$, one could solve the system analytically using the small-angle approximation $\sin(\theta) \approx \theta$, and then integrating twice to give $\theta(t) = \theta_0 \cos(\omega t)$. Refer to the first couple of commits in the repository to see the code for a small angle approx (of a point-mass pendulum). At this point the pendulum animation can be created using matplotlib. Note: this blog post focuses on the physics of the system, not the actual software. For a deeper understanding of how scipy.integrate.solve_ivp() is used please refer to the scipy documentation.

To create the phase space plot I will calculate the Hamiltonian and Lagrangian (optional) of the system. The kinetic and potential energy can be found as:

$$T = \frac{1}{2}mv^2 + \frac{1}{2}I\dot\theta^2 = \frac{1}{6}mL^2\dot\theta^2$$

$$V = \frac{mgL}{2}(1-\cos(\theta))$$

The Lagrangian, $L=T-V$, can be used with the Euler-Lagrange equation to find the equation of motion (as opposed to the torque derivation, above). This is left as an exercise for the reader. Using the generalised coordinate, $q=\theta$, and generalized momentum, $p = \frac{\partial L}{\partial \dot\theta}$, the Hamiltonian can be expressed as:

$$H = T + V = \frac{p^2}{6m L^2} + mgL(1-\cos(q))$$

The Hamiltonian represents the total energy of the system and the different energy level contours are expressed in the phase space plot.

GIF of classic pendulum — Classic Pendulum Simulation and Phase Space Plot

Part 2: <not released yet>

Part 3: <not released yet>

Part 4: State Space Model of Inverted Pendulum

A sketch of the inverted pendulum system is shown in the figure below. From this, we will derive the state-space model (and eventually find the optimal controller to balance the inverted pendulum). The overall process for finding the state-space model will be as follows: find the Lagrangian, evaluate the Euler-Lagrange equations, rearrange the coupled differential equations in state-space form, and then linearize by evaluating the system at $\theta=\pi$ (pendulum vertical position).

Drawing of a classical pendulm on a cart — Model of the system

Step 1: Finding the Lagrangian

The Lagrangian is defined as $\mathcal{L}=T-V$, where $T$ and $V$ are the kinetic and potential energies, respectively. The terms in the sketch, above, are largely self-evident, however, I will choose to define $l=L/2$. I also make the assumption that the rod is made of a uniform mass, thus making $l$ the center of mass of the rod. Using this information we can establish:

$$\dot{\vec{v}}_{cart} = \dot{x}\hat{i} $$

$$\dot{\vec{v}}_{rod} = \frac{d}{dt}\Big( x+l\sin(\theta) \Big)\hat{i} + \frac{d}{dt}\Big( l\cos(\theta) \Big)\hat{j} = (\dot{x} + l\dot{\theta}\cos(\theta))\hat{i} + (l\dot{\theta}\sin(\theta))(-\hat{j}) $$

$$T = \frac{1}{2}m_1v_{cart}^2 + \frac{1}{2}m_2v_{rod}^2 + \frac{1}{2}I_{rod}\omega^2 $$

$$V = m_2gl(1-\cos(\theta)$$

where $I_{rod}$ and $\omega$ are the moment of inertia and angular velocity of the rod, respectively. Substituting the above four equations into the Lagrangian, $\mathcal{L}=T-V$.

$$\mathcal{L}=\frac{1}{2}m_1\dot{x}^2 + \frac{1}{2}m_2(\dot{x}^2+\dot{x}l\dot{\theta}\cos(\theta) + l^2\dot{\theta}^2) + \frac{1}{2}I_{rod}\dot{\theta}^2-m_2gl(1-\cos(\theta)$$

Step 2: Evaluate Euler-Lagrange Equations

From the Lagrangian by using the Euler-Lagrange equations, below, the equations of motion of the system can be found.

$$\frac{d}{dt}\Big( \frac{\partial\mathcal{L}}{\partial \dot{q_i}} \Big) - \frac{\partial\mathcal{L}}{\partial q_i} = Q_i \quad \quad \quad i =x,\theta$$

Where $Q_{\theta}, Q_x$ are the external forces applied to the system in the $\theta$ and $x$ axes. In the $x$-axis there is a force applied to the cart (that we will later use to stabilize the inverted pendulum) and some dampening force proportional to the velocity. As such $Q_x = F-\beta \dot{x}$. In the $\theta$-axis there is also some friction at the pivot point however this resistance is negligible. As such I will define: $Q_{\theta}=0$. These quantities can be substituted into the Euler Lagrange equations.

For the sake of brevity (and my own sanity), I'm opting out of transcribing the partial derivatives in LaTeX. I will leave it as an exercise to verify that when the Euler-Lagrange equations are evaluated for $i =x,\theta$ that the equations of motion come out to be:

$$m_2\ddot{x}l\cos(\theta)+(m_2l^2-I)\ddot{\theta}= -m_2gl\sin(\theta)$$$$(m_1+m_2)\ddot{x} + m_2l\ddot{\theta}\cos(\theta) = m_2l\dot{\theta}^2\sin(\theta) + F - \beta\dot{x}$$

Going forward I will rename the force, $F$, as $u$ which is the convention in control theory for a 'control input'. In subsequent blog posts I will clarify this convention and explain how $F$ can be expressed in terms of the motor torque and how in the real world the torque is controlled by the electrical current to the motor. But for the moment we will live in mathematical bliss and leave the control input as simply $u$.

Step 3: Some Algebra

At this point, we have found the coupled equations of motion of the system. However $\ddot{x}$ and $\ddot{\theta}$ appear in both equations. In this section, I will rearrange these equations in matrix form and through some tedious manipulations separate the $\ddot{x}$ and $\ddot{\theta}$ terms.

In matrix form, the coupled equations of motion can be expressed in matrix form as:

$$\begin{bmatrix} m_2l\cos(\theta) & m_2l^2 +I \\ m_1+m_2 & m_2l\cos(\theta)\end{bmatrix} \begin{bmatrix} \ddot{x} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} 0 & 0\\ -\beta & m_2l\dot{\theta}\sin(\theta) \end{bmatrix} \begin{bmatrix} \dot{x}\\ \dot{\theta} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ \theta \end{bmatrix} + \begin{bmatrix} 0 & -m_2gl\sin(\theta) \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\1 \end{bmatrix} +\begin{bmatrix} 0 \\ 1 \end{bmatrix} u$$

Refactoring the terms a more concise form is:

$$E\ddot{\vec{x}} = F\dot{\vec{x}} + G \vec{x} + H + Iu$$

And by taking the inverse of $E$:

$$\ddot{\vec{x}} = E^{-1}F\dot{\vec{x}} + E^{-1}G \vec{x} + E^{-1}H + E^{-1}Iu$$

$$E^{-1} = \frac{1}{m_2^2l^2 \cos^2(\theta) - (m_1 + m_2)(m_2l^2+I) } \begin{bmatrix} m_2l\cos(\theta) & -m_2l^2 -I \\ -m_1-m_2 & m_2l\cos(\theta)\end{bmatrix}$$

By doing some tedious matrix multiplication (exercise for the reader) the expression for vector $\ddot{\vec{x}}$ can be expanded to:

$$\ddot{x} = \frac{1}{D} \Big[ \beta(m_2l^2 +I)\dot{x} - m_2l\dot{\theta}^2\sin(\theta)(m_2l^2 +I) - m_2^2gl^2\cos(\theta)\sin(\theta) - (m_2l^2 +I)u\Big]$$

$$\ddot{\theta} = \frac{1}{D} \Big[ -\beta m_2l\cos(\theta)\dot{x} + m_2^2l^2\dot{\theta}^2\cos(\theta)\sin(\theta) +(m_1 + m_2)m_2gl\sin(\theta) + m_2l\cos(\theta)u \Big]$$

$$D = m_2^2l^2 \cos^2(\theta) - (m_1 + m_2)(m_2l^2+I)$$

Step 4: Linearization

To recap, so far we have found the equations of motion of the system and expressed them as separate equations of $\ddot{x}$ and $\ddot{\theta}$. These non-linear equations will be useful in the blog post regarding simulating the physical system. However, to find the optimal LQR controller these equations must be linearized around the unstable equilibrium point $\theta = \pi$. Since the LQR controller is only valid for near the equilibrium point we can assume that the system remains within $\phi$ of $\pi$, where $\phi$ is a small angle. Therefore the nonlinear terms of interest become:

$$\cos(\theta) = \cos(\pi + \phi) = \cos(\pi) - \sin(\pi\phi) + O^2 \approx -1$$

$$\sin(\theta) = \sin(\pi + \phi) = \sin(\pi) + \cos(\pi\phi) + O^2 \approx -\phi$$

$$\dot{\theta}^2 = \dot{\phi}^2 \approx 0$$

And so, using this information $\ddot{x}$ and $\ddot{\theta}$ become:

$$\ddot{x} = \frac{1}{D} \Big[ \beta(m_2l^2 +I)\dot{x} - m_2^2gl^2\theta - (m_2l^2 +I)u\Big]$$

$$\ddot{\theta} = \frac{1}{D} \Big[ \beta m_2l\dot{x} -(m_1 + m_2)m_2gl\theta - m_2lu \Big]$$

$$D = -m_1m_2^2l^2 - (m_1 + m_2)I$$

Step 5: State-Space Model

To review, the state-space model representation is a system of two equations (below). Where A, B, C, and D are matrices that describe the state, input, output, and feedthrough, respectively. In most cases, including the inverted pendulum, D is a zero matrix.

$$\dot{\vec{x}} = A\vec{x}+Bu$$

$$\vec{y} = C\vec{x}+Du$$

Looking at the pair of equations at the end of part 4, we can see that converting the linear system back into matrix form will yield the general state-space form:

$$\begin{bmatrix} \dot{x} \\ \dot{\theta} \\ \ddot{x} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & \frac{m_2^2gl^2}{m_1m_2l^2 + I(m_1 + m_2)} & \frac{- \beta(m_2 l^2+I)}{m_1m_2l^2 + I(m_1 + m_2)} & 0 \\ 0 & \frac{m_2gl(m_1+m_2)}{m_1m_2l^2 + I(m_1 + m_2)} & \frac{-m_2l \beta}{m_1m_2l^2 + I(m_1 + m_2)} & 0 \end{bmatrix} \begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} + \begin{bmatrix} 0\\0\\ \frac{m_2l^2+I}{m_1m_2l^2 + I(m_1 + m_2)} \\ \frac{m_2l}{m_1m_2l^2 + I(m_1 + m_2)} \end{bmatrix} u$$

$$y=\begin{bmatrix} 1 & 0&0&0\\0&1&0&0\end{bmatrix}\begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} +\begin{bmatrix} 0 \\ 0\end{bmatrix}u$$

Part 5: Designing the Optimal Controller for an Inverted Pendulum (LQR control)

In this post, I will derive the optimal controller for the inverted system using control theory, while I aim to cover all the relevant steps this is not designed to be a textbook as such I will be presenting equations and theorems without mathematical proofs. For good online control theory courses, I would recommend MIT OpenCourseWare and Steve Brunton's YouTube series on control theory.

To recap from the previous blog post we found that the state-space model is:

$$\begin{bmatrix} \dot{x} \\ \dot{\theta} \\ \ddot{x} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & \frac{m_2^2gl^2}{m_1m_2l^2 + I(m_1 + m_2)} & \frac{- \beta(m_2 l^2+I)}{m_1m_2l^2 + I(m_1 + m_2)} & 0 \\ 0 & \frac{m_2gl(m_1+m_2)}{m_1m_2l^2 + I(m_1 + m_2)} & \frac{-m_2l \beta}{m_1m_2l^2 + I(m_1 + m_2)} & 0 \end{bmatrix} \begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} + \begin{bmatrix} 0\\0\\ \frac{m_2l^2+I}{m_1m_2l^2 + I(m_1 + m_2)} \\ \frac{m_2l}{m_1m_2l^2 + I(m_1 + m_2)} \end{bmatrix} u$$

$$y=\begin{bmatrix} 1 & 0&0&0\\0&1&0&0\end{bmatrix}\begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} +\begin{bmatrix} 0 \\ 0\end{bmatrix}u$$

Which, more succinctly, can be expressed in the general form:

$$\dot{\vec{x}} = A\vec{x}+Bu$$

$$\vec{y} = C\vec{x}+Du$$

Test for Controllability

To ensure our efforts are not futile it is a good first step to ensure the system we wish to control is controllable to begin with. A system is controllable if for all $\vec{x}_0$, $\vec{x}_1$, and $T>0$ there exists a control input, $u$, such that $\vec{x}(t=0)=\vec{x}_0$ and $\vec{x}(t=T)=\vec{x}_1$. The test for controllability states that if the rank of the matrix $\begin{bmatrix} B & AB & ... & A^{n-1}B\end{bmatrix}$ is $n$ (where A is an $n \times m$ matrix ) then the system is controllable (proof omitted).

The following is controllability matrix evaluated for the following quantities: $m_1=1kg, m_2=0.25kg, g=9.81m \cdot s^{-2}, l=0.5m, \beta = 1 kg \cdot s^{-1}$.

$$\begin{bmatrix} B & AB & ... & A^{n-1}B\end{bmatrix} = \begin{bmatrix} 0 & 0.94 & -0.89 & 3.28 \\ 0 & 1.41 & -1.33 & 25.69 \\ 0.94 & -0.89 & 3.28 & -5.39 \\ 1.41 & -1.33 & 25.69 & -27.63\end{bmatrix}$$

I used the Python command control.ctrb(A, B) to calculate this matrix and the numpy.linalg.matrix_rank command to confirm that the rank of the matrix is 4. Thus the system is controllable.

Test for Stability

While we can intuitively tell that the inverted pendulum is an unstable system, we should still test for it mathematically, in some systems that are not as trivial as the inverted pendulum it is not immediately obvious from inspection if the system is stable or not. There are several tests for stability but I will use the eigenvalue method here. The proof is omitted from this blog post but a system is internally stable if all the real parts of matrix A are negative. That is to say the system is stable iff $Re(\lambda)<0, \quad \forall \lambda$.

Using the Python command, scipy.linalg.eig(A), the eigenvalues are found to be:

$$\begin{bmatrix} 0+0i \\ 4.1 + 0i \\ -4.2+0i \\ -0.8 +0i \end{bmatrix}$$

Thus since some of the eigenvalues have non-negative real parts the system is unstable.

LQR Control

Now that it has been established that the system is controllable but unstable, we can find a suitable control input, $u$, to stabilize the system. Let’s define the control input in terms of the state vector, $\vec{x}$:

$$u=-K(\vec{x}(t) - \vec{x}_{setpoint})$$

Where $\vec{x}_{setpoint} = \begin{bmatrix}0& \pi & 0 &0 \end{bmatrix}^T$ is our desired final state and $K$ is some yet to be defined matrix. Notice that our state space can be rewritten as follows:

$$\dot{\vec{x}} = A\vec{x}+Bu = (A-BK)\vec{x}+K\vec{x}_{setpoint}$$

We now have the freedom to choose the matrix K such that the eigenvalues of $(A-BK)$ lead to stability, as shown in the previous section. Let me choose (somewhat) randomly the matrix values $K = \begin{bmatrix} -100 & 800 & -150 & 200\end{bmatrix}$. If we repeat the stability test for $(A-BK)$ we get the following eigenvalues:

$$\begin{bmatrix} -134.7+0i \\ -4.6 + 0i \\ -1.4+0.5i \\ -1.4 -0.5i \end{bmatrix}$$

Notice that now all the eigenvalues have negative real parts, thus the system is stable. This is great! We now have a control input, $u = -K(x-\vec{x}_{setpoint})$, that if applied to the inverted pendulum system would stabilize. However we can do better, we want to find the optimal control matrix K for the most efficient control.

The procedure to do this is a little involved and to be honest, I don't want to write an entire control theory book. As such I will present the relevant equations and an overview of the steps required to find the optimal matrix K. If you want a deeper understanding of what these equations mean, how they work, and the associated mathematical proofs I advise consulting online control theory courses like the one available through MIT OpenCourseWare.

We can define a cost function $J$ as follows. It will be our task to minimize this cost function, in doing so we will find the optimal control.

$$J(\vec{x}, u) = \int_{0}^{\infty}(x^TQx+u^TRu)dt$$

Where $Q$ is a positive semi-definite matrix and $R$ is positive-definite, both are symmetric. $Q$ is effectively used to express the cost we wish to place on the physical system: i.e. how important is it to get the correct final position, the speed, and aggression at which to move? $R$ is effectively the cost we wish to apply to the input, in this case, the motor, maybe we wish to conserve power, or perhaps we have a powerful motor and don't care how hard we force it to move. A common starting point is to define them as:

$$Q=C^TC=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 &1 & 0 \\ 0& 0& 0& 1 \end{bmatrix} \quad R =\begin{bmatrix} 1 \end{bmatrix} $$

If we wanted to prioritize the final angle, $\theta$, and had a powerful motor to use we could adjust these to:

$$Q=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 &1 & 0 \\ 0& 0& 0& 1 \end{bmatrix} \quad R =\begin{bmatrix} 0.001 \end{bmatrix} $$

Typically these variables are changed by factors of 10. In the next blog post (simulating an inverted pendulum in Python) we will be able to experiment with various $Q$ and $R$ matrices to see the different effects they can have.

By the minimization of the quadratic cost function, $J(\vec{x}, u)$, the optimal controller, $K$, can be expressed as:

$$K = R^{-1}B ^TP$$

Where P is a symmetric positive-definite matrix found from the algebraic Ricatti equation (derivation is omitted):

$$A^TP+PA-PBR ^{-1}B^TP +Q =0$$

To calculate the matrix K this in Python the following commands can be used:

P = np.matrix(linalg.solve_continuous_are(A, B, Q, R))

K = np.matrix(linalg.inv(R)*(B.T*P))

Thus we have found the optimal control input:

$$u = -K(\vec{x}-\vec{x}_{setpoint}) = -R^{-1}B ^TP(\vec{x}-\vec{x}_{setpoint})$$

Finally we can once again check the eigenvalues of $(A-BK)$ and confirm that their real parts are negative:

$$\begin{bmatrix} -53.8+0i \\ -1.1 + 0i \\ -2.8+0.2i \\ -2.8 -0.2i \end{bmatrix}$$

But what is u?

For the last two blog posts I have been referring to the magical mathematical entity $u$, but what is it in the real world? Recall in the second section of the previous blog post I chose to redefine the force applied to the cart, $\vec{F}$, as the control input, $u$. This was predominately to adhere to the standard control theory conventions, if we now go back to the force, we can express the force applied to the cart as the torque, $\tau$, from the motor and gearbox:

$$F=\frac{\tau}{Gr} \quad where \enspace G=\frac{N_2}{N_1}$$

Where $r$ is the armature radius, $G$ is the gear ratio, and $N_1, N_2$ are the number of teeth on the first and second gears, respectively (assuming a 2 piece gear train).

Making the substitution the control input can be expressed in terms of the motor torque, something that we can control via hardware and software by regulating the current applied to the motor.

Part 6: Simulating an Inverted Pendulum with an LQR Controller (Python)

In the previous 2 blog posts, I have explored the state-space model of an inverted pendulum and derived the optimal LQR controller for stabilizing the system. Before going to the trouble of building a physical inverted pendulum let’s first verify that the controller works as expected in simulation.

If you wish to dive right into the code it can be found on my GitHub

Code Overview

To provide an explanation of the code line by line is difficult outside of video format. As such I will touch on the highlights. There are two files: variables.py and simulation.py. If you have read the previous 2 blog posts the variables.py file should be fairly clear, I have just defined all the relevant variables that have been previously discussed. Feel free to adjust the mass of the cart, rod, damping, etc. You can also experiment with different values in the Q and R matrices, as discussed in the previous blog post.

Simulation of inverted pendulum on a cart — Animation of aggressive pendulum recovery. Initial state = [-1; pi-0.5; -5; 0], Final state = [1; pi; 0; 0]. *Note: GIF cuts simulation short at x ~= 0.8, pendulum settles at final state sometime later.*