Writing a physics engine (Part 1.2: What are degrees of freedom?)

Brief context

In the first year of my undergrad, after taking an intro mechanics course, I very naively thought that mechanics was boring and repetitive. I thought that the way to solve any dynamical system problem was to draw a free-body diagram, set up your system of equations, and take it from there. The hardest part is figuring out what goes in the free-body diagram; the rest is just plug and chug.

For many problems, this approach is simple and works great. For example, a block sliding on a ramp is pretty easily solved using free body diagrams.

For many other problems, it can be a lot cleaner and more intuitive to use generalized/reduced coordinates. This is especially true when the motion of some objects is constrained. Here’s an example: the double pendulum. Drawing the free-body diagram is quite annoying.

If we forced ourselves to draw the free-body diagram, we would need to start considering the tension along the ropes. If we start working with triple pendulums or add moving parts, then things get very messy very quickly. But notice that we can represent the entire state of the double pendulum using just two angles.
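To make the "just two angles" point concrete, here is a minimal sketch that recovers the Cartesian positions of both masses from the two angles. The conventions are assumptions made for illustration: the pivot sits at the origin, \(y\) points up, both angles are measured from the downward vertical, and the function name and default lengths are hypothetical.

```python
import math


def double_pendulum_positions(theta1, theta2, l1=1.0, l2=1.0):
    """Return the (x, y) positions of both masses given the two angles.

    Assumed conventions: pivot at the origin, y pointing up, and both
    angles measured from the downward vertical.
    """
    # The first mass hangs from the pivot at distance l1.
    x1 = l1 * math.sin(theta1)
    y1 = -l1 * math.cos(theta1)
    # The second mass hangs from the first mass at distance l2.
    x2 = x1 + l2 * math.sin(theta2)
    y2 = y1 - l2 * math.cos(theta2)
    return (x1, y1), (x2, y2)


# First arm horizontal, second arm hanging straight down.
p1, p2 = double_pendulum_positions(math.pi / 2, 0.0)
print(p1, p2)
```

Four Cartesian numbers come out, but only two numbers go in, and the tensions never appear.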

Before we go further, let’s start with the definition of what degrees of freedom are.

The number of degrees of freedom for a system is the minimum/fewest number of parameters/values needed to capture the entire state of the system.

Generalized/reduced coordinates are the minimal, independent set of parameters needed to fully describe the state of the system.

Here’s the simplest example:

We could define the position of the ball at the end of a pendulum using \((x,y)\), but if we know that the pendulum always moves along an arc of radius \(L\), then we can use one number, \(\theta\), to describe the position. The set of generalized coordinates is \(\{\theta\}\).

Here’s one way to look at this: the position of any free/unconstrained point in 2D can be represented using two degrees of freedom: \((x,y)\). However, adding the constraint \(x^2 + y^2 = L^2\) removes one degree of freedom.
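As a quick sanity check, the sketch below parameterizes the pendulum by \(\theta\) alone and evaluates the constraint \(x^2 + y^2 - L^2\) at a few angles. The function names and the convention that \(\theta\) is measured from the downward vertical are assumptions for illustration.

```python
import math


def pendulum_position(theta, length):
    """Map the single generalized coordinate theta to Cartesian (x, y)."""
    x = length * math.sin(theta)
    y = -length * math.cos(theta)
    return x, y


def constraint_residual(x, y, length):
    """Value of x^2 + y^2 - L^2; zero when the point lies on the circle."""
    return x**2 + y**2 - length**2


L = 2.0
for theta in (0.0, 0.3, 1.2, math.pi):
    x, y = pendulum_position(theta, L)
    # The residual should be zero up to floating-point rounding.
    print(theta, constraint_residual(x, y, L))
```

Whatever value \(\theta\) takes, the point stays on the circle, so the constraint never has to be enforced separately.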

I like to think about it like this. Objects “start out” with the maximum number of degrees of freedom:

A point in 2D has 2 DOFs, a point in 3D has 3 DOFs
A rigid object (e.g., a square) in 2D has 3 DOFs (2 for position, 1 for rotation angle)
A rigid body in 3D has 6 DOFs (3 for position, 3 for rotation)

and constraints remove degrees of freedom.
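This bookkeeping can be sketched in a few lines. The table of unconstrained DOFs comes from the list above; the helper `count_dofs` is hypothetical, and it assumes every constraint is independent and removes exactly one degree of freedom.

```python
# Unconstrained DOFs per object, keyed by (kind, spatial dimension).
FREE_DOFS = {
    ("point", 2): 2,
    ("point", 3): 3,
    ("rigid_body", 2): 3,
    ("rigid_body", 3): 6,
}


def count_dofs(objects, num_constraints):
    """DOFs left after constraints, assuming each independent
    constraint removes exactly one degree of freedom."""
    total = sum(FREE_DOFS[obj] for obj in objects)
    return total - num_constraints


# Simple pendulum: one 2D point, one fixed-length constraint.
print(count_dofs([("point", 2)], 1))  # 1

# Double pendulum: two 2D points, two fixed-length constraints.
print(count_dofs([("point", 2), ("point", 2)], 2))  # 2
```

The results match the earlier examples: one angle for the pendulum and two angles for the double pendulum.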

Why do we want to use generalized coordinates?

It’s often the more intuitive decision to use generalized coordinates when you can. Here’s a good example: the MuJoCo ant.

There are around 14 bodies in the MuJoCo ant (1 torso and 3 segments for each of the 4 legs). Creating a free-body diagram for this would be a nightmare.

However, there are only 14 DOFs in the ant (6 for the root body and 2 for each of the 4 legs). Especially when we deal with kinematic trees, computations become more intuitive in generalized coordinates. Generalized coordinates also ensure that there are no joint violations (i.e., motion can only occur along the prescribed degrees of freedom).
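To get a feel for the savings, here is a rough back-of-the-envelope comparison. It assumes 4 legs, treats every body as a free 3D rigid body with 6 DOFs before constraints, and uses a hypothetical helper name; the body count is only the approximate tally described above, not something read from the actual model file.

```python
def ant_dof_comparison(num_legs=4, segments_per_leg=3, joint_dofs_per_leg=2):
    """Compare per-body coordinates against generalized coordinates.

    Assumption: every body is a free 3D rigid body (6 DOFs) before
    any joint constraints are applied.
    """
    num_bodies = 1 + num_legs * segments_per_leg  # torso + leg segments
    per_body_coords = 6 * num_bodies              # every body treated as free
    generalized = 6 + joint_dofs_per_leg * num_legs  # root body + leg joints
    return num_bodies, per_body_coords, generalized


bodies, per_body, generalized = ant_dof_comparison()
print(bodies, per_body, generalized)  # 13 78 14
```

Under these assumptions, tracking each body separately needs 78 numbers plus 64 constraint equations to tie them together, while the generalized description needs only 14 numbers and no constraint equations.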