A Careful Control Loop
Differentiable physics is one lens on dynamics and simulation math. We study it because an embodied agent needs decisions that survive contact with noisy sensors, delayed effects, and changing environments.
This section develops the technical contract for differentiable physics into a usable mental model. First we define the object of study, then we connect it to the agent loop, then we test it with a compact implementation.
The key question is practical: what must the agent know, what can it observe, what action is available, and what evidence shows that the action worked under the stated conditions?
A representation earns its place when it changes the measurable action interface. Throughout this section, keep asking which decision differentiable physics makes easier, safer, or more reliable.
Theory
The practical design rule for differentiable physics is to make the interface inspectable before optimization begins: inputs, outputs, units, latency, bounds, and failure labels should all be visible in the saved artifact.
The mechanism is the contract between representation and action. Name what enters the module, what leaves it, which assumptions make that transformation valid, and which log would reveal a bad handoff.
Worked Example: Backprop Through a Rollout, Validated Against Finite Differences
A differentiable simulator exposes a discrete update $x_{k+1} = f_\Delta(x_k, u_k, \theta)$ and a scalar loss $L = \ell(x_T)$, then lets you compute sensitivities such as $\partial L / \partial u_k$ or $\partial L / \partial \theta$ by the chain rule through every step. The example optimizes the launch velocity of a projectile so that it reaches a target horizontal position, by differentiating the terminal-position loss through a symplectic-Euler rollout. The rollout runs for a fixed horizon and does not detect ground contact, so the loss depends only on the horizontal position at the final step. The non-negotiable discipline is to validate the analytic gradient against a centered finite difference before trusting it for optimization.
For a horizon of $T$ steps the reverse-mode gradient accumulates local Jacobians, $\tfrac{\partial L}{\partial x_0} = \big(\prod_{k} \tfrac{\partial x_{k+1}}{\partial x_k}\big)^\top \tfrac{\partial \ell}{\partial x_T}$. Here we differentiate by hand to keep the mechanism visible, then check it numerically.
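As a worked instance of this product, consider the projectile rollout used in this example. The symbols $s_k = (x_k, y_k, v_{x,k}, v_{y,k})$ for the stacked state, $\Delta t$ for the timestep, and $x^\star$ for the target are notation assumed here for illustration. Gravity enters the update as an additive constant, so it drops out of the step Jacobian, which is the same matrix at every step:

$$
A = \frac{\partial s_{k+1}}{\partial s_k} =
\begin{pmatrix} 1 & 0 & \Delta t & 0 \\ 0 & 1 & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
= I + \Delta t\, N, \qquad N^2 = 0 .
$$

Because $N^2 = 0$, the product of $T$ copies of $A$ collapses to $I + T\,\Delta t\, N$, so $\partial x_T / \partial v_{x,0} = T\,\Delta t$ and $\partial x_T / \partial v_{y,0} = 0$. With $\ell = \tfrac{1}{2}(x_T - x^\star)^2$ this gives

$$
\frac{\partial L}{\partial v_{x,0}} = (x_T - x^\star)\, T\,\Delta t, \qquad \frac{\partial L}{\partial v_{y,0}} = 0 .
$$

For $T\,\Delta t = 3$ s, $x^\star = 6$, and $v_{x,0} = 2.5$, the final position is $x_T = 7.5$ and the gradient is $1.5 \times 3 = 4.5$ in the $v_x$ direction and zero in the $v_y$ direction.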
```python
import numpy as np

g = 9.81
dt, T = 0.01, 300   # 3 s rollout
target_x = 6.0      # target horizontal position at the end of the rollout


def rollout(v0):
    """Symplectic-Euler projectile from the origin. Returns final x and loss.

    The horizon is fixed at T steps; ground contact at y = 0 is not modeled.
    """
    vx, vy = v0
    x, y = 0.0, 0.0
    for _ in range(T):
        vy = vy - g * dt   # gravity on vertical velocity
        x = x + vx * dt
        y = y + vy * dt
    loss = 0.5 * (x - target_x)**2
    return x, loss


def analytic_grad(v0):
    """d loss / d v0 by differentiating the closed-form rollout.

    With constant vx: x_T = vx * (T*dt). Loss = 0.5*(x_T - target)^2.
    """
    vx, vy = v0
    x_T = vx * (T * dt)
    err = x_T - target_x
    dloss_dvx = err * (T * dt)   # dx_T/dvx = T*dt
    dloss_dvy = 0.0              # final x is independent of vy at a fixed horizon
    return np.array([dloss_dvx, dloss_dvy])


def finite_diff_grad(v0, eps=1e-6):
    """Centered finite difference of the rollout loss, one component at a time."""
    grad = np.zeros(2)
    for i in range(2):
        vp = v0.copy(); vp[i] += eps
        vm = v0.copy(); vm[i] -= eps
        grad[i] = (rollout(vp)[1] - rollout(vm)[1]) / (2 * eps)
    return grad


v0 = np.array([2.5, 1.0])
print("analytic grad    :", analytic_grad(v0))
print("finite-diff grad :", finite_diff_grad(v0))
print("max abs diff     :", np.max(np.abs(analytic_grad(v0) - finite_diff_grad(v0))))

# Gradient descent on v0 to reach the target position.
v = np.array([2.5, 1.0]); lr = 0.05
for step in range(200):
    v = v - lr * analytic_grad(v)
x_final, loss = rollout(v)
print(f"optimized vx={v[0]:.4f} final x={x_final:.4f} loss={loss:.2e}")
```
The analytic and finite-difference gradients agree to within numerical tolerance, which is the green light to optimize: gradient descent then drives the final horizontal position onto the target. This rollout is smooth, so the gradient is exact. The cautionary half of the lesson is that adding a contact event (a bounce, a grasp, a foot strike) makes $f_\Delta$ only piecewise smooth, and the same finite-difference check will then expose where the autodiff gradient is biased by contact smoothing or undefined across a mode boundary. Run the check on one smooth case and one contact case before optimizing a long horizon.
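To make the contact caveat concrete, the following is a minimal sketch of a finite-difference check across a mode boundary. The scenario is an assumption introduced for illustration, not part of the example above: a one-dimensional point moves toward a wall and is reflected with a restitution factor on the step where it crosses. The function names `final_position` and `centered_difference`, the reflection rule, and all parameter values are hypothetical.

```python
def final_position(v0, dt=0.01, steps=100, wall=1.0, restitution=0.5):
    """Roll a 1D point toward a wall; reflect it on the step where it crosses."""
    x, v = 0.0, v0
    for _ in range(steps):
        x = x + v * dt
        if x > wall:
            # Hypothetical contact rule: mirror the overshoot, scaled by restitution.
            x = wall - restitution * (x - wall)
            v = -restitution * v
    return x


def centered_difference(f, v0, eps=1e-3):
    """Centered finite difference of f at v0."""
    return (f(v0 + eps) - f(v0 - eps)) / (2.0 * eps)


slope_free = centered_difference(final_position, 0.8)      # never reaches the wall
slope_contact = centered_difference(final_position, 1.2)   # bounces once
slope_boundary = centered_difference(final_position, 1.0)  # straddles the mode switch

print(f"free-motion slope : {slope_free:+.4f}")
print(f"contact slope     : {slope_contact:+.4f}")
print(f"straddling slope  : {slope_boundary:+.4f}")
```

Under these assumptions the slope is about +1.0 in free motion and about -0.5 after a bounce, while the difference taken across the boundary comes out near +0.25, which is the derivative of neither mode. A gradient-based optimizer started near that boundary would receive a direction that no single contact mode supports.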
The hand-built fragment exposes the physical assumption before maintained tools take over. MuJoCo, MJX, Drake, Pinocchio, and Isaac Lab are useful only when the same mass, contact, actuator, and timestep contract is preserved.
Practical Recipe
- Write the observation, action, and success metric before choosing a model.
- Build a baseline that is simple enough to debug by inspection.
- Add the library implementation only after the baseline behavior is understood.
- Record failures as structured cases: perception error, state error, planning error, control error, or evaluation error.
- Run at least one perturbation test before trusting the result.
The common mistake with differentiable physics is to celebrate the component score before checking the closed-loop handoff. The failure usually appears at the boundary: stale state, wrong frame, delayed action, saturated actuator, or a metric that ignores the real task cost.
A robotics team using differentiable physics should log not only final success but also intermediate observations, chosen actions, controller status, and recovery events. The logs reveal whether the method is solving the task or merely passing the easiest episodes.
A good embodied system makes differentiable physics visible twice: once in the design sketch and once in the replay artifact. The second view keeps the first one honest.
Treat frontier claims about differentiable physics as hypotheses until they expose enough detail to reproduce the result: data boundary, embodiment, controller interface, evaluation panel, and failure cases.
Can you name the observation, state estimate, action, success metric, and most likely failure mode for your differentiable-physics pipeline? If not, the system boundary is still too vague.
Production Pattern
Differentiable physics sits inside the Part II robotics contract: geometry defines where things are, kinematics defines what motion is possible, dynamics defines what motion costs, control defines how errors are corrected, and sensing defines what the agent can know on time.
Use differentiable physics when gradients answer a design question, and validate them against finite differences. This makes the section useful to students, builders, and researchers at the same time: the idea has an intuitive role, a formal interface, a runnable check, and a failure mode that can be reproduced.
Differentiable physics is useful when the question is not only "what happened?" but "which parameter, action, or design choice would make the outcome better?" Gradients can tune masses, friction coefficients, control sequences, morphology parameters, or policy inputs, but only if the gradient is a faithful derivative of the simulation the reader intends to trust.
A rollout loss gives one number at the end of a trajectory. Differentiation distributes that loss backward through time, assigning sensitivity to earlier states, actions, and physical parameters. The power is direct optimization; the risk is that contacts, discontinuities, and solver smoothing can assign blame to the wrong modeled cause.
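As a sketch of how a terminal loss is distributed backward through time, the code below runs a hand-written reverse pass over a small rollout and checks it against centered finite differences. The system is an assumption chosen for brevity: a unit-mass point in one dimension driven by a force sequence, integrated with the same velocity-then-position update order as the projectile example. The names `rollout`, `control_gradient`, and `finite_difference` are hypothetical.

```python
def rollout(controls, dt=0.1, target=1.0):
    """Integrate a unit-mass 1D point under a force sequence; return (loss, final x)."""
    x, v = 0.0, 0.0
    for u in controls:
        v = v + u * dt
        x = x + v * dt
    return 0.5 * (x - target) ** 2, x


def control_gradient(controls, dt=0.1, target=1.0):
    """Reverse pass: propagate adjoints from the final state back to each control."""
    _, x_final = rollout(controls, dt, target)
    adj_x = x_final - target   # dL/dx_T
    adj_v = 0.0                # dL/dv_T (the loss ignores final velocity)
    grads = [0.0] * len(controls)
    for k in reversed(range(len(controls))):
        adj_v = adj_v + adj_x * dt   # x_{k+1} depends on v_{k+1} with factor dt
        grads[k] = adj_v * dt        # v_{k+1} depends on u_k with factor dt
        # adj_x and adj_v pass unchanged to step k: both state Jacobian diagonals are 1.
    return grads


def finite_difference(controls, k, eps=1e-5):
    """Centered finite difference of the loss with respect to control k."""
    plus = list(controls); plus[k] += eps
    minus = list(controls); minus[k] -= eps
    return (rollout(plus)[0] - rollout(minus)[0]) / (2.0 * eps)


controls = [0.5] * 20
grads = control_gradient(controls)
checks = [finite_difference(controls, k) for k in range(len(controls))]
max_gap = max(abs(a - b) for a, b in zip(grads, checks))

print(f"dL/du_0  = {grads[0]:.6f}")
print(f"dL/du_19 = {grads[-1]:.6f}")
print(f"max gap vs finite differences = {max_gap:.2e}")
```

In this sketch the earliest control receives the largest sensitivity because it influences every later velocity and position, which is the sense in which differentiation assigns credit to earlier actions.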
Dynamics adds the causes of motion: forces, torques, inertia, contact impulses, and integration. Keep units, solver step, contact parameters, and energy behavior visible.
| Tool or Library | What It Handles | Verification Check |
|---|---|---|
| MuJoCo | runs articulated dynamics and contact simulation for robot learning experiments | Verify timestep, solver parameters, contact settings, and reset semantics. |
| MJX | runs MuJoCo articulated dynamics and contact simulation in JAX on accelerators for robot learning experiments | Verify timestep, solver parameters, contact settings, and reset semantics. |
| Drake | models dynamical systems, multibody plants, optimization, and controllers | Verify scalar type, plant finalization, frame convention, and solver status. |
| Pinocchio | computes articulated-body kinematics, dynamics, and derivatives | Verify model frames, joint ordering, and derivative convention against the URDF. |
| Isaac Lab | scales robot-learning simulation with GPU workflows and sensor-rich scenes | Verify environment parity, reset distribution, and logged seeds before training. |
Use this recipe when turning differentiable physics into code, a simulator experiment, or a robot diagnostic. The point is not to use every library. The point is to keep the hand-built baseline and the maintained-tool path comparable.
- Specify mass, inertia, actuator limits, contact model, timestep, and solver tolerance before running a rollout.
- Run one free-motion test and one contact test with logged energy, constraint violation, and penetration depth.
- Compare the hand calculation with MuJoCo, Drake, Pinocchio, or MJX on the same model and timestep.
- Store solver settings, random seed, initial state, trajectory, and failure labels in one artifact.
- Scale to Isaac Lab or GPU-parallel simulation only after a small model passes deterministic checks.
Compare methods only through one saved artifact that preserves the inputs, outputs, units, timestamps, latency budget, configuration, seed, metric definition, and failure labels relevant to this section. The comparison is meaningful only when the same script evaluates the same panel.
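One possible shape for such an artifact is sketched below. The class name `RolloutArtifact` and every field name are hypothetical and are not taken from any published schema; the values mirror the initial guess of the projectile example and serve only as placeholders.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RolloutArtifact:
    """Hypothetical record of one rollout; field names are illustrative only."""
    simulator: str
    timestep_s: float
    horizon_steps: int
    seed: int
    initial_state: list
    units: dict
    metric_name: str
    metric_value: float
    gradient_norm: float
    solver_settings: dict = field(default_factory=dict)
    failure_labels: list = field(default_factory=list)


artifact = RolloutArtifact(
    simulator="hand-built symplectic Euler",
    timestep_s=0.01,
    horizon_steps=300,
    seed=0,
    initial_state=[0.0, 0.0, 2.5, 1.0],   # x, y, vx, vy
    units={"position": "m", "velocity": "m/s", "time": "s"},
    metric_name="terminal_position_loss",
    metric_value=1.125,
    gradient_norm=4.5,
)

# Serialize with sorted keys so two artifacts can be diffed line by line.
encoded = json.dumps(asdict(artifact), sort_keys=True, indent=2)
restored = RolloutArtifact(**json.loads(encoded))
print(encoded)
```

Keeping the record as plain JSON means the hand-built baseline and a library run can write the same fields and be compared by one evaluation script.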
Extend the section exercise by adding one perturbation specific to differentiable physics and one latency or uncertainty check. Save the result in the EvidenceRecord schema, then explain which library output you trust and why.
Distrust smooth simulation until the physical assumptions have been stress-tested: timestep, contact stiffness, damping, friction, actuation, and energy behavior should each have a small diagnostic.
Technical Core
Differentiable physics needs a topic-native core: variables, equations or system contracts, an algorithmic procedure, an expected output, and a failure diagnosis. Figure 6.5.T summarizes the chain this section must preserve when moving from a teaching example to a real embodied system.
A differentiable simulator exposes $x_{k+1}=f_\Delta(x_k,u_k,\theta)$ and a loss $L=\ell(x_T)$, then computes sensitivities such as $\partial L/\partial u_k$ or $\partial L/\partial \theta$. Backpropagation through time multiplies local Jacobians across the rollout, so long horizons can create vanishing or exploding gradients. Contact events are often only piecewise smooth, which means a gradient can be undefined, regularized, or valid only inside one contact mode.
- State what is differentiated: controls, physical parameters, initial state, morphology, or policy inputs.
- Compare automatic gradients with centered finite differences on a short rollout before optimizing a long one.
- Run one smooth free-motion case and one contact case, then report where gradients disagree or become noisy.
- Log smoothing, contact regularization, solver tolerance, horizon length, and gradient norm beside the final loss.
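The checklist above can be sketched on the simplest system that shows the horizon effect. The scalar dynamics $x_{k+1} = a\,x_k$ with per-step gain $a$ are an assumption for illustration, as are the function names; the gradient is checked against a centered finite difference on a short rollout, then its magnitude is logged as the horizon grows.

```python
def terminal_loss(x0, gain, steps):
    """Loss 0.5 * x_T^2 after repeatedly applying x <- gain * x."""
    x = x0
    for _ in range(steps):
        x = gain * x
    return 0.5 * x * x


def backprop_gradient(x0, gain, steps):
    """dL/dx0 by a forward pass followed by one Jacobian multiply per step."""
    x = x0
    for _ in range(steps):
        x = gain * x
    adjoint = x                    # dL/dx_T
    for _ in range(steps):
        adjoint = gain * adjoint   # local Jacobian dx_{k+1}/dx_k = gain
    return adjoint


def centered_difference(x0, gain, steps, eps=1e-6):
    """Centered finite difference of the terminal loss with respect to x0."""
    up = terminal_loss(x0 + eps, gain, steps)
    down = terminal_loss(x0 - eps, gain, steps)
    return (up - down) / (2.0 * eps)


# Step 1: validate on a short horizon before trusting a long one.
short_gap = abs(backprop_gradient(1.0, 1.1, 10) - centered_difference(1.0, 1.1, 10))
print(f"short-horizon gap vs finite differences: {short_gap:.2e}")

# Step 2: log gradient magnitude beside horizon length.
norms = {}
for gain in (0.9, 1.1):
    for steps in (10, 50, 100):
        norms[(gain, steps)] = abs(backprop_gradient(1.0, gain, steps))
        print(f"gain={gain} horizon={steps:3d} |dL/dx0|={norms[(gain, steps)]:.3e}")
```

With a contracting gain the gradient shrinks toward zero as the horizon grows, and with an expanding gain it grows by many orders of magnitude, so a gradient norm logged next to the horizon length tells you which regime an optimization is in.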
| Contract Field | What To Specify | Why It Matters |
|---|---|---|
| State and observation | Variables, units, timestamps, frames, and uncertainty. | Prevents a model score from being mistaken for robot capability. |
| Action interface | Command type, limits, update rate, and safety fallback. | Makes the learned or planned output executable. |
| Evidence artifact | Trace, metric, configuration, seed, and failure label. | Allows baseline and library path to be compared in one pass. |
| Tool path | MuJoCo, Drake, Isaac Sim, Gazebo, PyBullet, SAPIEN, NumPy | Shows the practical library route after the mechanism is understood. |
The expected output is a state trace with the relevant physical invariant: bounded energy error for free motion, bounded penetration for contact, and a solver-status field that explains divergence.
A differentiable-physics result is validated by conserved quantities where they should hold, stable contact where contact is expected, and reproducible divergence under a named parameter perturbation.
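A conserved-quantity diagnostic can be as small as the sketch below. It assumes a unit-mass, unit-stiffness spring, which is not a system from this section, and compares the energy drift of explicit Euler with that of the symplectic (velocity-first) Euler update. The function names are hypothetical.

```python
def energy(x, v):
    """Total energy of a unit-mass, unit-stiffness oscillator."""
    return 0.5 * v * v + 0.5 * x * x


def explicit_euler(x, v, dt):
    """Update position and velocity from the old state only."""
    return x + v * dt, v - x * dt


def symplectic_euler(x, v, dt):
    """Update velocity first, then position with the new velocity."""
    v_next = v - x * dt
    return x + v_next * dt, v_next


def max_relative_energy_error(step_fn, dt=0.01, steps=10000):
    """Largest |E - E0| / E0 observed along the trajectory."""
    x, v = 1.0, 0.0
    e0 = energy(x, v)
    worst = 0.0
    for _ in range(steps):
        x, v = step_fn(x, v, dt)
        worst = max(worst, abs(energy(x, v) - e0) / e0)
    return worst


drift_explicit = max_relative_energy_error(explicit_euler)
drift_symplectic = max_relative_energy_error(symplectic_euler)
print(f"explicit Euler   max relative energy error: {drift_explicit:.3e}")
print(f"symplectic Euler max relative energy error: {drift_symplectic:.3e}")
```

For this oscillator the explicit update multiplies the energy by $1 + \Delta t^2$ on every step, so the error grows without bound, while the symplectic update keeps the error within a band proportional to the timestep. A trace that logs this number gives a concrete pass or fail threshold for the free-motion case.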
Section References
Core references for this section: Modern Robotics; Murray, Li, and Sastry; Siciliano et al.; LaValle; and the official documentation for Drake, MuJoCo, Pinocchio, CasADi, python-control, GTSAM, ROS 2, and OpenCV as applicable.
Use these references to check notation, frame conventions, solver assumptions, and library behavior before comparing hand-built and maintained-tool implementations.
Differentiable physics is useful when it makes the perception-action loop more reliable, not when it merely adds a more impressive model name.
Design a method-matched experiment for differentiable physics. Specify the environment, observations, actions, metric, one perturbation, and the library output you would compare against the hand-built baseline.