"The first bug in a robot simulator is usually a promise nobody wrote down."
A Constraint-Checking AI Agent
Before choosing MuJoCo, Isaac Lab, Genesis, or any other simulator, we need to know what a physics simulator actually promises. It updates a state through time by applying forces, constraints, contacts, and actuator commands, then exposes part of that state as observations for the agent loop introduced in Chapter 2.
This section connects rigid-body and contact math to simulator design choices: integrator, contact solver, constraint stabilization, and sensor abstraction. It prepares later GPU workflows by making the modeled physics explicit.
The Physics Contract
A simulator answers a question: given the current state $s_t$, an action $a_t$, and a time step $\Delta t$, what state should come next? For rigid-body robotics, the state usually includes body poses, velocities, joint positions, joint velocities, actuator states, and contacts. The update can be summarized as:
$$s_{t+1} = \operatorname{step}(s_t, a_t, \Delta t, \theta)$$
The parameters $\theta$ contain masses, inertias, joint limits, friction coefficients, damping, solver tolerances, sensor noise, and contact settings. This equation looks compact, but each term is a modeling choice that can decide whether a learned policy transfers to reality.
The solver settings deserve special attention because they are not cosmetic. Timestep, integrator choice, constraint softness, contact margin, and solver iteration count decide how much penetration, bounce, jitter, and energy drift the simulator permits. A good experiment records these values next to the policy checkpoint because changing them can change the learned behavior as much as changing a reward term.
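One way to make that record concrete is to store the solver settings as a small structured object and attach a fingerprint of it to the checkpoint metadata. The sketch below is illustrative only: `PhysicsConfig`, its field names, the integrator label, and the checkpoint filename are hypothetical and do not correspond to any particular simulator's API.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class PhysicsConfig:
    """Hypothetical record of the solver settings named in the text."""
    timestep_s: float
    integrator: str
    solver_iterations: int
    contact_margin_m: float
    constraint_softness: float


def config_fingerprint(config: PhysicsConfig) -> str:
    """Return a short, stable hash of the settings for checkpoint metadata."""
    payload = json.dumps(asdict(config), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Illustrative values, not recommendations for any engine.
baseline = PhysicsConfig(
    timestep_s=0.002,
    integrator="semi_implicit_euler",
    solver_iterations=50,
    contact_margin_m=0.001,
    constraint_softness=0.0,
)
# Changing a single setting yields a different fingerprint.
changed = replace(baseline, timestep_s=0.004)

checkpoint_metadata = {
    "policy_checkpoint": "policy_step_10000.pt",  # hypothetical filename
    "physics": asdict(baseline),
    "physics_fingerprint": config_fingerprint(baseline),
}
print(json.dumps(checkpoint_metadata, indent=2))
print("fingerprint changed:", config_fingerprint(baseline) != config_fingerprint(changed))
```

Storing the fingerprint beside the checkpoint makes it cheap to detect that two evaluation runs used different physics even when the policy weights are identical.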
| Component | What it means | Why it matters for action | Common failure |
|---|---|---|---|
| Bodies | Rigid or deformable objects with mass and inertia | Determines acceleration, impact, and balance | Incorrect inertia makes policies learn unrealistic motion |
| Joints | Allowed motion between bodies | Defines the action space and robot kinematics | Wrong axis or limit creates impossible behaviors |
| Contacts | Collision points and constraint forces | Controls grasping, walking, pushing, and sliding | Contact jitter hides real failure modes |
| Friction | Resistance to sliding or rolling | Decides whether a grasp holds or a foot slips | Over-tuned friction makes sim easier than reality |
| Sensors | Observed signals derived from state or rendering | Creates the partial observability seen by the agent | Perfect sensors inflate evaluation scores |
The most useful simulator is not the one with the longest feature list. It is the one whose simplifications you can name, measure, and include in the evaluation plan. This is the bridge from Chapter 9 to the transfer work in Chapter 13.
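The sensor model is one simplification that is easy to name and measure. As a minimal sketch, assume a sensor that adds Gaussian noise and a fixed delay (in simulation steps) to one state variable. The class name `NoisyDelayedSensor` and its parameters are hypothetical; real simulators expose their own sensor abstractions.

```python
import random
from collections import deque


class NoisyDelayedSensor:
    """Hypothetical sensor: Gaussian noise plus a fixed delay in steps."""

    def __init__(self, noise_std, delay_steps, seed=0):
        self.noise_std = noise_std
        self.rng = random.Random(seed)
        # The buffer keeps the most recent delay_steps + 1 true values.
        self.buffer = deque(maxlen=delay_steps + 1)

    def observe(self, true_value):
        """Return the oldest buffered value, corrupted by noise."""
        self.buffer.append(true_value)
        delayed = self.buffer[0]
        return delayed + self.rng.gauss(0.0, self.noise_std)


# Made-up true heights of a falling block, one per simulation step.
true_heights = [0.20, 0.19, 0.17, 0.14, 0.10]

perfect = NoisyDelayedSensor(noise_std=0.0, delay_steps=0)
delayed = NoisyDelayedSensor(noise_std=0.0, delay_steps=2)
noisy = NoisyDelayedSensor(noise_std=0.005, delay_steps=2, seed=42)

perfect_obs = [perfect.observe(h) for h in true_heights]
delayed_obs = [delayed.observe(h) for h in true_heights]
noisy_obs = [noisy.observe(h) for h in true_heights]

for h, p, d, n in zip(true_heights, perfect_obs, delayed_obs, noisy_obs):
    print(f"true={h:.3f} perfect={p:.3f} delayed={d:.3f} noisy={n:.3f}")
```

Until the buffer fills, the delayed sensor keeps reporting the first value it saw, which is itself a modeling choice worth writing down.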
A Minimal Contact Stepper
Code Fragment 1 below implements a tiny one-dimensional contact model in Python. It is not a replacement for a simulator. Its purpose is to make the contract visible: integration, gravity, and floor contact are separate modeling decisions. Friction is not modeled here; the exercise later in this section adds it.
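```python
# Minimal physics stepper: integrate a falling block with floor contact.
# NumPy is imported for array-based extensions; this scalar version uses plain floats.
# The output shows how gravity changes the state; the contact branch applies
# only once the block reaches the floor.
import numpy as np

height = 0.20         # initial height above the floor (m)
velocity = -0.10      # initial vertical velocity (m/s), negative is downward
dt = 0.02             # time step (s)
gravity = -9.81       # gravitational acceleration (m/s^2)
restitution = 0.25    # fraction of speed kept after impact
floor_height = 0.0

trajectory = []
for step in range(8):
    # Semi-implicit Euler: update velocity first, then position.
    velocity = velocity + gravity * dt
    height = height + velocity * dt
    # Contact model: clamp to the floor and reflect the velocity.
    if height < floor_height:
        height = floor_height
        velocity = -restitution * velocity
    trajectory.append((step, round(height, 4), round(velocity, 4)))

for row in trajectory:
    print(row)
```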
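```text
(0, 0.1941, -0.2962)
(1, 0.1842, -0.4924)
(2, 0.1705, -0.6886)
(3, 0.1528, -0.8848)
(4, 0.1311, -1.081)
(5, 0.1056, -1.2772)
(6, 0.0761, -1.4734)
(7, 0.0427, -1.6696)
```

The block is still in free fall after these 8 steps. With these values the floor branch first fires on step 9, so the loop needs at least 10 iterations to show the contact response.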
Code Fragment 1 uses about 20 lines to teach the state update. In practice, MuJoCo reduces the same rigid-body stepping pattern to a model load plus repeated mj_step calls, while handling articulated bodies, constraints, contacts, actuator dynamics, sensors, and solver settings internally. The hand-built stepper remains useful because it teaches what to inspect when a full simulator behaves strangely.
Contacts Are The Hard Part
Free-space motion is usually not where robot simulation becomes surprising. Contact is harder because the simulator must prevent interpenetration, estimate collision points, solve constraint forces, approximate friction cones, and remain numerically stable. A grasp that succeeds in simulation because friction is too generous is not a robot skill. It is a measurement error.
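The grasp example can be made quantitative with the simplest static Coulomb check. The sketch below assumes a two-finger parallel grasp with vertical pads, equal normal force at each contact, point contacts, and no torque, so the object holds when the total friction limit covers its weight: $n \mu N \geq m g$. The function names and the numbers are illustrative assumptions, not measurements.

```python
def grasp_holds(mass_kg, grip_force_n, friction_coeff, gravity=9.81, contacts=2):
    """Static Coulomb check: the friction limit must cover the object's weight."""
    max_friction_n = contacts * friction_coeff * grip_force_n
    return max_friction_n >= mass_kg * gravity


def min_friction_coeff(mass_kg, grip_force_n, gravity=9.81, contacts=2):
    """Smallest friction coefficient for which the grasp holds."""
    return mass_kg * gravity / (contacts * grip_force_n)


mass_kg = 0.5        # illustrative object mass
grip_force_n = 5.0   # illustrative normal force per finger

print("minimum friction coefficient:", round(min_friction_coeff(mass_kg, grip_force_n), 4))
for mu in (0.3, 0.5, 0.8, 1.0):
    print(f"mu={mu:.1f} holds={grasp_holds(mass_kg, grip_force_n, mu)}")
```

Under these assumptions the grasp needs a friction coefficient of about 0.49. A simulated value of 0.8 would report success for a pad material that in reality offers 0.3 and would drop the object.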
Most engines therefore expose a contact model rather than a single truth about contact. A stiff contact setting can reduce visible penetration but create jitter or smaller stable timesteps. A softer setting can make training smoother but hide impact failures. The validation question is concrete: if friction, restitution, timestep, and solver iterations move within plausible ranges, does the same policy still satisfy the task metric?
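The validation question can be sketched as a small parameter sweep over the one-dimensional drop from Code Fragment 1. Here the "policy" is replaced by a fixed drop, and the task metric is a hypothetical pair of thresholds on floor penetration and rebound height. The function `simulate_drop`, the thresholds, and the swept values are assumptions chosen for illustration.

```python
def simulate_drop(dt, restitution, duration_s=1.0, height=0.20,
                  velocity=-0.10, gravity=-9.81):
    """Return (max_penetration_m, rebound_peak_m) for the 1-D block drop."""
    max_penetration = 0.0
    rebound_peak = 0.0
    has_bounced = False
    for _ in range(int(round(duration_s / dt))):
        velocity += gravity * dt
        height += velocity * dt
        if height < 0.0:
            # Record how far the block sank before the clamp corrected it.
            max_penetration = max(max_penetration, -height)
            height = 0.0
            velocity = -restitution * velocity
            has_bounced = True
        elif has_bounced:
            rebound_peak = max(rebound_peak, height)
    return max_penetration, rebound_peak


# Hypothetical task metric: small penetration and a low rebound.
MAX_PENETRATION_M = 0.01
MAX_REBOUND_M = 0.05

results = []
for dt in (0.002, 0.01, 0.02):
    for restitution in (0.1, 0.25, 0.5):
        penetration, rebound = simulate_drop(dt, restitution)
        passed = penetration <= MAX_PENETRATION_M and rebound <= MAX_REBOUND_M
        results.append((dt, restitution, penetration, rebound, passed))

print("dt      e     penetration  rebound  pass")
for dt, restitution, penetration, rebound, passed in results:
    print(f"{dt:<7} {restitution:<5} {penetration:<12.4f} {rebound:<8.4f} {passed}")
```

A table like this shows which parameter ranges the metric survives. If the pass column flips inside the plausible range, the result depends on the simulator settings rather than on the behavior being tested.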
Choose the simulator by the physical mechanism being tested: contact stiffness, friction cone behavior, actuator limits, sensor timing, and reset semantics matter more than benchmark reputation.
If a policy only works when contact parameters are tuned to a narrow value, the simulator is not validating the policy. It is training the policy to exploit the simulator. This is why later chapters treat domain randomization, system identification, and real-world evaluation as part of one workflow.
Practical Recipe
- Write down the state variables the simulator owns and the observations the agent receives.
- Identify the contacts and interactions that decide success: gripper-object, foot-ground, wheel-floor, drone-air, or tool-surface.
- Record friction, restitution, damping, contact margin, timestep, integrator, and solver tolerances in the experiment config.
- Run one perturbation sweep over each critical contact or solver parameter before trusting a result.
- Keep logs that separate physics failure, perception failure, policy failure, and evaluation failure.
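The last recipe item can be sketched as a tagging step applied to failed episodes before any aggregate statistic is computed. The field names, thresholds, and ordering of the rules below are hypothetical; a real project would replace them with checks derived from its own simulator and sensors.

```python
import math
from collections import Counter
from enum import Enum


class FailureKind(Enum):
    PHYSICS = "physics"
    PERCEPTION = "perception"
    POLICY = "policy"
    EVALUATION = "evaluation"


def classify_failure(episode):
    """Assign one failure kind to a failed episode using illustrative rules."""
    # Physics: non-finite state or penetration beyond a hypothetical tolerance.
    if not math.isfinite(episode["final_height_m"]) or episode["max_penetration_m"] > 0.01:
        return FailureKind.PHYSICS
    # Perception: the observation drifted too far from the true state.
    if episode["observation_error_m"] > 0.02:
        return FailureKind.PERCEPTION
    # Evaluation: the automatic metric disagrees with a manual label.
    if episode["metric_success"] != episode["manual_success"]:
        return FailureKind.EVALUATION
    # Otherwise attribute the failure to the policy itself.
    return FailureKind.POLICY


# Made-up failed episodes, one per category.
failed_episodes = [
    {"final_height_m": float("nan"), "max_penetration_m": 0.0,
     "observation_error_m": 0.0, "metric_success": False, "manual_success": False},
    {"final_height_m": 0.0, "max_penetration_m": 0.002,
     "observation_error_m": 0.05, "metric_success": False, "manual_success": False},
    {"final_height_m": 0.0, "max_penetration_m": 0.002,
     "observation_error_m": 0.001, "metric_success": False, "manual_success": True},
    {"final_height_m": 0.0, "max_penetration_m": 0.002,
     "observation_error_m": 0.001, "metric_success": False, "manual_success": False},
]

tally = Counter(classify_failure(e).value for e in failed_episodes)
for kind in FailureKind:
    print(f"{kind.value:<11} {tally[kind.value]}")
```

The rule order matters: checking physics first prevents a solver blow-up from being counted as a policy mistake.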
For a tabletop pushing task, do not start by asking whether Isaac Lab or MuJoCo is faster. Start by asking whether object mass, table friction, contact geometry, camera pose, and action frequency match the task. Once those are written down, speed becomes meaningful because you know what the simulator is accelerating.
Expected output: A simulator report for this section should include the state variables, the observation mapping, the contact parameters, the solver settings, and a perturbation table showing whether the policy remains stable under plausible physics changes.
A simulator is a promise about the next state. The debugging trick is to ask which promise failed: the body model, the contact model, the sensor model, or the solver that glued them together.
Pick a task from your own work. Can you name the three physical parameters that would most change the outcome? If the answer is no, the simulator choice is premature.
Modify Code Fragment 1 so the block starts with a horizontal velocity and loses 10 percent of that velocity every time it touches the floor. Explain which line represents a friction approximation and which line represents a contact approximation.
Current robot-learning systems increasingly combine fast simulation with measured uncertainty about the simulator itself. A strong 2026 research direction is contact-aware validation: estimating how sensitive a learned policy is to friction, compliance, restitution, and solver settings before hardware trials. The open question is not only how to make simulation faster, but how to make simulator error visible enough that training, evaluation, and transfer can reason about it.
A physics simulator is a state-transition model plus a sensor model plus an error budget. If you cannot describe all three, you are not ready to interpret the result.
Section 11.2 turns this physics contract into concrete MuJoCo models written in MJCF or imported from URDF.
This paper explains the simulator design that made MuJoCo important for model-based control and robot learning. It is the best starting point for readers who want the mechanics behind constraints, contact, and control-oriented simulation.
Tedrake, R. (2024). Underactuated Robotics.
This open textbook gives the dynamics and control background needed to reason about simulated bodies and contacts. Readers who want deeper mathematical treatment of Chapter 6 and this section should keep it nearby.
Google DeepMind. "MuJoCo Computation Documentation."
The computation docs describe the pipeline from model data to constraints, sensors, and stepping. They are directly relevant when a reader wants to connect this section's simplified stepper to a production simulator.
Drake Development Team. "Drake Documentation."
Drake provides a rigorous systems view of dynamics, simulation, planning, and control. It is useful for readers who want to see physics simulation embedded inside a larger model-based design workflow.
Open Robotics. "Gazebo Documentation."
Gazebo documentation is relevant when the simulator must connect to robot middleware and sensor plugins. Readers focused on ROS 2 integration should compare these docs with the lower-level physics view in this section.