Part III: Simulation, Tooling, and the Modern Stack

Part Overview

This part covers the simulators, environments, benchmarks, and synthetic-data practices used to build embodied systems today. It connects formal ideas with the tools and labs needed to build working systems.

This part contains five chapters (Chapters 9 to 13). Each chapter includes theory, recipes, practical code, a library shortcut, and exercises.

Why This Part Matters

This part gives the reader a working layer of the embodied AI stack: the simulators, environment interfaces, and benchmarks that experiments run on. Later chapters assume this layer when agents must perceive, plan, act, and recover from mistakes.

Part III Production Spine

Part III turns the foundations from Part II into the tool stack used by modern embodied AI experiments. The through-line is deliberate, one topic per chapter: simulation economics (Chapter 9), environment interfaces (Chapter 10), physics simulators (Chapter 11), benchmark design (Chapter 12), and synthetic variation (Chapter 13).

Part III Evidence Rule

A result belongs in a paper-facing table only when the compared numbers are co-computed in one pass on one configuration. Diagnostic runs can explore freely, but benchmark claims need shared panels, seeds, splits, and metric definitions.
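The rule above can be sketched in a few lines. The snippet below is a minimal illustration, not the book's evaluation harness: `run_episode`, `evaluate_panel`, and the two toy policies are hypothetical names invented for this example, and the "environment" is a stand-in whose conditions depend only on the seed. What it shows is the shape of a co-computed comparison: every policy is scored on the same seed panel, with the same metric, in a single pass.

```python
import random
from statistics import mean


def run_episode(policy, seed):
    """Toy stand-in for one evaluation episode (hypothetical, for illustration).

    The episode's conditions depend only on the seed, so two policies
    given the same seed face exactly the same episode.
    """
    rng = random.Random(seed)
    difficulty = rng.random()
    return 1.0 if policy(difficulty) else 0.0  # metric: success as 0 or 1


def evaluate_panel(policies, seeds):
    """Score every policy on the same seeds with the same metric, in one pass."""
    scores = {name: [] for name in policies}
    for seed in seeds:
        for name, policy in policies.items():
            scores[name].append(run_episode(policy, seed))
    # One metric definition (mean success rate) applied to every row.
    return {name: mean(values) for name, values in scores.items()}


# Two hypothetical policies: each succeeds when the episode is easy enough.
policies = {
    "cautious": lambda difficulty: difficulty < 0.5,
    "bold": lambda difficulty: difficulty < 0.8,
}
seeds = list(range(100))  # the shared evaluation panel

table = evaluate_panel(policies, seeds)
for name, success_rate in table.items():
    print(f"{name}: {success_rate:.2f}")
```

Because both rows come from the same call, on the same seeds, and under the same metric, the numbers are directly comparable. Numbers pasted together from separate runs with different seeds or splits would not meet the rule.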

Chapter 9 explains why simulation changes the cost, safety, and diagnosis profile of embodied learning before hardware trials begin.

  • 9.1 Why real-world learning is slow, costly, and risky
  • 9.2 Simulation as data generator, testbed, and curriculum
  • 9.3 Fidelity: physical, visual, behavioral
  • 9.4 The reality gap as a measurable quantity
  • 9.5 The landscape of benchmark environments

Chapter 10 teaches the Gymnasium and PettingZoo contracts that make observations, actions, rewards, termination, wrappers, and multi-agent turns reproducible.

  • 10.1 Gym is dead; Gymnasium is the standard
  • 10.2 Observation and action spaces
  • 10.3 Reward design and termination
  • 10.4 Vectorized environments; wrappers
  • 10.5 Rendering, logging, and debugging
  • 10.6 Evaluation protocol and seeding
  • 10.7 PettingZoo for multi-agent

Chapter 11 compares MuJoCo, MJX, Isaac Lab, Genesis, Newton, Drake, SAPIEN, and Gazebo by task fit rather than popularity.

  • 11.1 What physics simulators model (bodies, joints, contacts, friction)
  • 11.2 MuJoCo and the MJCF/URDF model formats
  • 11.3 MuJoCo MJX and MuJoCo Warp: massively parallel and differentiable
  • 11.4 NVIDIA Isaac Sim + Isaac Lab; the Isaac Gym -> Isaac Lab migration
  • 11.5 The Newton physics engine and OpenUSD scene interchange
  • 11.6 Genesis and generative multi-physics
  • 11.7 Drake, SAPIEN, ROS 2, and Gazebo
  • 11.8 Choosing a simulator

Chapter 12 shows how task suites, leaderboards, construct-matched metrics, and failure labels turn simulator runs into defensible evidence.

  • 12.1 Why standardized benchmarks matter
  • 12.2 Manipulation: ManiSkill3, robosuite, RoboCasa, robomimic, RLBench
  • 12.3 Lifelong and language-conditioned: LIBERO, CALVIN, Meta-World
  • 12.4 Household and long-horizon: BEHAVIOR-1K / OmniGibson
  • 12.5 Navigation and social: Habitat 3.0, AI2-THOR / ProcTHOR
  • 12.6 Reading a leaderboard without fooling yourself

Chapter 13 treats synthetic variation, domain randomization, real2sim2real loops, and transfer audits as one measurable workflow.

  • 13.1 Why synthetic variation matters
  • 13.2 Visual, physics, sensor, and task randomization
  • 13.3 Curriculum and automatic randomization
  • 13.4 Photoreal rendering and tiled cameras
  • 13.5 real2sim2real and asset/scene reconstruction
  • 13.6 Randomization vs. realism; measuring transfer readiness

What's Next?

Next, Part IV: Reinforcement Learning for Embodied Agents builds on the tool stack assembled in this part.