Part Overview
This part covers manipulation, grasping, tactile learning, locomotion, humanoids, drones, and autonomous driving as embodied AI systems. It connects contact, mobility, vehicle dynamics, whole-body control, and safety cases to tools and labs the reader can actually run.
This part has seven chapters (42 to 48). Each chapter follows the same production contract: mechanism, diagram, code artifact, library shortcut, same-panel evaluation, failure analysis, lab, and bibliography.
Part IX is where embodied AI stops being a policy in a benchmark and becomes a skill in a body. Hands, legs, drones, and vehicles all expose the same lesson: action quality is measured by contact, timing, recoverability, and safety evidence.
Chapter 42 treats robotic manipulation as object-state control under geometry, contact, friction, and uncertainty.
- 42.1 What manipulation is; reaching and pushing
- 42.2 Pick-and-place pipelines
- 42.3 Contact-rich interaction
- 42.4 Perception for manipulation
- 42.5 Learning manipulation policies (IL, RL, VLA)
- 42.6 Failure detection and recovery
- 42.7 Mobile manipulation: base, arm, perception, and recovery
Chapter 43 studies grasping and dexterity as the design of useful contact sets over time.
- 43.1 Grasp synthesis: analytic and learned (Dex-Net lineage)
- 43.2 Parallel-jaw vs. multi-finger hands
- 43.3 In-hand manipulation and reorientation
- 43.4 Dexterous RL with demonstrations
- 43.5 Sim-to-real for dexterity
Chapter 44 develops touch as the missing modality for contact: DIGIT, GelSight, AnySkin, tactile simulation, slip detection, and multi-modal policies.
- 44.1 Why touch matters for contact-rich tasks
- 44.2 Vision-based tactile sensors (GelSight, DIGIT)
- 44.3 Simulating touch (e.g., tactile sim in Isaac)
- 44.4 Visuo-tactile pretraining and policies
- 44.5 Combining vision and touch
Chapter 45 develops wheeled and legged mobility, balance, gait, massively parallel RL, terrain adaptation, energy, and sim-to-real safety.
- 45.1 Wheeled, legged, and hybrid robots
- 45.2 Balance, stability, and gait
- 45.3 Learning locomotion with massively parallel RL
- 45.4 Terrain adaptation, parkour, and rapid motor adaptation
- 45.5 Energy efficiency; sim-to-real and safety in locomotion
Chapter 46 develops humanoids as whole-body embodied agents: platforms, operational-space control, retargeting, teleoperation, foundation models, and human-scale safety.
- 46.1 Why humanoids became the focus (data, morphology, hardware cost)
- 46.2 Platforms: Unitree G1/H1, Figure, Optimus, 1X, electric Atlas, Apptronik
- 46.3 Whole-body and operational-space control
- 46.4 Learning from humans: HumanPlus, OmniH2O/HOVER, motion retargeting
- 46.5 Teleoperation for humanoids
- 46.6 Dual-system humanoid foundation models
- 46.7 Safety for human-scale robots
- 46.8 Advanced humanoid dynamics and contact mechanics
- 46.9 Boston Dynamics-style loco-manipulation research track
Chapter 47 develops aerial embodied AI through flight dynamics, PX4, ROS 2, MAVLink, obstacle avoidance, coverage, coordination, regulation, and simulation.
- 47.1 Why aerial agents are special
- 47.2 Flight dynamics intuition
- 47.3 Perception, navigation, and obstacle avoidance
- 47.4 Coverage and inspection; multi-drone coordination
- 47.5 Safety, regulation, and simulation for aerial agents
- 47.6 Quadrotor dynamics and flight control
- 47.7 Trajectory generation and GPS-denied missions
- 47.8 PX4 to hardware: SITL, HITL, logs, and flight-test evidence
Chapter 48 develops autonomous driving as embodied AI: sensors, fusion, prediction, planning, world models, CARLA, CommonRoad, scenarios, and safety cases.
- 48.1 Driving as perception, prediction, planning, control
- 48.2 Sensors and sensor fusion in AVs
- 48.3 Detection, lane and behavior prediction
- 48.4 Route and local planning
- 48.5 End-to-end and world-model driving
- 48.6 Scenario testing and safety cases
- 48.7 Vehicle kinematics, dynamics, and control
- 48.8 Route, behavior, and scenario-based planning
- 48.9 Closed-loop driving evaluation and safety assurance
Part IX Production Contract
Every chapter in this part should leave the reader with a reproducible artifact, not only a concept summary. The artifact must name the observation, action, metric, tool route, perturbation, and expected failure.
| Family | Primary Evidence | Representative Tools |
|---|---|---|
| Manipulation and grasping | Object success, contact stability, recovery rate | MoveIt 2, Drake, MuJoCo, ManiSkill, robomimic, LeRobot |
| Tactile learning | Slip detection, contact timing, cross-sensor generalization | DIGIT, GelSight, AnySkin, TACTO, Tactile Gym |
| Locomotion and humanoids | Balance margin, fall rate, energy, safety stops | Isaac Lab, MJX, ROS 2, whole-body control tools |
| Drones and driving | Scenario completion, violations, risk margin, intervention count | PX4, MAVLink, CARLA, CommonRoad, scenario runners |
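The contract above asks every artifact to name six items: observation, action, metric, tool route, perturbation, and expected failure. One way to make that requirement checkable is to record the six items in a small structure and flag any that are left blank. The following is a minimal sketch under stated assumptions: the `ArtifactContract` record and the `missing_fields` helper are hypothetical names introduced here for illustration, not part of any tool in the table, and the example values are illustrative rather than taken from a chapter lab.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ArtifactContract:
    """Hypothetical record of the six items a chapter artifact must name."""
    observation: str
    action: str
    metric: str
    tool_route: str
    perturbation: str
    expected_failure: str


def missing_fields(contract: ArtifactContract) -> list[str]:
    """Return the names of contract fields that are empty or whitespace-only."""
    return [
        f.name
        for f in fields(contract)
        if not getattr(contract, f.name).strip()
    ]


# Illustrative values only (assumed for this sketch, not from a chapter lab).
example = ArtifactContract(
    observation="wrist RGB-D image and joint positions",
    action="end-effector pose delta and gripper command",
    metric="object success rate",
    tool_route="ManiSkill",
    perturbation="object pose jitter",
    expected_failure="",  # deliberately left blank to show the check firing
)

print(missing_fields(example))  # prints ['expected_failure']
```

A check like this only confirms that each item was named. Whether the named metric or perturbation is a good one for the task remains a judgment for the chapter's evaluation and failure analysis.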
What's Next?
Next, Part X: Multi-Agent and Human-Centered Embodiment extends the stack built in this part.