Part IX: Manipulation, Locomotion, and Embodied Skills

Part Overview

This part covers manipulation, grasping, tactile learning, locomotion, humanoids, drones, and autonomous driving as embodied AI systems. It connects contact, mobility, vehicle dynamics, whole-body control, and safety cases to tools and labs the reader can actually run.

Chapters: 7 (Chapters 42 to 48). Each chapter follows the same production contract: mechanism, diagram, code artifact, library shortcut, same-panel evaluation, failure analysis, lab, and bibliography.

Why This Part Matters

Part IX is where embodied AI stops being a policy in a benchmark and becomes a skill in a body. Hands, legs, drones, and vehicles all expose the same lesson: action quality is measured by contact, timing, recoverability, and safety evidence.

Chapter 42 treats robotic manipulation as object-state control under geometry, contact, friction, and uncertainty.

  • 42.1 What manipulation is; reaching and pushing
  • 42.2 Pick-and-place pipelines
  • 42.3 Contact-rich interaction
  • 42.4 Perception for manipulation
  • 42.5 Learning manipulation policies (IL, RL, VLA)
  • 42.6 Failure detection and recovery
  • 42.7 Mobile manipulation: base, arm, perception, and recovery

Chapter 43 studies grasping and dexterity as the design of useful contact sets over time.

  • 43.1 Grasp synthesis: analytic and learned (Dex-Net lineage)
  • 43.2 Parallel-jaw vs. multi-finger hands
  • 43.3 In-hand manipulation and reorientation
  • 43.4 Dexterous RL with demonstrations
  • 43.5 Sim-to-real for dexterity

Chapter 44 develops touch as the modality that vision-only systems lack at the moment of contact: DIGIT, GelSight, AnySkin, tactile simulation, slip detection, and multi-modal policies.

  • 44.1 Why touch matters for contact-rich tasks
  • 44.2 Vision-based tactile sensors (GelSight, DIGIT)
  • 44.3 Simulating touch (e.g., tactile sim in Isaac)
  • 44.4 Visuo-tactile pretraining and policies
  • 44.5 Combining vision and touch

Chapter 45 develops wheeled and legged mobility, balance, gait, massively parallel RL, terrain adaptation, energy, and sim-to-real safety.

  • 45.1 Wheeled, legged, and hybrid robots
  • 45.2 Balance, stability, and gait
  • 45.3 Learning locomotion with massively parallel RL
  • 45.4 Terrain adaptation, parkour, and rapid motor adaptation
  • 45.5 Energy efficiency; sim-to-real and safety in locomotion

Chapter 46 develops humanoids as whole-body embodied agents: platforms, operational-space control, retargeting, teleoperation, foundation models, and human-scale safety.

  • 46.1 Why humanoids became the focus (data, morphology, hardware cost)
  • 46.2 Platforms: Unitree G1/H1, Figure, Optimus, 1X, electric Atlas, Apptronik
  • 46.3 Whole-body and operational-space control
  • 46.4 Learning from humans: HumanPlus, OmniH2O/HOVER, motion retargeting
  • 46.5 Teleoperation for humanoids
  • 46.6 Dual-system humanoid foundation models
  • 46.7 Safety for human-scale robots
  • 46.8 Advanced humanoid dynamics and contact mechanics
  • 46.9 Boston Dynamics-style loco-manipulation research track

Chapter 47 develops aerial embodied AI through flight dynamics, PX4, ROS 2, MAVLink, obstacle avoidance, coverage, coordination, regulation, and simulation.

  • 47.1 Why aerial agents are special
  • 47.2 Flight dynamics intuition
  • 47.3 Perception, navigation, and obstacle avoidance
  • 47.4 Coverage and inspection; multi-drone coordination
  • 47.5 Safety, regulation, and simulation for aerial agents
  • 47.6 Quadrotor dynamics and flight control
  • 47.7 Trajectory generation and GPS-denied missions
  • 47.8 PX4 to hardware: SITL, HITL, logs, and flight-test evidence

Chapter 48 develops autonomous driving as embodied AI: sensors, fusion, prediction, planning, world models, CARLA, CommonRoad, scenarios, and safety cases.

  • 48.1 Driving as perception, prediction, planning, control
  • 48.2 Sensors and sensor fusion in AVs
  • 48.3 Detection, lane and behavior prediction
  • 48.4 Route and local planning
  • 48.5 End-to-end and world-model driving
  • 48.6 Scenario testing and safety cases
  • 48.7 Vehicle kinematics, dynamics, and control
  • 48.8 Route, behavior, and scenario-based planning
  • 48.9 Closed-loop driving evaluation and safety assurance

Part IX Production Contract

Every chapter in this part should leave the reader with a reproducible artifact, not only a concept summary. The artifact must name the observation, action, metric, tool route, perturbation, and expected failure.
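One way to make the contract checkable is to record the six named items in a small data structure. The sketch below is a minimal illustration, not a format the chapters prescribe: the class name ArtifactSpec, its field names, the missing_fields helper, and all example values are assumptions introduced here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ArtifactSpec:
    """Hypothetical record of the six items an artifact is asked to name."""
    observation: str
    action: str
    metric: str
    tool_route: str
    perturbation: str
    expected_failure: str

    def missing_fields(self) -> list[str]:
        """Return the names of fields left empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative values only; they are not taken from any chapter.
example = ArtifactSpec(
    observation="RGB-D image plus joint positions",
    action="end-effector pose delta and gripper command",
    metric="object success rate over 50 episodes",
    tool_route="MuJoCo simulation",
    perturbation="object friction scaled by 0.5",
    expected_failure="object slips during lift",
)
print(example.missing_fields())  # prints []
```

A lab write-up could be rejected while missing_fields() returns a non-empty list, which keeps an artifact from shipping without a stated perturbation or expected failure.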

Skill Families And Evidence
Family | Primary Evidence | Representative Tools
Manipulation and grasping | Object success, contact stability, recovery rate | MoveIt 2, Drake, MuJoCo, ManiSkill, robomimic, LeRobot
Tactile learning | Slip detection, contact timing, cross-sensor generalization | DIGIT, GelSight, AnySkin, TACTO, Tactile Gym
Locomotion and humanoids | Balance margin, fall rate, energy, safety stops | Isaac Lab, MJX, ROS 2, whole-body control tools
Drones and driving | Scenario completion, violations, risk margin, intervention count | PX4, MAVLink, CARLA, CommonRoad, scenario runners
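Several of the evidence types in the table above reduce to simple aggregates over episode logs. The sketch below shows one possible way to compute three of them. The Episode record, its field names, and the metric definitions (for example, recovery rate as recovered failures divided by detected failures) are assumptions made for illustration; individual chapters may define these quantities differently.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical per-episode log record; field names are illustrative."""
    success: bool
    failures_detected: int
    failures_recovered: int
    interventions: int


def summarize(episodes: list[Episode]) -> dict[str, float]:
    """Aggregate success rate, recovery rate, and interventions per episode."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    detected = sum(e.failures_detected for e in episodes)
    recovered = sum(e.failures_recovered for e in episodes)
    return {
        "success_rate": sum(e.success for e in episodes) / n,
        # Undefined when no failure was ever detected, so report NaN.
        "recovery_rate": recovered / detected if detected else float("nan"),
        "interventions_per_episode": sum(e.interventions for e in episodes) / n,
    }


# Made-up log data for demonstration only.
logs = [
    Episode(success=True, failures_detected=1, failures_recovered=1, interventions=0),
    Episode(success=False, failures_detected=2, failures_recovered=1, interventions=1),
    Episode(success=True, failures_detected=0, failures_recovered=0, interventions=0),
    Episode(success=True, failures_detected=1, failures_recovered=1, interventions=0),
]
print(summarize(logs))
```

Reporting the recovery rate separately from the success rate keeps a policy that never fails apart from one that fails often but recovers, a distinction a single success number hides.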

What's Next?

After this part, Part X: Multi-Agent and Human-Centered Embodiment extends the stack.