Part IX: Manipulation, Locomotion, and Embodied Skills | Building Embodied AI: From Perception to Autonomous Action

Part Overview

This part covers manipulation, grasping, tactile learning, locomotion, humanoids, drones, and autonomous driving as embodied AI systems. It connects contact, mobility, vehicle dynamics, whole-body control, and safety cases to tools and labs the reader can actually run.

Chapters: 7. Each chapter follows the same production contract: mechanism, diagram, code artifact, library shortcut, same-panel evaluation, failure analysis, lab, and bibliography.

Why This Part Matters

Part IX is where embodied AI stops being a policy in a benchmark and becomes a skill in a body. Hands, legs, drones, and vehicles all expose the same lesson: action quality is measured by contact, timing, recoverability, and safety evidence.

Chapter 42 Robotic Manipulation

This chapter treats robotic manipulation as object-state control under geometry, contact, friction, and uncertainty.

42.1 What manipulation is; reaching and pushing
42.2 Pick-and-place pipelines
42.3 Contact-rich interaction
42.4 Perception for manipulation
42.5 Learning manipulation policies (IL, RL, VLA)
42.6 Failure detection and recovery
42.7 Mobile manipulation: base, arm, perception, and recovery

Chapter 43 Grasping and Dexterous Manipulation

This chapter studies grasping and dexterity as the design of useful contact sets over time.

43.1 Grasp synthesis: analytic and learned (Dex-Net lineage)
43.2 Parallel-jaw vs. multi-finger hands
43.3 In-hand manipulation and reorientation
43.4 Dexterous RL with demonstrations
43.5 Sim-to-real for dexterity

Chapter 44 Tactile and Visuo-Tactile Learning

This chapter develops touch as a missing modality for contact: DIGIT, GelSight, AnySkin, tactile simulation, slip detection, and multi-modal policies.

44.1 Why touch matters for contact-rich tasks
44.2 Vision-based tactile sensors (GelSight, DIGIT)
44.3 Simulating touch (e.g., tactile sim in Isaac)
44.4 Visuo-tactile pretraining and policies
44.5 Combining vision and touch

Chapter 45 Locomotion and Mobility

This chapter develops wheeled and legged mobility, balance, gait, massively parallel RL, terrain adaptation, energy, and sim-to-real safety.

45.1 Wheeled, legged, and hybrid robots
45.2 Balance, stability, and gait
45.3 Learning locomotion with massively parallel RL
45.4 Terrain adaptation, parkour, and rapid motor adaptation
45.5 Energy efficiency; sim-to-real and safety in locomotion

Chapter 46 Humanoid Robots and Whole-Body Control

This chapter develops humanoids as whole-body embodied agents: platforms, operational-space control, retargeting, teleoperation, foundation models, and human-scale safety.

46.1 Why humanoids became the focus (data, morphology, hardware cost)
46.2 Platforms: Unitree G1/H1, Figure, Optimus, 1X, electric Atlas, Apptronik
46.3 Whole-body and operational-space control
46.4 Learning from humans: HumanPlus, OmniH2O/HOVER, motion retargeting
46.5 Teleoperation for humanoids
46.6 Dual-system humanoid foundation models
46.7 Safety for human-scale robots
46.8 Advanced humanoid dynamics and contact mechanics
46.9 Boston Dynamics-style loco-manipulation research track

Chapter 47 Drones and Aerial Embodied AI

This chapter develops aerial embodied AI through flight dynamics, PX4, ROS 2, MAVLink, obstacle avoidance, coverage, coordination, regulation, and simulation.

47.1 Why aerial agents are special
47.2 Flight dynamics intuition
47.3 Perception, navigation, and obstacle avoidance
47.4 Coverage and inspection; multi-drone coordination
47.5 Safety, regulation, and simulation for aerial agents
47.6 Quadrotor dynamics and flight control
47.7 Trajectory generation and GPS-denied missions
47.8 PX4 to hardware: SITL, HITL, logs, and flight-test evidence

Chapter 48 Autonomous Driving as Embodied AI

This chapter develops autonomous driving as embodied AI: sensors, fusion, prediction, planning, world models, CARLA, CommonRoad, scenarios, and safety cases.

48.1 Driving as perception, prediction, planning, control
48.2 Sensors and sensor fusion in AVs
48.3 Detection, lane and behavior prediction
48.4 Route and local planning
48.5 End-to-end and world-model driving
48.6 Scenario testing and safety cases
48.7 Vehicle kinematics, dynamics, and control
48.8 Route, behavior, and scenario-based planning
48.9 Closed-loop driving evaluation and safety assurance

Part IX Production Contract

Every chapter in this part should leave the reader with a reproducible artifact, not only a concept summary. The artifact must name the observation, action, metric, tool route, perturbation, and expected failure.
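One way to make the contract checkable is to record the six named items in a small data structure. The following is a minimal sketch, not the book's own tooling: the class name `ArtifactContract`, its field names, the `missing_fields` helper, and the example values are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ArtifactContract:
    """Hypothetical record of the six items a chapter artifact must name."""

    observation: str
    action: str
    metric: str
    tool_route: str
    perturbation: str
    expected_failure: str

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative values only; they are not taken from any chapter.
example = ArtifactContract(
    observation="wrist RGB-D image and joint positions",
    action="end-effector pose delta and gripper command",
    metric="pick-and-place success rate over a fixed evaluation panel",
    tool_route="MuJoCo simulation",
    perturbation="object pose offset at reset",
    expected_failure="grasp slips during lift",
)
```

Calling `example.missing_fields()` returns an empty list when all six items are filled in, so a lab can refuse to accept an artifact that leaves any of them blank.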

Skill Families and Evidence

Family	Primary Evidence	Representative Tools
Manipulation and grasping	Object success, contact stability, recovery rate	MoveIt 2, Drake, MuJoCo, ManiSkill, robomimic, LeRobot
Tactile learning	Slip detection, contact timing, cross-sensor generalization	DIGIT, GelSight, AnySkin, TACTO, Tactile Gym
Locomotion and humanoids	Balance margin, fall rate, energy, safety stops	Isaac Lab, MJX, ROS 2, whole-body control tools
Drones and driving	Scenario completion, violations, risk margin, intervention count	PX4, MAVLink, CARLA, CommonRoad, scenario runners
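Several of the evidence columns above reduce to simple counts over evaluation episodes. The sketch below shows one possible aggregation under stated assumptions: the `Episode` log format, the `summarize` function, and the metric definitions (recovery rate as recovered failures divided by detected failures, interventions averaged per episode) are hypothetical and may differ from the definitions used in the individual chapters.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical per-episode log entry."""

    succeeded: bool      # task goal reached at episode end
    failure_events: int  # failures detected during the episode
    recoveries: int      # detected failures the policy recovered from
    interventions: int   # human or safety-system takeovers


def summarize(episodes):
    """Aggregate episode logs into success, recovery, and intervention evidence."""
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    failures = sum(e.failure_events for e in episodes)
    recovered = sum(e.recoveries for e in episodes)
    return {
        "success_rate": sum(e.succeeded for e in episodes) / n,
        # Undefined when no failure was ever detected, so report None.
        "recovery_rate": recovered / failures if failures else None,
        "interventions_per_episode": sum(e.interventions for e in episodes) / n,
    }
```

Reporting `None` instead of zero when no failures occurred keeps a run with nothing to recover from distinct from a run that never recovered.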

What's Next?

After this part, Part X: Multi-Agent and Human-Centered Embodiment extends the stack.