Part V: Learning from Demonstration and Robot Data | Building Embodied AI: From Perception to Autonomous Action

Part Overview

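This part covers the segment of the embodied AI stack that learns from demonstrations and recorded robot data: imitation learning, action chunking and diffusion policies, teleoperation and data collection, robot datasets and data scaling laws, offline RL, and skill hierarchies. It connects formal ideas with the tools and labs needed to build working systems.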

This part has six chapters (21 through 26). Each chapter includes theory, recipes, practical code, a library shortcut, and exercises.

Why This Part Matters

Learning from Demonstration and Robot Data gives the reader a working layer of the embodied AI stack. Later chapters assume this layer when agents must perceive, plan, act, and recover from mistakes.

Chapter 21 Imitation Learning

This chapter develops imitation learning as part of the embodied AI stack.

21.1 Why learning from demonstration matters for robots
21.2 Behavior cloning; the distribution-shift problem
21.3 DAgger and dataset aggregation
21.4 Inverse reinforcement learning
21.5 Sources of demonstrations: humans, planners, foundation models

Chapter 22 Action Chunking and Diffusion Policies

This chapter develops action chunking and diffusion policies as part of the embodied AI stack.

22.1 Why single-step prediction fails on real manipulation
22.2 ACT (Action Chunking Transformer) and the cVAE formulation
22.3 ALOHA, ALOHA 2, and Mobile ALOHA
22.4 Diffusion Policy: action generation by denoising
22.5 Flow matching for actions

Chapter 23 Teleoperation and Data Collection

This chapter develops teleoperation and data collection as part of the embodied AI stack.

23.1 Why data is the bottleneck
23.2 Leader-follower teleoperation (ALOHA, GELLO)
23.3 Handheld and in-the-wild collection (UMI)
23.4 Immersive/VR teleoperation (Open-TeleVision)
23.5 Data quality, diversity, and labeling

Chapter 24 Robot Datasets and Data Scaling Laws

This chapter develops robot datasets and data scaling laws as part of the embodied AI stack.

24.1 The major datasets: Open X-Embodiment, DROID, BridgeData V2, RH20T, RoboMIND, AgiBot World
24.2 Dataset structure, embodiment metadata, and licensing
24.3 Cross-embodiment pooling
24.4 Empirical data scaling laws in imitation learning
24.5 Curating and mixing data

Chapter 25 Offline RL and Dataset-Based Robot Learning

This chapter develops offline RL and dataset-based robot learning as part of the embodied AI stack.

25.1 Learning without online interaction
25.2 Distribution shift and extrapolation error
25.3 Conservative methods (CQL, IQL) and their intuition
25.4 Offline-to-online fine-tuning
25.5 Evaluating offline policies rigorously

Chapter 26 Skills, Hierarchy, and Task Decomposition

This chapter develops skills, hierarchy, and task decomposition as part of the embodied AI stack.

26.1 What a skill is; low- vs. high-level actions
26.2 The options framework
26.3 Skill discovery and hierarchical RL
26.4 Language as a high-level controller
26.5 Skill libraries for embodied agents

What's Next?

After this part, Part VI: Embodied Perception extends the stack.