Chapter 29: Localization and Mapping (SLAM)

"An agent becomes interesting at the exact moment perception changes what it dares to do next."

A Patient Embodied AI Agent
Big Picture

Localization and Mapping (SLAM) turns perception into action-ready state. A robot that does not know where it is will turn every good plan into a guess. SLAM is the discipline of making that guess explicit, updateable, and testable.
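
One way to see what "explicit, updateable, and testable" can mean is a one-dimensional position belief stored as a mean and a variance. The sketch below is a minimal illustration under assumed numbers, not this chapter's reference implementation; the function names `predict` and `update` and all noise values are hypothetical.

```python
# Minimal 1D localization sketch: the belief is a mean and a variance.
# All numbers are illustrative assumptions, not measured values.

def predict(mean, var, motion, motion_var):
    """Shift the belief by a commanded motion; uncertainty grows."""
    return mean + motion, var + motion_var

def update(mean, var, measurement, meas_var):
    """Fuse a position measurement; uncertainty shrinks."""
    gain = var / (var + meas_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var

# Explicit: the guess is a pair of numbers, not a hidden assumption.
mean, var = 0.0, 1.0

# Updateable: motion and measurement each revise the guess.
mean, var = predict(mean, var, motion=1.0, motion_var=0.5)
predicted_var = var
mean, var = update(mean, var, measurement=1.2, meas_var=0.5)

# Testable: the variance after a measurement must not exceed the predicted one.
assert var <= predicted_var
print(f"mean={mean:.3f} var={var:.3f}")
```

The same predict-then-update shape carries over to full SLAM, where the state also contains the map and the variance becomes a covariance matrix.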

Remember This Chapter

The durable test is not whether a model looks impressive. The test is whether it improves a robot's next action while leaving a clear evidence trail for debugging.

Chapter Overview

Chapter 29 develops Localization and Mapping (SLAM) as a working piece of the embodied AI stack. It connects visual or spatial evidence to state estimates, action choices, visual servoing loops, timing budgets, and failure labels.

The chapter follows the right-tool rhythm used across the book: build the mechanism once, then move to maintained tools such as OpenCV, Open3D, ROS 2, and Nav2.

Prerequisites

Readers should be comfortable with Python, tensors, coordinate frames, sensor noise, and the perception-action loop. Useful refreshers appear in Chapter 4, Chapter 8, and Chapter 13.

Chapter Roadmap

Tooling Note

This chapter uses the right-tool principle. The teaching baseline exposes units, frames, uncertainty, and logging. The shortcut stack uses maintained tools to handle optimized kernels, visualization, data formats, simulation hooks, and deployment interfaces.

Hands-On Lab: Build A Localization and Mapping (SLAM) Evidence Panel

Duration: about 75 minutes
Difficulty: Intermediate

Objective

Build a small evidence panel that compares a hand-built baseline with a maintained tool workflow for this chapter.

What You'll Practice

  • Writing an observation, action, metric, and perturbation contract.
  • Building one inspectable baseline before using a library shortcut.
  • Logging success, failure labels, latency, and recovery behavior.
  • Explaining which result would change a robot action.

Setup

Use a Python environment with NumPy. Add chapter-specific tools only after the baseline manifest runs.

# Create a small local environment for the chapter lab.
python -m pip install numpy
Code Fragment 29.L1 installs NumPy for the chapter evidence manifest before heavier perception tools are added.

Steps

Step 1: Define The Contract

Write observation, action, metric, and perturbation fields for two sections.
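
A contract of this kind could be written as a small frozen dataclass so that every section fills in the same four fields. This is a sketch under assumptions: the class name `EvidenceContract` and the observation and action descriptions are hypothetical examples, while the metric and perturbation names match the manifest used in Step 2.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class EvidenceContract:
    """One row of the lab contract; field values here are illustrative."""
    section: str
    observation: str   # what the robot senses, with frame and units
    action: str        # what the robot is allowed to command
    metric: str        # how success is scored
    perturbation: str  # what gets disturbed during the test

contracts = [
    EvidenceContract(
        section="29.1",
        observation="2D range scan in the robot base frame (meters)",
        action="velocity command (linear m/s, angular rad/s)",
        metric="closed_loop_success",
        perturbation="occlusion_or_noise",
    ),
    EvidenceContract(
        section="29.2",
        observation="wheel odometry pose estimate (x, y in meters, yaw in radians)",
        action="velocity command (linear m/s, angular rad/s)",
        metric="closed_loop_success",
        perturbation="occlusion_or_noise",
    ),
]

# Plain dictionaries are easier to merge into the manifest later.
rows = [asdict(c) for c in contracts]
print(rows[0])
```

Freezing the dataclass keeps a contract from being edited after results are recorded against it.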

Step 2: Run The Baseline Manifest

Create one comparable row per section, then fill realistic values from the section text.

# Start a Chapter 29 evidence manifest.
# Add one row per section; every row uses the same metric so results stay comparable.
sections = ['29.1', '29.2', '29.3', '29.4', '29.5', '29.6', '29.7']
manifest = [
    {"section": s, "metric": "closed_loop_success", "perturbation": "occlusion_or_noise"}
    for s in sections
]
print(manifest[0])
Code Fragment 29.L2 creates `manifest` rows for Chapter 29, giving each section the same metric and perturbation fields.

Step 3: Add The Library Shortcut

Replace one baseline field with a maintained tool call, while keeping the output schema unchanged.
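
The idea of swapping the implementation while freezing the schema can be sketched with a toy task: estimating the 2D translation between matched points. Here the standard-library `statistics.fmean` stands in for the maintained tool so the example runs without extra installs; in the lab the shortcut would be a call into one of the chapter's tools. The function names and point values are hypothetical.

```python
from statistics import fmean

def baseline_translation(src, dst):
    """Hand-built estimate: average the per-point offsets with explicit loops."""
    n = len(src)
    dx = sum(d[0] - s[0] for s, d in zip(src, dst)) / n
    dy = sum(d[1] - s[1] for s, d in zip(src, dst)) / n
    return {"method": "baseline", "dx": dx, "dy": dy}

def shortcut_translation(src, dst):
    """Same estimate through a library call; the returned keys are unchanged."""
    dx = fmean(d[0] - s[0] for s, d in zip(src, dst))
    dy = fmean(d[1] - s[1] for s, d in zip(src, dst))
    return {"method": "shortcut", "dx": dx, "dy": dy}

# Matched points before and after a known shift of (0.5, -0.2) meters.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(0.5, -0.2), (1.5, -0.2), (0.5, 0.8)]

base = baseline_translation(src, dst)
fast = shortcut_translation(src, dst)

# The schema check is what keeps the two rows comparable in the manifest.
assert base.keys() == fast.keys()
print(base)
print(fast)
```

Because both functions return the same keys, downstream logging and plotting code does not need to know which path produced a row.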

Step 4: Run One Perturbation

Add occlusion, noise, pose drift, map error, or goal ambiguity. Record whether the action changed.
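
A perturbation run of this kind might look like the sketch below, which adds Gaussian noise to a lateral offset estimate and records whether a simple steering rule picks a different action. The steering rule, the deadband, and the noise level are assumptions made for illustration; `choose_action` and `run_perturbation` are hypothetical names.

```python
import random

def choose_action(offset_m, deadband_m=0.05):
    """Pick a steering action from the estimated lateral offset.

    Positive offset means the robot believes it is left of the path.
    """
    if offset_m > deadband_m:
        return "steer_right"
    if offset_m < -deadband_m:
        return "steer_left"
    return "hold_heading"

def run_perturbation(true_offset_m, noise_std_m, n_trials, seed=0):
    """Perturb the offset estimate and log whether the action changed."""
    rng = random.Random(seed)  # fixed seed so a failure can be replayed
    clean_action = choose_action(true_offset_m)
    rows = []
    for trial in range(n_trials):
        noisy_offset = true_offset_m + rng.gauss(0.0, noise_std_m)
        noisy_action = choose_action(noisy_offset)
        rows.append({
            "trial": trial,
            "perturbation": "noise",
            "noise_std_m": noise_std_m,
            "clean_action": clean_action,
            "perturbed_action": noisy_action,
            "action_changed": noisy_action != clean_action,
        })
    return rows

rows = run_perturbation(true_offset_m=0.08, noise_std_m=0.05, n_trials=20)
changed = sum(r["action_changed"] for r in rows)
print(f"{changed} of {len(rows)} trials changed the action")
```

Logging the clean and perturbed actions side by side makes it clear which estimation errors matter for control and which do not.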

Step 5: Write The Postmortem

Explain the strongest result, the most informative failure, and the next diagnostic test.

Expected Output

A table with one row per tested section, one baseline result, one shortcut result, one perturbation label, and one failure label.
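
One possible way to render that table from manifest rows is a small plain-text formatter. The helper `format_table` is hypothetical, and the scores shown are the placeholder values from the chapter's solution listing rather than measured results.

```python
def format_table(rows, columns):
    """Render a list of dicts as an aligned plain-text table."""
    widths = {
        col: max([len(col)] + [len(str(row.get(col, ""))) for row in rows])
        for col in columns
    }

    def fmt(values):
        cells = (str(v).ljust(widths[c]) for c, v in zip(columns, values))
        return "  ".join(cells).rstrip()

    header = fmt(columns)
    rule = "  ".join("-" * widths[c] for c in columns)
    body = [fmt([row.get(c, "") for c in columns]) for row in rows]
    return "\n".join([header, rule] + body)

columns = ["section", "baseline_score", "shortcut_score", "perturbation", "failure_label"]

# Placeholder rows; replace the scores and labels with your own measurements.
rows = [
    {"section": "29.1", "baseline_score": 0.72, "shortcut_score": 0.81,
     "perturbation": "occlusion_or_noise",
     "failure_label": "perception_or_planning_interface"},
    {"section": "29.2", "baseline_score": 0.72, "shortcut_score": 0.81,
     "perturbation": "occlusion_or_noise",
     "failure_label": "perception_or_planning_interface"},
]

table = format_table(rows, columns)
print(table)
```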

Stretch Goals

  • Add a plot of metric versus perturbation strength.
  • Run the same manifest in a Habitat-style simulator or ROS 2 bag replay.
  • Export two failure cases with enough metadata to reproduce them later.
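
For the export goal, a failure case could be serialized as JSON with the fields needed to rerun it. The field set below is an assumption about what "enough metadata" might include (section, perturbation, strength, seed, failure label); the entries themselves are hypothetical examples.

```python
import json

REQUIRED_FIELDS = {"section", "perturbation", "perturbation_strength", "seed", "failure_label"}

def validate(case):
    """Reject a failure case that could not be reproduced from its metadata."""
    missing = REQUIRED_FIELDS - case.keys()
    if missing:
        raise ValueError(f"failure case is missing fields: {sorted(missing)}")
    return case

# Hypothetical entries; real ones come from your perturbation runs.
failure_cases = [
    validate({
        "section": "29.1",
        "perturbation": "noise",
        "perturbation_strength": 0.05,
        "seed": 0,
        "failure_label": "perception_or_planning_interface",
    }),
    validate({
        "section": "29.2",
        "perturbation": "occlusion",
        "perturbation_strength": 0.3,
        "seed": 1,
        "failure_label": "perception_or_planning_interface",
    }),
]

# sort_keys gives stable output, which keeps diffs between exports readable.
payload = json.dumps(failure_cases, indent=2, sort_keys=True)
restored = json.loads(payload)
print(payload)
```

Writing `payload` to a file next to the manifest keeps the failure cases and the scores they explain in one place.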

Complete Solution

# Start a Chapter 29 evidence manifest.
# Add one row per section; every row uses the same metric so results stay comparable.
sections = ['29.1', '29.2', '29.3', '29.4', '29.5', '29.6', '29.7']
manifest = [
    {"section": s, "metric": "closed_loop_success", "perturbation": "occlusion_or_noise"}
    for s in sections
]
print(manifest[0])

# Placeholder values: replace them with the scores and failure labels you measure.
for row in manifest:
    row["baseline_score"] = 0.72
    row["shortcut_score"] = 0.81
    row["failure_label"] = "perception_or_planning_interface"
print(manifest)
Code Fragment 29.L3 fills the manifest with comparable baseline and shortcut fields for a same-panel chapter lab.

Use this chapter as a complete teaching unit: concept, minimal implementation, library shortcut, diagnostic perturbation, and postmortem. The pattern keeps a perception model from being evaluated only in isolation; every result is also tested as part of the agent loop.

Chapter Tool Map

Tool or Library                  Where It Pays Off
OpenCV                           Use when it shortens the path from mechanism to reproducible embodied evidence.
Open3D                           Use when it shortens the path from mechanism to reproducible embodied evidence.
ROS 2                            Use when it shortens the path from mechanism to reproducible embodied evidence.
Nav2                             Use when it shortens the path from mechanism to reproducible embodied evidence.
GTSAM                            Use when it shortens the path from mechanism to reproducible embodied evidence.
ORB-SLAM style pipelines         Use when it shortens the path from mechanism to reproducible embodied evidence.
Gaussian Splatting SLAM systems  Use when it shortens the path from mechanism to reproducible embodied evidence.

Readiness Check

Before leaving the chapter, the reader should be able to state one theory claim, one implementation claim, one evaluation claim, and one realistic failure mode.

Teaching Takeaway

A strong chapter session ends with an artifact: a script, trace, simulator run, data card, map, or reproducible evaluation panel.

What's Next?

Start with Section 29.1: Where am I and what does the world look like. After this chapter, continue to Chapter 30: Navigation and Path Planning.

Bibliography & Further Reading

Foundational Papers, Tools, and References

Durrant-Whyte, H. and Bailey, T. "Simultaneous Localization and Mapping." IEEE Robotics and Automation Magazine, 2006. https://ieeexplore.ieee.org/document/1638022

A classic tutorial framing of SLAM's estimation problem and uncertainty structure.

Mur-Artal, R. and Tardos, J. D. "ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras." IEEE T-RO, 2017. https://arxiv.org/abs/1610.06475

A widely used reference for feature-based visual SLAM pipelines.

Dellaert, F. "Factor Graphs and GTSAM." Project documentation. https://gtsam.org/

A practical reference for factor graphs, pose graphs, and nonlinear smoothing.

ROS 2 Navigation. "Nav2 documentation." Project documentation. https://navigation.ros.org/

The maintained navigation stack that connects maps, localization, planners, controllers, and recovery behaviors.