Section 29.5: SLAM: graph-based and visual SLAM | Building Embodied AI: From Perception to Autonomous Action

"SLAM is memory under cross-examination by geometry."
A Loop Closure That Came Back With Receipts

**Figure 29.5.1**: Graph-based and visual SLAM becomes useful when the visual idea is tied to a state variable, an uncertainty model, and the next robot action.

Big Picture

Graph-based and visual SLAM is the state-estimation half of embodied autonomy. The robot has partial, noisy, time-stamped evidence; it must turn that evidence into a pose, a map, and an uncertainty statement that a planner can actually trust.

Problem First

Frame-by-frame correction is too local when the robot revisits a place after a long loop. A loop closure says that two far-apart timestamps are actually nearby in space, which can bend the entire trajectory into consistency.

Graph SLAM represents poses and landmarks as variables and measurements as factors. Visual SLAM adds feature tracking, keyframes, bundle adjustment, and relocalization. The optimizer minimizes residuals weighted by the inverse of the measurement covariance (the information matrix), so a bad association that is assigned high confidence can pull the whole graph into a convincing but false solution.
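The variables-and-factors view can be made concrete with a deliberately tiny case. The sketch below assumes a robot moving along a line, so each pose is one number and the problem is linear; the function name `solve_pose_graph_1d`, the factor tuple layout, and all measurement values are made up for illustration and do not come from any SLAM library.

```python
import numpy as np

def solve_pose_graph_1d(num_poses, factors, prior_information=1e6):
    """Solve a 1D pose graph by weighted linear least squares.

    Each factor is (i, j, measured_offset, information) and asks that
    x[j] - x[i] equal measured_offset. A strong prior pins x[0] to 0
    so the solution is unique.
    """
    rows, targets, infos = [], [], []
    prior_row = np.zeros(num_poses)
    prior_row[0] = 1.0
    rows.append(prior_row)
    targets.append(0.0)
    infos.append(prior_information)
    for i, j, offset, information in factors:
        row = np.zeros(num_poses)
        row[i], row[j] = -1.0, 1.0
        rows.append(row)
        targets.append(offset)
        infos.append(information)
    # Whitening: scale each row by sqrt(information) so ordinary least
    # squares minimizes the information-weighted squared residuals.
    sqrt_info = np.sqrt(np.array(infos))
    A = np.array(rows) * sqrt_info[:, None]
    b = np.array(targets) * sqrt_info
    poses, *_ = np.linalg.lstsq(A, b, rcond=None)
    return poses

# Out-and-back path: the true poses are 0, 1, 2, 1, 0 (made-up data).
# Odometry overestimates forward motion and underestimates the return.
odometry = [(0, 1, 1.1, 1.0), (1, 2, 1.1, 1.0),
            (2, 3, -0.9, 1.0), (3, 4, -0.9, 1.0)]
loop_closure = [(0, 4, 0.0, 100.0)]  # pose 4 revisits pose 0

dead_reckoning = solve_pose_graph_1d(5, odometry)
with_closure = solve_pose_graph_1d(5, odometry + loop_closure)
print("odometry only:", np.round(dead_reckoning, 3))
print("with closure: ", np.round(with_closure, 3))
```

With odometry alone the final pose drifts to about 0.4. Adding the single closure factor spreads that error across all four odometry edges, which is the one-dimensional version of a loop closure bending the whole trajectory. The same mechanism would distort every pose if the closure were a false match.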

Action Contract

A SLAM result is incomplete without factor definitions, an association policy, loop-closure thresholds, residual plots, an uncertainty summary, the frame in which the map is exported, and the downstream planner that consumes the map.

Formal Model

For graph-based and visual SLAM, the posterior is represented as a factor graph over poses, landmarks, feature tracks, and loop closures. Its deployable form is a graph with residuals, covariances, robust costs, and a record of rejected associations.

$$ x^*=\arg\min_x \sum_k \|r_k(x)\|_{\Omega_k}^{2},\quad \|r\|_{\Omega}^{2}=r^\top\Omega r $$
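One common way to minimize this objective is Gauss-Newton iteration. The following is a sketch, assuming each residual is differentiable with Jacobian $J_k$ and that $\Omega_k$ is the inverse of the measurement covariance $\Sigma_k$. The symbols $J_k$, $\Sigma_k$, $H$, $b$, and $\Delta x$ are introduced here for illustration.

$$ r_k(x+\Delta x)\approx r_k(x)+J_k\,\Delta x,\qquad \Omega_k=\Sigma_k^{-1} $$

$$ H=\sum_k J_k^\top\Omega_k J_k,\qquad b=\sum_k J_k^\top\Omega_k\, r_k(x),\qquad H\,\Delta x=-b $$

Each iteration solves the linear system for $\Delta x$ and applies the update to the current estimate. Because every factor touches only a few variables, $H$ is sparse, which is what makes large pose graphs tractable.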

The evidence terms are odometry factors, visual or scan factors, landmark observations, loop closures, and priors. The estimate becomes trustworthy only when large residuals and suspect closures remain inspectable rather than hidden by a polished trajectory.

Algorithm: Section 29.5 Evidence Loop

1. Select keyframes or pose nodes that summarize the trajectory.
2. Create odometry, landmark, scan-match, visual feature, and loop-closure factors.
3. Reject weak data associations with geometric and appearance checks.
4. Optimize the pose graph, then inspect residual histograms and loop-closure influence.
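The geometric half of the association check in the steps above is often a chi-square gate on the Mahalanobis distance of a candidate closure. The sketch below assumes a planar residual (x, y, heading) and an illustrative covariance; `gate_closure`, `wrap_angle`, and the numbers are hypothetical, and the appearance check is not modeled.

```python
import numpy as np

# 95% chi-square quantile for 3 degrees of freedom (x, y, heading).
CHI2_95_3DOF = 7.815

def wrap_angle(angle):
    """Wrap an angle in radians to [-pi, pi)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi

def gate_closure(residual, covariance, threshold=CHI2_95_3DOF):
    """Return (squared Mahalanobis distance, accepted flag).

    residual is predicted-minus-measured relative pose [dx, dy, dtheta].
    """
    residual = np.array(residual, dtype=float)
    residual[2] = wrap_angle(residual[2])
    d2 = float(residual @ np.linalg.solve(covariance, residual))
    return d2, d2 <= threshold

# Illustrative uncertainty: 5 cm in x and y, 2 degrees in heading.
covariance = np.diag([0.05**2, 0.05**2, np.deg2rad(2.0)**2])

good_d2, good_ok = gate_closure([0.04, -0.03, np.deg2rad(1.0)], covariance)
bad_d2, bad_ok = gate_closure([0.60, 0.10, np.deg2rad(5.0)], covariance)
print(f"plausible candidate: d2={good_d2:.2f} accepted={good_ok}")
print(f"suspect candidate:   d2={bad_d2:.2f} accepted={bad_ok}")
```

A gate like this only helps when the covariance is honest. An overconfident covariance rejects true revisits, and an inflated one lets aliased places through, so the threshold belongs in the replay record alongside the decisions it produced.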

Worked Diagnostic

Code Fragment 1 is the graph-SLAM sanity check: add a small set of factors, inspect residuals before and after optimization, and verify that a false closure would be visible.

```python
# Compare two residuals with different information weights.
# High-confidence loop closures should matter more only when association is correct.
import numpy as np

residuals = np.array([0.20, 1.00])
information = np.array([25.0, 4.0])
weighted_cost = np.sum(information * residuals ** 2)
print(f"weighted_cost={weighted_cost:.2f}")
print(f"largest_term={np.argmax(information * residuals ** 2)}")
```

```
weighted_cost=5.00
largest_term=1
```

Expected output interpretation. The second factor dominates the objective even though its information weight is lower, because its residual is much larger. The output should be read as a loop-closure sanity check: a single bad association can outweigh several small nominal residuals and bend the optimized trajectory unless robust loss or outlier rejection intervenes.

Code Fragment 1: The example computes a tiny weighted least-squares cost. Even though the first factor has higher confidence, the larger residual dominates the objective, which is exactly why outlier loop closures need robust checking.
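A robust loss is one way to limit how much a single large residual can contribute. The sketch below applies a Huber cost to whitened residuals (residual times the square root of its information). It reuses the two factors from Code Fragment 1 and adds a third, larger outlier that is an assumption made for illustration. The tuning constant 1.345 is a conventional Huber choice, not a value taken from this section.

```python
import numpy as np

def huber_cost(whitened, k=1.345):
    """Huber cost scaled to match e**2 in the quadratic region."""
    a = np.abs(whitened)
    return np.where(a <= k, a**2, 2.0 * k * a - k**2)

def huber_weight(whitened, k=1.345):
    """Reweighting factor used in iteratively reweighted least squares."""
    a = np.abs(whitened)
    return np.where(a <= k, 1.0, k / np.maximum(a, 1e-12))

# First two factors match Code Fragment 1; the third is a made-up outlier.
residuals = np.array([0.20, 1.00, 5.00])
information = np.array([25.0, 4.0, 4.0])
whitened = np.sqrt(information) * residuals

quadratic = whitened**2
robust = huber_cost(whitened)
weights = huber_weight(whitened)

for i in range(len(residuals)):
    print(f"factor {i}: quadratic={quadratic[i]:.2f} "
          f"huber={robust[i]:.2f} weight={weights[i]:.3f}")
```

The quadratic cost of the outlier grows with the square of the residual, while the Huber cost grows only linearly beyond the threshold, so its effective weight shrinks as the residual grows. A robust loss reduces the damage from a false closure but does not remove it, which is why the rejected and down-weighted factors should still be logged and inspected.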

Tool Workflow

Library Shortcut

GTSAM and Ceres provide the nonlinear optimization machinery, while ORB-SLAM3, RTAB-Map, Kimera, and related systems package front ends, loop closure, and map maintenance. Using them replaces a hand-written optimizer and visual pipeline with maintained APIs and configuration.

Use the hand-computed factor example in Code Fragment 1 to build intuition for residuals and Jacobians, then use GTSAM, Ceres, ORB-SLAM-style systems, or Kimera for real logs. The hand calculation remains the guard against blind trust in the optimizer.

Failure Mode To Test

Replay with feature dropout, repeated texture, false loop closure, rolling-shutter motion, and delayed transforms as separate perturbations. The failure label should distinguish front-end association errors from back-end optimization and calibration issues.

Practical Example

A warehouse SLAM artifact should include feature tracks or scan matches, factor graph, residual histogram, loop-closure decisions, optimized poses, covariance summary, and planner map export. That record shows whether a beautiful map is geometrically supported.
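One way to keep such a record honest is to give it a fixed schema and check it for gaps before a map is handed to the planner. The sketch below is a hypothetical layout using Python dataclasses. Every class name, field name, path, and value is an illustrative assumption rather than a format used by any particular SLAM system.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class LoopClosureDecision:
    from_keyframe: int
    to_keyframe: int
    accepted: bool
    reason: str

@dataclass
class SlamArtifact:
    log_id: str
    map_frame: str
    feature_track_count: int = 0
    factor_count: int = 0
    residual_histogram: list = field(default_factory=list)
    loop_closures: list = field(default_factory=list)
    optimized_poses: list = field(default_factory=list)
    covariance_summary: dict = field(default_factory=dict)
    planner_map_path: str = ""

    def missing_fields(self):
        """Return names of fields that are still empty or zero."""
        return [name for name, value in asdict(self).items() if not value]

artifact = SlamArtifact(
    log_id="warehouse_run_example",
    map_frame="map",
    feature_track_count=1200,
    factor_count=340,
    residual_histogram=[210, 95, 30, 5],
    loop_closures=[
        LoopClosureDecision(12, 188, True, "passed geometric gate"),
        LoopClosureDecision(40, 203, False, "repeated texture, gate failed"),
    ],
    optimized_poses=[[0.0, 0.0, 0.0], [1.0, 0.1, 0.02]],
    covariance_summary={"max_position_std_m": 0.08},
    planner_map_path="exports/example_map.yaml",
)
incomplete = SlamArtifact(log_id="draft", map_frame="map")

print(json.dumps(asdict(artifact), indent=2)[:200])
print("missing:", incomplete.missing_fields())
```

Recording rejected closures next to accepted ones is the design choice that matters here: it lets a reviewer see what the front end refused, not only what the optimizer was given.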

Research Frontier

The frontier is robust data association at scale: visual place recognition, semantic loop closure, dynamic-scene rejection, and uncertainty-aware graph optimization. Graph SLAM succeeds when it distrusts attractive false matches.

Memory Hook

SLAM is memory under cross-examination by geometry.

Self Check

Can you state the state variables, observation residual, uncertainty representation, replay artifact, and most likely field failure for graph-based and visual SLAM? If any of these is vague, the estimator is not ready for embodied use.

Key Takeaway

Graph-based and visual SLAM is production-ready only when geometry, uncertainty, timing, and action consequences are tested together.

Exercise 29.5.1

Run one loop with a true revisit and one with perceptual aliasing. Report accepted closures, residual change, trajectory jump, map deformation, and the navigation effect of accepting or rejecting the closure.

What's Next?

Continue to Section 29.6: Neural and Gaussian-splat SLAM, where this state-estimation contract becomes the input to the next embodied capability.

Section References

Durrant-Whyte, H. and Bailey, T. "Simultaneous Localization and Mapping." IEEE Robotics and Automation Magazine, 2006. https://ieeexplore.ieee.org/document/1638022

Classic SLAM tutorial that frames the estimation problem and the role of uncertainty.

GTSAM Project. "Factor Graphs and GTSAM." Official documentation. https://gtsam.org/

Primary tool reference for factor graphs, smoothing, pose graphs, and robotics estimation examples.

ROS 2 Navigation Project. "Nav2 documentation." Official documentation. https://navigation.ros.org/

Primary documentation for integrating localization, maps, planners, controllers, behavior trees, and recoveries.