Section 56.2: Spatial, episodic, and semantic memory | Building Embodied AI: From Perception to Autonomous Action

"The map knows where, the episode knows when, and the semantic store keeps insisting it knows why."
A Three-Drawer Memory Cabinet

Figure 56.2A: Different memory types should answer different planning queries.

Big Picture

Spatial, episodic, and semantic memory should not be collapsed into one undifferentiated store. Each memory type has a distinct representation, retrieval objective, and failure mode.

Design Rule

If two memories answer different planning questions, they should not share one scoring rule. Spatial recall, episodic recall, and semantic recall need different freshness tests, confidence estimates, and failure alarms.
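One way to make this rule concrete is to give each memory type its own freshness test. The sketch below is illustrative only: the `MAX_AGE_S` thresholds and the `is_fresh` helper are assumptions chosen for the example, not values prescribed by this section.

```python
# Hypothetical per-type staleness limits in seconds (illustrative values only).
MAX_AGE_S = {
    "spatial": 2.0,            # maps can go stale within a few control cycles
    "episodic": 7 * 86400.0,   # past attempts may stay informative for days
    "semantic": 90 * 86400.0,  # category-level priors change slowly
}


def is_fresh(memory_type: str, age_s: float) -> bool:
    """Return True if a record of this memory type is young enough to trust."""
    if memory_type not in MAX_AGE_S:
        raise ValueError(f"unknown memory type: {memory_type!r}")
    return age_s <= MAX_AGE_S[memory_type]


# The same 60-second-old record fails as spatial evidence but passes as a semantic fact.
print(is_fresh("spatial", 60.0))
print(is_fresh("semantic", 60.0))
```

A shared threshold would force a choice between trusting stale maps and discarding still-valid priors, which is the conflict the design rule is meant to avoid.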

Theory

The three memory families can be written as

$$\mathcal M_{\text{spatial}}=(V,E,\psi), \qquad \mathcal M_{\text{episodic}}=\{(o_{1:T},a_{1:T},r_{1:T},c)\}, \qquad \mathcal M_{\text{semantic}}=\{(k,v,\sigma,\eta)\}.$$

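Spatial memory is often metric or topological; in the graph form above, $V$ and $E$ are the map's nodes and edges. Episodic memory preserves trajectories of observations $o_{1:T}$, actions $a_{1:T}$, and rewards $r_{1:T}$, together with their outcomes. Semantic memory stores abstractions or facts as key-value pairs $(k, v)$ with confidence $\sigma$ and freshness $\eta$.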
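The three tuples could be held as separate record types, as in the minimal sketch below. All class and field names are assumptions made for illustration. In particular, reading $\psi$ as per-node attributes and $c$ as a context label is one plausible interpretation of the notation, not a definition given in this section.

```python
from dataclasses import dataclass, field


@dataclass
class SpatialMemory:
    """Graph-style map (V, E, psi); psi is ASSUMED to hold per-node attributes."""
    nodes: set[str] = field(default_factory=set)
    edges: set[tuple[str, str]] = field(default_factory=set)
    node_attrs: dict[str, dict] = field(default_factory=dict)


@dataclass
class Episode:
    """One trajectory (o_{1:T}, a_{1:T}, r_{1:T}, c); c is ASSUMED to be a context label."""
    observations: list
    actions: list
    rewards: list[float]
    context: str

    def __post_init__(self) -> None:
        # All three sequences must cover the same T steps.
        if not (len(self.observations) == len(self.actions) == len(self.rewards)):
            raise ValueError("observations, actions, and rewards must have equal length")


@dataclass
class SemanticFact:
    """One entry (k, v, sigma, eta): key, value, confidence, freshness."""
    key: str
    value: str
    confidence: float  # sigma, assumed here to lie in [0, 1]
    freshness: float   # eta, assumed here to be a timestamp in seconds


episode = Episode(["drawer closed", "drawer open"], ["reach", "pull"], [0.0, 1.0], "kitchen")
fact = SemanticFact("scissors", "usually in a kitchen drawer", confidence=0.8, freshness=0.0)
print(len(episode.rewards), fact.key)
```

Keeping the types separate means a planner cannot receive an `Episode` where it expected a `SemanticFact` without the mismatch being visible in the code.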

Operationally, these memory types usually sit in different toolchains. Spatial memory may live in `tf2`, occupancy maps, voxel stores, or scene graphs; episodic memory may be written into ROS bags, trajectory databases, or replay buffers; semantic memory may be indexed with FAISS, pgvector, or a knowledge graph. Separating the stores makes provenance and failure analysis tractable.

Three Memory Types, Three Query Families

Memory Type	Typical Representation	Query Example	Failure Mode
Spatial	occupancy map, scene graph, topological map	Where is the charging dock relative to me?	frame drift or map staleness
Episodic	trajectory log, intervention trace, replay buffer	What happened the last time I attempted this grasp?	retrieval of a similar but irrelevant episode
Semantic	key-value memory, embedding index, knowledge graph	Where are scissors usually stored?	overgeneralized or stale fact

These memory types also differ in update cadence. Spatial memory may change every control cycle, episodic memory may be appended after each trajectory, and semantic memory may be updated only after aggregation or human verification. Conflating those cadences causes either excessive churn or stale abstractions.

Worked Example

A household robot searching for scissors may use spatial memory to remember drawer locations, episodic memory to recall where the scissors were last seen, and semantic memory to recall category-level storage priors.

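ALLOWED_MEMORY_TYPES = {"spatial", "episodic", "semantic"}

def validate_memory_queries(payload: dict[str, str]) -> dict[str, str]:
    """Check that every query is routed to a known memory type."""
    if not payload:
        raise ValueError("payload must not be empty")
    for query, memory_type in payload.items():
        if memory_type not in ALLOWED_MEMORY_TYPES:
            raise ValueError(f"unknown memory type {memory_type!r} for query {query!r}")
    return payload

# Each planning query is assigned to exactly one memory type.
memory_queries = {
    "where_is_drawer_3": "spatial",
    "what_happened_last_time_i_opened_this_drawer": "episodic",
    "where_are_scissors_usually_stored": "semantic",
}
print(validate_memory_queries(memory_queries))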

{'where_is_drawer_3': 'spatial', 'what_happened_last_time_i_opened_this_drawer': 'episodic', 'where_are_scissors_usually_stored': 'semantic'}

Code Fragment 56.2.1 maps concrete planning queries to the memory type that should answer them.

The expected output is simple, but the design implication is important. If a team cannot assign each query to exactly one memory type, the memory design usually lacks clear boundaries and later becomes hard to debug.

Library Shortcut

OctoMap, Habitat, Open3D, NetworkX, ROS 2 bags, and FAISS-style retrieval can each support part of this stack, but only when the interface preserves memory type, freshness, coordinate frame, and provenance. The wrong shortcut is one giant vector store that hides whether a returned item was geometric, episodic, or semantic.

Implementation Stack

Use Open3D or SLAM outputs for spatial memory, ROS 2 bag replay for episodic traces, NetworkX for explicit topological or semantic relations, and PyTorch or JAX embeddings only when the retrieved vector still points back to an inspectable record. Habitat can supply controlled scene-memory tasks, while Weights & Biases or TensorBoard logs should record which memory type changed the planner's action.

Algorithm: Choose The Right Memory Type

Classify the target query as geometric, temporal, or conceptual.
Use spatial memory for coordinate, occupancy, and topology questions.
Use episodic memory for trajectory, intervention, and outcome questions.
Use semantic memory for category-level or relational priors.
Store explicit links across memory types rather than flattening them into one blob.
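The classification and routing steps above can be sketched as a small router. The keyword cues are a deliberately crude stand-in for the classification step, and `QUERY_CUES`, `classify_query`, and `route_query` are hypothetical names introduced for this example. The sketch does not cover the final step of linking records across memory types.

```python
# Hypothetical keyword cues for each query family (illustrative, not exhaustive).
QUERY_CUES = {
    "geometric": ("where is", "how far", "path to", "relative to"),
    "temporal": ("last time", "what happened", "previous attempt"),
    "conceptual": ("usually", "typically", "should"),
}

# Each query family is answered by exactly one memory type.
FAMILY_TO_MEMORY = {
    "geometric": "spatial",
    "temporal": "episodic",
    "conceptual": "semantic",
}


def classify_query(query: str) -> str:
    """Label a query as geometric, temporal, or conceptual."""
    text = query.lower()
    # Conceptual cues are checked first so that "where is X usually kept"
    # is treated as a prior rather than as a coordinate lookup.
    for family in ("conceptual", "temporal", "geometric"):
        if any(cue in text for cue in QUERY_CUES[family]):
            return family
    # Refuse to guess: an unclassified query should not silently hit a default store.
    raise ValueError(f"cannot classify query: {query!r}")


def route_query(query: str) -> str:
    """Return the memory type that should answer the query."""
    return FAMILY_TO_MEMORY[classify_query(query)]


for q in (
    "Where is the charging dock relative to me?",
    "What happened the last time I attempted this grasp?",
    "Where are scissors usually stored?",
):
    print(f"{route_query(q):9s} <- {q}")
```

Raising an error on unclassifiable queries is a design choice in this sketch: it surfaces routing gaps during testing instead of hiding them behind a fallback store.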

Common Failure Mode

A semantic prior can silently override current spatial evidence. If the map says the corridor is blocked now, a stored fact that it is "usually open" should not win.
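A minimal sketch of an arbitration rule that guards against this failure, assuming the planner receives an optional current map observation and a semantic prior for the same corridor. The function name, the boolean encoding, and the `max_age_s` threshold are all assumptions made for illustration.

```python
from typing import Optional


def corridor_is_passable(
    spatial_blocked: Optional[bool],
    spatial_age_s: float,
    semantic_usually_open: bool,
    max_age_s: float = 5.0,  # hypothetical staleness limit for map evidence
) -> tuple[bool, str]:
    """Return (passable, source); fresh spatial evidence always beats the prior."""
    if spatial_blocked is not None and spatial_age_s <= max_age_s:
        return (not spatial_blocked, "spatial")
    # No usable observation: fall back to the prior, and report that we did.
    return (semantic_usually_open, "semantic")


# The map says blocked half a second ago; the "usually open" prior must not win.
print(corridor_is_passable(True, 0.5, semantic_usually_open=True))
# No observation is available, so the prior is used and labeled as such.
print(corridor_is_passable(None, 0.0, semantic_usually_open=True))
```

Returning the source alongside the answer lets downstream logging show which memory type actually drove the decision.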

Practical Example

In a warehouse, spatial memory may encode docks and aisles, episodic memory may preserve recent deadlock traces, and semantic memory may preserve higher-level knowledge such as "fragile cartons should avoid sharp turns."

A useful evidence artifact here is a memory-routing ledger that records query text, chosen memory type, retrieved record id, freshness field, and downstream action change. That artifact makes it obvious when the robot answered a geometric question with a semantic prior or reused an old episode without checking whether the scene had changed.
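One possible shape for such a ledger is sketched below. The fields mirror the list in the preceding paragraph; `LedgerEntry`, `to_jsonl`, `stale_action_changes`, the seconds unit for freshness, and the record ids are hypothetical choices made for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LedgerEntry:
    query_text: str
    chosen_memory_type: str
    retrieved_record_id: str
    freshness_s: float    # age of the retrieved record in seconds (assumed unit)
    action_changed: bool  # did the retrieval change the planner's action?


def to_jsonl(entries: list[LedgerEntry]) -> str:
    """Serialize entries as one JSON object per line for later auditing."""
    return "\n".join(json.dumps(asdict(entry)) for entry in entries)


def stale_action_changes(entries: list[LedgerEntry], max_age_s: float) -> list[LedgerEntry]:
    """Entries where an old record still changed the action; worth a manual review."""
    return [e for e in entries if e.action_changed and e.freshness_s > max_age_s]


ledger = [
    LedgerEntry("where is dock 4", "spatial", "map-0007", 0.4, True),
    LedgerEntry("what happened at aisle 12 last time", "episodic", "ep-0311", 86400.0, True),
]
print(to_jsonl(ledger))
print([e.retrieved_record_id for e in stale_action_changes(ledger, max_age_s=3600.0)])
```

The audit helper flags the day-old episode that still changed the planner's action, which is the kind of reuse the ledger is meant to expose.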

Research Frontier

An open problem is learning when to convert repeated episodes into semantic rules, or when to use a semantic prior to guide a new spatial search, without smearing uncertainty across incompatible representations.

Self Check

Can you name one planning query that should fail if answered by the wrong memory type? If not, the representation boundary is still descriptive rather than operational.

Self Check

Can you point to one query that each memory type should answer in your application, and explain what would go wrong if the wrong memory type answered it?

Key Takeaway

Spatial, episodic, and semantic memory are distinct system components because they answer different queries with different risk profiles.

Exercise 56.2.1

For a robot that restocks shelves, list three planning queries and assign each to spatial, episodic, or semantic memory. Justify the assignment in one sentence per query.

What's Next?

Next, continue with Section 56.3, where retrieval is conditioned on the planner's current decision.