Section 4.2: Points, vectors, poses, frames | Building Embodied AI: From Perception to Autonomous Action

"A robot without frames has many coordinates and no agreement."
A Meticulous Mapping Agent

Figure 4.2A: Points, vectors, and poses drawn on a 2D plane, then extended to SE(3). A pose arrow shows both position (where) and orientation (which way), distinguishing free vectors from position vectors.

Big Picture

Points, vectors, poses, and frames are the nouns of robot geometry. A point names a location, a vector names a displacement or direction, a pose gives the position and orientation of one frame relative to another, and a frame tells every subsystem how to interpret the numbers.

Many robotics bugs come from treating all length-three arrays as the same kind of object. A grasp point, a surface normal, a gravity vector, a camera optical axis, and a base position may all fit in a NumPy array. They should not all be transformed the same way.

This section develops the habit of asking two questions for every spatial value: what kind of geometric object is it, and in which frame is it expressed?

Different Objects, Different Rules

Translation changes points, but it should not change free vectors. If a normal vector receives a translation offset, a downstream grasp or contact calculation can fail while the array shape still looks correct.

Theory

A rigid pose can be written as $T=(R,t)$, where $R\in SO(3)$ is a rotation and $t\in\mathbb{R}^3$ is a translation. The key distinction is that a point (a location) and a free vector (a displacement) transform differently between frames:

$$p^A = R_{AB}\,p^B + t_{AB}, \qquad v^A = R_{AB}\,v^B.$$

A free vector transforms by rotation only because a displacement has orientation and scale but no absolute location, so the translation $t_{AB}$ must not act on it. In homogeneous form this is enforced by the trailing coordinate: a point is $[\,p^B;\,1\,]$ and a vector is $[\,v^B;\,0\,]$, so

$$\begin{bmatrix} p^A \\ 1 \end{bmatrix} = \begin{bmatrix} R_{AB} & t_{AB} \\ 0\ 0\ 0 & 1 \end{bmatrix} \begin{bmatrix} p^B \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} v^A \\ 0 \end{bmatrix} = \begin{bmatrix} R_{AB} & t_{AB} \\ 0\ 0\ 0 & 1 \end{bmatrix} \begin{bmatrix} v^B \\ 0 \end{bmatrix}.$$

A frame is a named coordinate system attached to something physical or conceptual: world, map, odom, base_link, camera_optical, wrist, object, or contact_patch. A pose gives the origin and axes of one frame relative to another.

Mechanism

Homogeneous coordinates encode this distinction by giving points a final coordinate of 1 and vectors a final coordinate of 0. Multiplying by a 4 by 4 transform then applies translation to points and not to vectors.
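The mechanism can be checked numerically. The following is a minimal NumPy sketch, not a library API: the helper `make_transform` is hypothetical, and the rotation and translation values are illustrative.

```python
import numpy as np


def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


# Illustrative values: a 90 degree rotation about z plus a nonzero translation.
R_AB = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])
t_AB = np.array([0.30, 0.10, 0.50])
T_AB = make_transform(R_AB, t_AB)

p_B_h = np.array([0.20, 0.00, 0.00, 1.0])  # point: trailing coordinate 1
v_B_h = np.array([1.00, 0.00, 0.00, 0.0])  # free vector: trailing coordinate 0

p_A_h = T_AB @ p_B_h  # rotation and translation are both applied
v_A_h = T_AB @ v_B_h  # the trailing 0 multiplies the translation column away

print("point ", np.round(p_A_h, 3))
print("vector", np.round(v_A_h, 3))
```

The same 4 by 4 matrix is used for both products. Only the trailing coordinate decides whether the translation column contributes.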

Worked Example

A gripper approach point and an approach direction are both expressed in the tool frame. The point should rotate and translate into the base frame. The direction should rotate only.

# Transform a point and a free vector from the tool frame into the base frame.
# The point receives rotation and translation; the vector receives rotation only.
import numpy as np

# Pose of the tool frame expressed in the base frame: 90 degree rotation about z.
R_base_tool = np.array([[0.0, -1.0, 0.0],
                        [1.0,  0.0, 0.0],
                        [0.0,  0.0, 1.0]])
t_base_tool = np.array([0.30, 0.10, 0.50])  # tool origin in base coordinates

p_tool = np.array([0.20, 0.00, 0.00])  # approach point, tool frame
v_tool = np.array([1.00, 0.00, 0.00])  # approach direction, tool frame

p_base = R_base_tool @ p_tool + t_base_tool  # point: rotate, then translate
v_base = R_base_tool @ v_tool                # free vector: rotate only
print("point", np.round(p_base, 3))
print("vector", np.round(v_base, 3))

Expected output: the point is [0.3 0.3 0.5], which includes the base-frame offset, while the vector is [0. 1. 0.], which does not.

Code Fragment 4.2.1 contrasts point transformation with vector transformation so translation is applied only where it belongs.

Library Shortcut

The teaching fragment is about 16 lines. In production, spatialmath.SE3, pinocchio.SE3, and Drake's RigidTransform make poses explicit, while NumPy remains useful for batch point clouds. These tools reduce manual transform code to a few named operations and make type-like intent visible in code reviews.

Builder Recipe

Name whether the quantity is a point, vector, pose, twist, wrench, or covariance.
Name the source frame and destination frame before multiplying anything.
Write one unit test where translation should affect the value and one where it should not.
Use pose objects or transform messages at subsystem boundaries.
Log both the frame name and the numerical value.
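One way to follow the recipe is to carry the object kind and the frame name alongside the numbers. The sketch below is an illustration under stated assumptions, not a library API: the `Point`, `Vector`, and `Transform` classes and the frame names are hypothetical, and the numeric values are reused from the worked example.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class Point:
    """A location, expressed in the named frame."""
    xyz: np.ndarray
    frame: str


@dataclass(frozen=True)
class Vector:
    """A free vector (direction or displacement), expressed in the named frame."""
    xyz: np.ndarray
    frame: str


@dataclass(frozen=True)
class Transform:
    """Maps coordinates expressed in `source` into coordinates in `target`."""
    R: np.ndarray
    t: np.ndarray
    target: str
    source: str

    def apply(self, obj):
        # Refuse to multiply until the source frame matches.
        if obj.frame != self.source:
            raise ValueError(f"expected frame {self.source!r}, got {obj.frame!r}")
        if isinstance(obj, Point):
            return Point(self.R @ obj.xyz + self.t, self.target)  # rotate + translate
        if isinstance(obj, Vector):
            return Vector(self.R @ obj.xyz, self.target)          # rotate only
        raise TypeError(f"unsupported geometric type: {type(obj).__name__}")


T_base_tool = Transform(
    R=np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]),
    t=np.array([0.30, 0.10, 0.50]),
    target="base_link",
    source="tool",
)

p_base = T_base_tool.apply(Point(np.array([0.20, 0.00, 0.00]), "tool"))
n_base = T_base_tool.apply(Vector(np.array([1.00, 0.00, 0.00]), "tool"))

# Log the kind, the frame name, and the value together.
for obj in (p_base, n_base):
    print(f"{type(obj).__name__:6s} frame={obj.frame} value={np.round(obj.xyz, 3)}")
```

With this design, a value in the wrong frame fails loudly at the call site instead of producing a plausible-looking array.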

Common Failure Mode

Adding translation to a normal vector is a quiet bug. It can tilt a grasp score, corrupt a contact normal, or make a controller push in a direction that no longer matches the sensed surface.
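The size of this bug can be measured in a small case. The sketch below assumes an illustrative camera-to-base rotation and translation (not values from any real calibration) and compares a correctly rotated normal with one that was mistakenly transformed as a point.

```python
import numpy as np

# Illustrative camera-to-base transform (assumed values).
R_base_cam = np.array([[0.0, -1.0, 0.0],
                       [1.0,  0.0, 0.0],
                       [0.0,  0.0, 1.0]])
t_base_cam = np.array([0.30, 0.10, 0.50])

n_cam = np.array([0.0, 0.0, 1.0])  # unit surface normal in the camera frame

n_correct = R_base_cam @ n_cam             # rotation only
n_buggy = R_base_cam @ n_cam + t_base_cam  # bug: normal treated as a point

# Angle between the correct and the corrupted direction.
cos_angle = (n_correct @ n_buggy) / (
    np.linalg.norm(n_correct) * np.linalg.norm(n_buggy)
)
angle_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

print("shapes match:", n_correct.shape == n_buggy.shape)
print("correct norm:", round(float(np.linalg.norm(n_correct)), 3))
print("buggy norm:  ", round(float(np.linalg.norm(n_buggy)), 3))
print("direction error (deg):", round(float(angle_deg), 1))
```

Both arrays have shape (3,), so a shape check passes. The corrupted normal is no longer unit length, and in this case its direction is off by more than ten degrees. A unit-norm assertion on every transformed normal is a cheap guard.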

Practical Example

In a bin-picking pipeline, store object centroid as a point in the camera frame, object normal as a vector in the camera frame, grasp pose as a frame relative to the object, and final command pose in the robot base frame. These four objects should have four explicit labels in logs.
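The four labeled objects can be sketched with homogeneous transforms. Everything numeric here is an assumption for illustration: `T_base_camera` stands in for a hand-eye calibration result, `T_camera_object` for a perception output, and `T_object_grasp` for a grasp planner output. The helpers `make_transform` and `rot_z` are hypothetical.

```python
import numpy as np


def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Assumed inputs (illustrative numbers only).
T_base_camera = make_transform(rot_z(90.0), [0.50, 0.00, 0.80])
T_camera_object = make_transform(np.eye(3), [0.10, 0.05, 0.60])
T_object_grasp = make_transform(np.eye(3), [0.00, 0.00, 0.10])

centroid_camera = np.array([0.10, 0.05, 0.60])  # point, camera frame
normal_camera = np.array([0.00, 0.00, -1.00])   # free vector, camera frame

# Point: trailing 1. Free vector: trailing 0.
centroid_base = (T_base_camera @ np.append(centroid_camera, 1.0))[:3]
normal_base = (T_base_camera @ np.append(normal_camera, 0.0))[:3]

# Command pose: chain base<-camera, camera<-object, object<-grasp.
T_base_grasp = T_base_camera @ T_camera_object @ T_object_grasp

# Four objects, four explicit labels.
log = [
    ("centroid", "point", "camera", centroid_camera),
    ("normal", "vector", "camera", normal_camera),
    ("grasp", "pose", "object", T_object_grasp[:3, 3]),
    ("command", "pose", "base", T_base_grasp[:3, 3]),
]
for name, kind, frame, value in log:
    print(f"{name:9s} kind={kind:6s} frame={frame:7s} value={np.round(value, 3)}")
```

For the two poses, the log prints only the translation part to keep the output short. A real log would also record the orientation.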

Mental Model

If every length-three array looks like a coordinate, the robot is reading a spreadsheet with the column headers removed.

Research Frontier

Robot foundation models often encode spatial tokens, keypoints, or object-centric features. The strongest deployed systems still convert those learned representations into frame-aware points, vectors, poses, and covariances before motion execution.

Cross Reference

This distinction feeds directly into Section 4.4 on rigid transforms, Section 5.5 on forward kinematics, and Section 6.2 on dynamics.

Self Check

Given a surface normal, a gripper position, and a wrist pose, can you say which ones should receive translation during a frame change?

Production Pattern

This section sits inside the Part II robotics contract: geometry defines where things are, kinematics defines what motion is possible, dynamics defines what motion costs, control defines how errors are corrected, and sensing defines what the agent can know on time.

Keep points and vectors separate: translations move points, while directions only rotate. The idea has an intuitive role, a formal interface, a runnable check, and a reproducible failure mode, which makes it useful to students, builders, and researchers alike.

Mechanism To Watch

A pose is a typed relationship between frames, not just a vector. Before any transform is trusted, its record should state the parent frame, child frame, units, timestamp, and multiplication order.
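A record of this kind might look like the sketch below. The `FrameTransform` class is hypothetical. It assumes metres for translation and seconds on a shared clock for the timestamp, and taking the older of two stamps on composition is one possible policy, not a standard.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class FrameTransform:
    """Pose of `child` in `parent`; maps child coordinates into parent coordinates.

    Assumed conventions: translation in metres, `stamp` in seconds.
    """
    parent: str
    child: str
    R: np.ndarray
    t: np.ndarray
    stamp: float

    def compose(self, other):
        """Chain parent<-child with child<-grandchild; frame names must line up."""
        if self.child != other.parent:
            raise ValueError(
                f"cannot chain {self.parent}<-{self.child} "
                f"with {other.parent}<-{other.child}"
            )
        return FrameTransform(
            parent=self.parent,
            child=other.child,
            R=self.R @ other.R,
            t=self.R @ other.t + self.t,
            stamp=min(self.stamp, other.stamp),  # assumed policy: keep the older stamp
        )

    def inverse(self):
        """Swap parent and child: R^T and -R^T t."""
        return FrameTransform(self.child, self.parent, self.R.T,
                              -self.R.T @ self.t, self.stamp)


R_z90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
T_base_wrist = FrameTransform("base_link", "wrist", R_z90,
                              np.array([0.40, 0.00, 0.30]), stamp=12.50)
T_wrist_camera = FrameTransform("wrist", "camera_optical", np.eye(3),
                                np.array([0.00, 0.05, 0.02]), stamp=12.48)

T_base_camera = T_base_wrist.compose(T_wrist_camera)
print(T_base_camera.parent, "<-", T_base_camera.child,
      np.round(T_base_camera.t, 3), "stamp", T_base_camera.stamp)

try:
    T_wrist_camera.compose(T_base_wrist)  # wrong multiplication order
except ValueError as err:
    print("rejected:", err)
```

Because the frame names travel with the matrices, a swapped multiplication order is rejected before any numbers are produced.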

Library Choices And Verification Checks

Tool or Library	What It Handles	Verification Check
SciPy Rotation	converts, composes, applies, and inverts 3D rotations in Python	Verify quaternion order, degrees versus radians, and matrix orthogonality.
ROS 2 tf2	maintains time-buffered coordinate-frame relationships for robot systems	Verify parent-child frame names, lookup time, and transform direction.
spatialmath-python	represents rotations and rigid poses as typed SO(3) and SE(3) objects with composition and inversion	Verify the library output against the hand-built baseline on one small case.
Drake	models dynamical systems, multibody plants, optimization, and controllers	Verify scalar type, plant finalization, frame convention, and solver status.
OpenCV calibration	handles camera models, calibration, projection, and vision preprocessing	Verify intrinsics, distortion, image timestamp, and frame-to-camera transform.

Use this recipe when turning points, vectors, poses, and frames into code, a simulator experiment, or a robot diagnostic. The point is not to use every library. The point is to keep the hand-built baseline and the maintained-tool path comparable.

Name every frame with a parent, child, unit convention, and timestamp policy.
Write one hand-checked transform chain and verify identity, inverse, and composition tests.
Run the same transform through ROS 2 tf2 or SciPy Rotation, then compare one point and one direction vector.
Record a frame audit with source sensor, latency, and expected sign convention.
Debug failed behavior by replaying the transform tree before changing policy or controller code.
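The hand-checked part of this recipe can be sketched in NumPy alone. The frame names and numbers below are illustrative assumptions, the helpers are hypothetical, and the expected values were worked out by hand from the two rotations. The same point and direction can then be passed through a maintained library and compared against these results.

```python
import numpy as np


def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def invert_transform(T):
    """Closed-form rigid inverse: rotation R^T, translation -R^T t."""
    R, t = T[:3, :3], T[:3, 3]
    return make_transform(R.T, -R.T @ t)


def is_rotation(R, tol=1e-9):
    """Check orthogonality and determinant +1."""
    return bool(np.allclose(R.T @ R, np.eye(3), atol=tol)
                and np.isclose(np.linalg.det(R), 1.0, atol=tol))


R_z90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
R_x90 = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
T_map_base = make_transform(R_z90, [1.0, 2.0, 0.0])
T_base_camera = make_transform(R_x90, [0.2, 0.0, 0.5])

p_camera = np.array([0.1, 0.2, 0.3, 1.0])  # point: trailing 1
d_camera = np.array([0.0, 0.0, 1.0, 0.0])  # direction: trailing 0

# Hand-computed expectations for the chain map <- base <- camera.
p_map_expected = np.array([1.3, 2.3, 0.7, 1.0])
d_map_expected = np.array([1.0, 0.0, 0.0, 0.0])

T_map_camera = T_map_base @ T_base_camera
p_map = T_map_camera @ p_camera
d_map = T_map_camera @ d_camera

# Identity: the identity transform leaves a point unchanged.
assert np.allclose(make_transform(np.eye(3), np.zeros(3)) @ p_camera, p_camera)
# Inverse: closed form agrees with a generic matrix inverse and cancels T.
assert np.allclose(invert_transform(T_map_camera), np.linalg.inv(T_map_camera))
assert np.allclose(T_map_camera @ invert_transform(T_map_camera), np.eye(4))
# Composition: one composed transform equals two sequential applications.
assert np.allclose(p_map, T_map_base @ (T_base_camera @ p_camera))
# Hand-checked values for one point and one direction.
assert np.allclose(p_map, p_map_expected)
assert np.allclose(d_map, d_map_expected)
assert is_rotation(T_map_camera[:3, :3])
print("all transform checks passed")
```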

Evidence Gate

Compare methods for handling points, vectors, poses, and frames only through one saved artifact that preserves the inputs, outputs, units, timestamps, latency budget, configuration, seed, metric definition, and failure labels relevant to this section. The comparison is meaningful only when the same script evaluates the same panel.

Exercise Extension

Extend the section exercise by adding one perturbation specific to points, vectors, poses, and frames, and one latency or uncertainty check. Save the result in the EvidenceRecord schema, then explain which library output you trust and why.

A point translates, a vector does not, and a pose carries orientation plus origin. Test those distinctions in a tiny case before passing data through tf2, SciPy Rotation, or a simulator.

Section References

Core references for Points, vectors, poses, frames: Modern Robotics; Murray, Li, and Sastry; Siciliano et al.; LaValle; and official documentation for Drake, MuJoCo, Pinocchio, CasADi, python-control, GTSAM, ROS 2, and OpenCV as applicable.

Use these references to check notation, frame conventions, units, solver assumptions, and maintained-library behavior.

Key Takeaway

Points, vectors, poses, and frames are different contracts. Treating them differently is the first step toward reliable robot geometry.

Exercise 4.2.1

Create a small test with one point and one vector in the same frame. Apply a transform with nonzero translation, then assert that only the point changes by the translation term.