Section 7.4: State-space control, LQR | Building Embodied AI: From Perception to Autonomous Action

A Careful Control Loop

Figure 7.4A: LQR state-space control for a cart-pole system. The cost matrix Q penalizes pole angle, R penalizes control effort, and the resulting optimal gain K stabilizes the linearized system.

Big Picture

State-space control, LQR is one lens on Control for AI Practitioners. We study it because an embodied agent needs decisions that survive contact with noisy sensors, delayed effects, and changing environments.

This section develops the technical contract for State-space control, LQR into a usable mental model. First we define the object of study, then we connect it to the agent loop, then we test it with a compact implementation.

The key question in State-space control, LQR is practical: what must the agent know, what can it observe, what action is available, and what evidence shows that the action worked under the stated conditions?

Action Is The Test

A representation earns its place when it changes the measurable action interface. In State-space control, LQR, the reader should keep asking which decision becomes easier, safer, or more reliable.

Theory

For State-space control, LQR, the practical design rule is to make the interface inspectable before optimization begins: inputs, outputs, units, latency, bounds, and failure labels should all be visible in the saved artifact.

State-space control writes the robot near an operating point as $x_{t+1}=Ax_t+Bu_t$, where $x_t$ collects the state variables and $u_t$ collects the commands, both measured as deviations from that operating point. LQR adds a quadratic preference: keep important state errors small while avoiding large commands. The basic formulation penalizes command magnitude only; penalizing how fast a command changes requires augmenting the state with the previous command. The matrices $Q$ and $R$ are therefore not arbitrary knobs. They encode which errors are costly, which actuators are precious, and which units must be normalized before comparison.

LQR Tuning Is Cost Design

Increasing a diagonal entry of $Q$ tells the controller that the corresponding state error is expensive. Increasing a diagonal entry of $R$ tells it that actuator effort is expensive. If position is measured in meters and angle in radians, scale each cost entry by the largest error you are willing to tolerate in that variable; otherwise the choice of units silently sets the trade-off.
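One common starting point for this scaling is Bryson's rule, which sets each diagonal weight to the inverse square of the largest acceptable value of that variable. The sketch below applies it to the four-element cart-pole state used in this section. The helper name `bryson_weights` and every tolerance value are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def bryson_weights(max_acceptable):
    """Diagonal cost matrix with entries 1 / (max acceptable value)^2."""
    limits = np.asarray(max_acceptable, dtype=float)
    if np.any(limits <= 0):
        raise ValueError("each acceptable limit must be positive")
    return np.diag(1.0 / limits**2)

# Assumed tolerances (illustrative only), each in the unit of its variable:
# position 0.5 m, velocity 1.0 m/s, angle 0.1 rad, angular velocity 0.5 rad/s.
Q = bryson_weights([0.5, 1.0, 0.1, 0.5])
# Assumed largest acceptable force command: 10 N.
R = bryson_weights([10.0])

print("Q diagonal:", np.diag(Q))
print("R:", R)
```

After this normalization, each variable sitting exactly at its tolerance contributes a cost of 1, so later tuning adjusts relative priorities instead of compensating for units.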

Mechanism

The mechanism in State-space control, LQR is the contract between representation and action. Name what enters the module, what leaves it, which assumptions make that transformation valid, and which log would reveal a bad handoff.

Worked Example

LQR turns a cost preference into a feedback gain by solving an algebraic Riccati equation (ARE). This example uses the continuous-time form. For $\dot x = Ax + Bu$, the controller minimizes $\int_0^\infty (x^\top Q x + u^\top R u)\,dt$; the optimal law is $u=-Kx$ with $K=R^{-1}B^\top P$, where $P$ is the stabilizing solution of the continuous-time ARE $A^\top P + PA - PBR^{-1}B^\top P + Q = 0$. Code Fragment 7.4.1 stabilizes a cart-pole linearized about the upright equilibrium. The state is $[\text{pos},\text{vel},\text{angle},\text{ang.\,vel}]$; the $Q$ matrix weights the pole angle ten times as heavily as the cart position.

import numpy as np
from scipy.linalg import solve_continuous_are

# Cart-pole linearized about upright. State x = [pos, vel, angle, ang_vel].
# g: gravity (m/s^2), M: cart mass (kg), m: pole mass (kg), l: pole length (m); SI units assumed.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0,              0],
              [0, 0, (m * g) / M,    0],
              [0, 0, 0,              1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)

Q = np.diag([1.0, 1.0, 10.0, 1.0])   # angle error is 10x as costly as position
R = np.array([[0.1]])                  # control effort is cheap

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P
print("LQR gain K =", np.round(K.ravel(), 2))
cl_eig = np.linalg.eigvals(A - B @ K)
print("closed-loop eigenvalues =", np.round(cl_eig, 2))
print("stable:", bool(np.all(cl_eig.real < 0)))

# 5 s rollout from a 0.2 rad tilt, Euler integration.
dt, x = 0.01, np.array([0, 0, 0.2, 0.0])
for _ in range(500):
    u = -K @ x
    x = x + dt * (A @ x + (B @ u).ravel())
print(f"angle 0.20 rad -> {x[2]:+.4f} rad after 5 s (regulated to upright)")

LQR gain K = [-3.16 -5.72 46.03 10.41]
closed-loop eigenvalues = [-8.8 -3.35 -1.24 -1.7 ]
stable: True
angle 0.20 rad -> +0.0010 rad after 5 s (regulated to upright)

Code Fragment 7.4.1: the Riccati solution yields a gain whose largest entry multiplies the pole angle, the state that $Q$ marks as most expensive. All four closed-loop eigenvalues lie in the open left half-plane, and a 0.2 rad tilt is driven to about 0.001 rad within 5 s. The rollout integrates the same linear model used for the design, so it checks the gain against the model, not against the nonlinear plant. The result is valid only near the linearization point: at large angles the linearized $A$ and $B$ no longer describe the cart-pole, and the same gain can fail to stabilize it.
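One inexpensive way to catch a mis-ordered state or a transposed matrix is to substitute $P$ back into the continuous-time ARE, $A^\top P + PA - PBR^{-1}B^\top P + Q = 0$, and inspect the residual. The sketch below repeats the matrices of Code Fragment 7.4.1 so that it runs on its own; the variable names for the checks are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Same cart-pole model and costs as Code Fragment 7.4.1.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)
Q = np.diag([1.0, 1.0, 10.0, 1.0])
R = np.array([[0.1]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)  # equals inv(R) @ B.T @ P without forming the inverse

# ARE residual: P @ B @ K is P B R^{-1} B^T P.
residual = A.T @ P + P @ A - P @ B @ K + Q
residual_norm = float(np.linalg.norm(residual))

# The stabilizing solution should be symmetric and, with Q > 0, positive definite.
p_symmetric = bool(np.allclose(P, P.T))
p_positive = bool(np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0))

print("ARE residual norm:", residual_norm)
print("P symmetric:", p_symmetric, "| P positive definite:", p_positive)
```

A residual near machine precision together with a symmetric, positive definite $P$ suggests the solver and the model agree on conventions. It says nothing about whether the linear model matches the real plant.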

Library Shortcut

The fragment should expose state vector, dynamics matrices, cost matrices, gain, and closed-loop eigenvalues. python-control, Drake, and CasADi scale the design once the state definition is correct.

Practical Recipe

Write the observation, action, and success metric before choosing a model.
Build a baseline that is simple enough to debug by inspection.
Add the library implementation only after the baseline behavior is understood.
Record failures as structured cases: perception error, state error, planning error, control error, or evaluation error.
Run at least one perturbation test before trusting the result.

Common Failure Mode

The common mistake in State-space control, LQR is to celebrate the component score before checking the closed-loop handoff. The failure usually appears at the boundary: stale state, wrong frame, delayed action, saturated actuator, or metric that ignores the real task cost.

Practical Example

A robotics team using State-space control, LQR should log not only final success, but intermediate observations, chosen actions, controller status, and recovery events. The logs reveal whether the method is solving the task or merely passing the easiest episodes.

Memory Hook

When State-space control, LQR feels abstract, ask what would be different in the next frame of video, the next robot state, or the next safety margin.

Research Frontier

For State-space control, LQR, treat frontier claims as hypotheses until they expose enough detail to reproduce the result: data boundary, embodiment, controller interface, evaluation panel, and failure cases.

Self Check

Can you name the observation, state estimate, action, success metric, and most likely failure mode for State-space control, LQR? If not, the system boundary is still too vague.

Production Pattern

State-space control, LQR sits inside the Part II robotics contract: geometry defines where things are, kinematics defines what motion is possible, dynamics defines what motion costs, control defines how errors are corrected, and sensing defines what the agent can know on time.

Connect LQR costs to units and behavior, since the matrix entries are design choices rather than unexplained constants. This makes the section useful to students, builders, and researchers at the same time: the idea has an intuitive role, a formal interface, a runnable check, and a failure mode that can be reproduced.

Mechanism To Watch

For State-space control, LQR, control closes the loop between estimated state and action. Keep reference, measured state, error signal, control law, actuator limits, and safety fallback separate in the evidence record.

Library Choices And Verification Checks

Tool or Library	What It Handles	Verification Check
python-control	analyzes linear systems, transfer functions, state-space models, and feedback loops	Verify units, sample time, poles, stability margin, and reference scaling.
CasADi	builds symbolic expressions with automatic differentiation and passes the resulting optimization problems to numerical solvers	Verify constraints, warm start, solver status, and deadline behavior.
Drake	models dynamical systems, multibody plants, optimization, and controllers	Verify scalar type, plant finalization, frame convention, and solver status.
do-mpc	formulates model predictive control and moving-horizon estimation problems on top of CasADi	Verify constraints, warm start, solver status, and deadline behavior.
ROS 2 control	connects controllers to hardware interfaces through a controller manager that runs a fixed-rate update loop	Verify update rate, interface names, controller state, and agreement with the hand-built baseline on one small case.

Use this recipe when turning State-space control, LQR into code, a simulator experiment, or a robot diagnostic. The point is not to use every library. The point is to keep the hand-built baseline and the maintained-tool path comparable.

Write the control objective, measured state, actuator command, update rate, and saturation policy.
Run a step-response test before adding learning, with overshoot, settling time, and steady-state error logged.
Compare the hand controller with python-control, CasADi, Drake, do-mpc, or ROS 2 control on the same plant model.
Record latency, missed deadlines, saturation events, constraint violations, and recovery actions.
Only compare controllers and policies when they share sensors, action limits, disturbance tests, and safety checks.

Evidence Gate

For State-space control, LQR, compare methods only through one saved artifact that preserves the inputs, outputs, units, timestamps, latency budget, configuration, seed, metric definition, and failure labels relevant to this section. The comparison is meaningful only when the same script evaluates the same panel.

Exercise Extension

Extend the section exercise by adding one perturbation specific to State-space control, LQR and one latency or uncertainty check. Save the result in the EvidenceRecord schema, then explain which library output you trust and why.

A learned policy can hide a poor LQR linearization until the robot leaves the neighborhood where $A$ and $B$ were valid. Check equilibrium choice, state ordering, units, controllability, sample time, saturation, and the closed-loop eigenvalues before scaling training. For this section, first reproduce one tiny state-space case by hand, then rerun it through python-control or Drake. If the two disagree, inspect matrix convention, discrete versus continuous time, and cost scaling before changing the model.
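The controllability item in this checklist can be tested before any Riccati solve. A minimal sketch, assuming the cart-pole matrices from Code Fragment 7.4.1, builds the controllability matrix $[B, AB, A^2B, A^3B]$ and checks its rank. The helper name `controllability_matrix` is an assumption made for illustration, and a rank test of this kind can become numerically fragile for larger or badly scaled systems.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack [B, AB, A^2 B, ..., A^(n-1) B] column-wise."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

# Same cart-pole model as Code Fragment 7.4.1.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)

C = controllability_matrix(A, B)
rank = int(np.linalg.matrix_rank(C))
print("controllability rank:", rank, "of", A.shape[0])
print("controllable:", rank == A.shape[0])
```

If the rank is lower than the state dimension, some direction of the state cannot be influenced by the input, and no choice of $Q$ and $R$ will regulate it.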

Technical Core

State-space control, LQR needs a topic-native core: variables, equations or system contracts, an algorithmic procedure, an expected output, and a failure diagnosis. Figure 7.4.T summarizes the chain this section must preserve when moving from a teaching example to a real embodied system.

Figure 7.4.T: The technical core for State-space control, LQR connects assumptions, model, algorithm, evidence, and failure analysis.

Formal Object

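Discrete LQR assumes $x_{t+1}=Ax_t+Bu_t$ and minimizes $\sum_{t=0}^{\infty}(x_t^\top Qx_t+u_t^\top Ru_t)$. The controller applies $u_t=-Kx_t$ with $K=(R+B^\top PB)^{-1}B^\top PA$, where $P$ solves the discrete-time algebraic Riccati equation $P = Q + A^\top PA - A^\top PB(R+B^\top PB)^{-1}B^\top PA$. Here $A$ and $B$ are the discrete-time matrices at the chosen sample time, not the continuous-time matrices of the worked example. The design assumes that the linear model is valid near the operating point and that the pair $(A,B)$ is controllable, or at least stabilizable, in the directions being regulated.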
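A sketch of the discrete-time design follows, assuming the continuous cart-pole model of Code Fragment 7.4.1, a zero-order hold on the input, and a 10 ms sample time (the sample time is an assumption). The gain uses the standard discrete formula $K=(R+B_d^\top PB_d)^{-1}B_d^\top PA_d$, where $P$ solves the discrete-time ARE. Reusing the same $Q$ and $R$ defines a per-step cost, not the continuous integral, so for a small sample time the gain should be close to, but not identical to, the continuous-time one.

```python
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.signal import cont2discrete

# Continuous-time cart-pole model from Code Fragment 7.4.1.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)

dt = 0.01  # assumed sample time in seconds
# Zero-order-hold discretization; C and D are placeholders required by the call.
Ad, Bd, _, _, _ = cont2discrete((A, B, np.eye(4), np.zeros((4, 1))), dt, method="zoh")

Q = np.diag([1.0, 1.0, 10.0, 1.0])
R = np.array([[0.1]])

P = solve_discrete_are(Ad, Bd, Q, R)
K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)

# Discrete-time stability: every closed-loop eigenvalue must lie inside the unit circle.
eig = np.linalg.eigvals(Ad - Bd @ K)
spectral_radius = float(np.max(np.abs(eig)))
print("discrete LQR gain K =", np.round(K.ravel(), 2))
print("spectral radius =", round(spectral_radius, 4))
print("stable:", spectral_radius < 1.0)
```

Note the changed stability test: continuous-time designs check for negative real parts, while discrete-time designs check that eigenvalue magnitudes are below one. Mixing the two is a common source of disagreement between a hand-built baseline and a library result.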

Controller evaluation loop

Define the reference, measured state, error signal, actuator command, update rate, and saturation policy.
Run a step or disturbance response before adding learning.
Log overshoot, settling time, steady-state error, latency, saturation, and recovery behavior.
Compare PID, LQR, or MPC only under the same plant, sensors, limits, disturbance panel, and metric code.
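The logged quantities in this loop can be computed from a single response trace. The sketch below is one possible definition, not a standard taken from the text: overshoot and the settling band are expressed as fractions of the step size, and the 2% band, the function name `step_metrics`, and the synthetic first-order trace are all assumptions chosen for illustration.

```python
import numpy as np

def step_metrics(t, y, ref, band=0.02):
    """Overshoot, settling time, and steady-state error of one response trace.

    Overshoot and the settling band are fractions of the step size |ref - y[0]|.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    step = ref - y[0]
    if step == 0:
        raise ValueError("trace must start away from the reference")
    progress = (y - y[0]) / step  # 0 at the start, 1 at the reference
    overshoot = max(0.0, float(progress.max() - 1.0))
    outside = np.flatnonzero(np.abs(progress - 1.0) > band)
    if outside.size == 0:
        settling_time = float(t[0])
    elif outside[-1] + 1 < t.size:
        # First sample after the last excursion outside the band.
        settling_time = float(t[outside[-1] + 1])
    else:
        settling_time = float("inf")  # never settled within the trace
    return {
        "overshoot": overshoot,
        "settling_time": settling_time,
        "steady_state_error": float(abs(ref - y[-1])),
    }

# Synthetic first-order response toward a unit reference (illustrative data).
t = np.linspace(0.0, 10.0, 1001)
y = 1.0 - np.exp(-t)
metrics = step_metrics(t, y, ref=1.0)
print(metrics)
```

Because the same function works for a regulation trace (for example, pole angle from 0.2 rad to a reference of 0), it can serve as the shared metric code when PID, LQR, and MPC are compared on the same plant.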

Technical Contract For State-space control, LQR

Contract Field	What To Specify	Why It Matters
State and observation	Variables, units, timestamps, frames, and uncertainty.	Prevents a model score from being mistaken for robot capability.
Action interface	Command type, limits, update rate, and safety fallback.	Makes the learned or planned output executable.
Evidence artifact	Trace, metric, configuration, seed, and failure label.	Allows baseline and library path to be compared in one pass.
Tool path	python-control, CasADi, do-mpc, Drake, ROS 2 control, MuJoCo	Shows the practical library route after the mechanism is understood.

For State-space control, LQR, expected output is a trace where the relevant error decreases, overshoot stays within the design bound, and actuator commands remain within limits under the stated timing budget.

Failure Mode To Test

State-space control, LQR should be stress-tested under delay, integral windup, actuator saturation, unmodeled friction, and reference-frame mismatch before the nominal trace is trusted.
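Two of these stresses, actuator saturation and input delay, can be added to the linear rollout of Code Fragment 7.4.1 with a few lines. The sketch below is a minimal harness under stated assumptions: the force limits, the delay lengths, and the function name `rollout` are illustrative, and the plant is still the linearized model, so a passing run does not certify the nonlinear cart-pole.

```python
from collections import deque

import numpy as np
from scipy.linalg import solve_continuous_are

# Same cart-pole model, costs, and gain design as Code Fragment 7.4.1.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)
Q = np.diag([1.0, 1.0, 10.0, 1.0])
R = np.array([[0.1]])
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

def rollout(u_max, delay_steps, steps=500, dt=0.01, x0=(0.0, 0.0, 0.2, 0.0)):
    """Euler rollout with a clipped command and a fixed input delay."""
    x = np.array(x0, dtype=float)
    pending = deque([0.0] * delay_steps)  # commands still in transit to the actuator
    saturated_steps = 0
    peak_applied = 0.0
    for _ in range(steps):
        u_cmd = float((-K @ x)[0])
        u_sat = float(np.clip(u_cmd, -u_max, u_max))
        saturated_steps += int(u_sat != u_cmd)
        pending.append(u_sat)
        u_applied = pending.popleft()  # with zero delay this is the current command
        peak_applied = max(peak_applied, abs(u_applied))
        x = x + dt * (A @ x + B[:, 0] * u_applied)
    return x, saturated_steps, peak_applied

# Assumed stress panel: (force limit, delay in 10 ms steps).
for u_max, delay in [(1e6, 0), (5.0, 0), (5.0, 2)]:
    x_end, n_sat, peak = rollout(u_max, delay)
    print(f"u_max={u_max:g}, delay={delay} steps -> final angle {x_end[2]:+.4f} rad, "
          f"{n_sat} saturated steps, peak |u| {peak:.2f}")
```

The first row reproduces the nominal trace; the others show how the final angle and the count of saturated steps change as the limits tighten. Logging these counts alongside the nominal result keeps the stress cases comparable across controllers.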

Section References

Core references for State-space control, LQR: Modern Robotics; Murray, Li, and Sastry; Siciliano et al.; LaValle; and official documentation for Drake, MuJoCo, Pinocchio, CasADi, python-control, GTSAM, ROS 2, and OpenCV as applicable.

Use these references to check notation, frame conventions, units, solver assumptions, and maintained-library behavior.

Key Takeaway

State-space control, LQR is useful when it makes the perception-action loop more reliable, not when it merely adds a more impressive model name.

Exercise 7.4.1

Design a method-matched experiment for State-space control, LQR. Specify the environment, observations, actions, metric, one perturbation, and the library output you would compare against the hand-built baseline.