A Careful Control Loop
State-space control, LQR is one lens on Control for AI Practitioners. We study it because an embodied agent needs decisions that survive contact with noisy sensors, delayed effects, and changing environments.
This section develops the technical contract for State-space control, LQR into a usable mental model. First we define the object of study, then we connect it to the agent loop, then we test it with a compact implementation.
The key question in State-space control, LQR is practical: what must the agent know, what can it observe, what action is available, and what evidence shows that the action worked under the stated conditions?
A representation earns its place when it changes the measurable action interface. In State-space control, LQR, the reader should keep asking which decision becomes easier, safer, or more reliable.
Theory
For State-space control, LQR, the practical design rule is to make the interface inspectable before optimization begins: inputs, outputs, units, latency, bounds, and failure labels should all be visible in the saved artifact.
State-space control models the robot near an operating point as $x_{t+1}=Ax_t+Bu_t$, where $x_t$ collects the state variables and $u_t$ collects the commands. LQR adds a quadratic preference: make important state errors small while avoiding commands that are too large. The matrices $Q$ and $R$ are therefore not arbitrary knobs. They encode which errors are costly, which actuators are precious, and which units must be normalized before comparison.
Increasing a diagonal entry of $Q$ tells the controller that the corresponding state error is expensive. Increasing a diagonal entry of $R$ tells it that actuator effort is expensive. If position is measured in meters and angle in radians, scale the costs so the controller is not tuned by units by accident.
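One common starting point for this kind of scaling is Bryson's rule, which sets each diagonal weight to the inverse square of the largest deviation you are willing to accept for that variable. The sketch below assumes that rule. The function name `bryson_weights` and the tolerance values are illustrative assumptions, not values taken from this section.

```python
import numpy as np

def bryson_weights(max_state_dev, max_input_dev):
    """Diagonal Q and R from the largest acceptable deviation of each variable.

    Bryson's rule: weight_i = 1 / (max acceptable deviation_i)^2, so every
    term in the cost is dimensionless and roughly 1 at its tolerance.
    """
    q = 1.0 / np.square(np.asarray(max_state_dev, dtype=float))
    r = 1.0 / np.square(np.asarray(max_input_dev, dtype=float))
    return np.diag(q), np.diag(r)

# Hypothetical tolerances: 0.5 m, 1.0 m/s, 0.1 rad, 1.0 rad/s, and 10 N of force.
Q, R = bryson_weights([0.5, 1.0, 0.1, 1.0], [10.0])

print("Q diagonal:", np.diag(Q))
print("R:", R)
```

With these assumed tolerances the angle weight comes out 25 times the position weight, because the angle tolerance is five times tighter and the rule squares that ratio. Treat the result as a first guess to refine against closed-loop behavior.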
The mechanism in State-space control, LQR is the contract between representation and action. Name what enters the module, what leaves it, which assumptions make that transformation valid, and which log would reveal a bad handoff.
Worked Example
LQR turns a cost preference into a feedback gain by solving an algebraic Riccati equation (ARE). The example uses the continuous-time counterpart of the model above: for $\dot x = Ax + Bu$, the controller minimizes $\int_0^\infty (x^\top Q x + u^\top R u)\,dt$; the optimal law is $u=-Kx$ with $K=R^{-1}B^\top P$, where $P$ solves the continuous-time ARE. Code Fragment 7.4.1 stabilizes a cart-pole linearized about the upright equilibrium. The state is $[\text{pos},\ \text{vel},\ \text{angle},\ \text{ang. vel}]$; the $Q$ matrix makes the pole angle error ten times as expensive as the cart position error.
```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Cart-pole linearized about upright. State x = [pos, vel, angle, ang_vel].
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)

Q = np.diag([1.0, 1.0, 10.0, 1.0])  # angle error is 10x as costly as position
R = np.array([[0.1]])               # control effort is cheap

# Solve the continuous-time ARE and form the gain K = R^{-1} B^T P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P
print("LQR gain K =", np.round(K.ravel(), 2))

# Closed-loop dynamics are x' = (A - B K) x; stable if all real parts are negative.
cl_eig = np.linalg.eigvals(A - B @ K)
print("closed-loop eigenvalues =", np.round(cl_eig, 2))
print("stable:", bool(np.all(cl_eig.real < 0)))

# 5 s rollout from a 0.2 rad tilt, Euler integration.
dt, x = 0.01, np.array([0, 0, 0.2, 0.0])
for _ in range(500):
    u = -K @ x
    x = x + dt * (A @ x + (B @ u).ravel())
print(f"angle 0.20 rad -> {x[2]:+.4f} rad after 5 s (regulated to upright)")
```
The fragment exposes the state vector, dynamics matrices, cost matrices, gain, and closed-loop eigenvalues. Libraries such as python-control, Drake, and CasADi scale the design once the state definition is correct.
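A cheap sanity check on the worked example is to substitute the returned $P$ back into the continuous-time ARE, $A^\top P + PA - PBR^{-1}B^\top P + Q = 0$, and confirm that the residual is near zero. The sketch below repeats the cart-pole matrices from the fragment so that it runs on its own. The names `residual`, `rel_residual`, and `p_eigs` are illustrative.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Same cart-pole model and weights as the worked example.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)
Q = np.diag([1.0, 1.0, 10.0, 1.0])
R = np.array([[0.1]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)  # K = R^{-1} B^T P without forming the inverse

# ARE residual: note that P B R^{-1} B^T P equals P B K.
residual = A.T @ P + P @ A - P @ B @ K + Q
rel_residual = np.linalg.norm(residual) / np.linalg.norm(P)

# For this controllable problem with positive definite Q, P should be
# symmetric positive definite.
p_eigs = np.linalg.eigvalsh((P + P.T) / 2)

print("relative ARE residual:", rel_residual)
print("P positive definite:", bool(np.all(p_eigs > 0)))
```

A large residual or a non-positive eigenvalue of $P$ usually points to a transposed matrix, a wrong sign in $A$ or $B$, or a weight matrix that is not positive semidefinite.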
Practical Recipe
- Write the observation, action, and success metric before choosing a model.
- Build a baseline that is simple enough to debug by inspection.
- Add the library implementation only after the baseline behavior is understood.
- Record failures as structured cases: perception error, state error, planning error, control error, or evaluation error.
- Run at least one perturbation test before trusting the result.
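As one possible perturbation test for the last step of the recipe, the sketch below scales the actuator gain and checks whether the closed loop stays stable. It assumes the cart-pole model and weights from the worked example. The helper `max_real_eig` and the scale values are illustrative assumptions. Continuous-time LQR with a single input is known to tolerate input gain errors above one half of nominal, so these three cases are expected to remain stable; a scale at or below 0.5 carries no such guarantee.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Cart-pole model and weights from the worked example.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)
Q = np.diag([1.0, 1.0, 10.0, 1.0])
R = np.array([[0.1]])
K = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))

def max_real_eig(actuator_scale):
    """Largest real part of the closed-loop eigenvalues when the actuator
    delivers actuator_scale times the commanded force."""
    closed_loop = A - (actuator_scale * B) @ K
    return float(np.max(np.linalg.eigvals(closed_loop).real))

# Hypothetical miscalibrations: 40% weak, nominal, and 2x strong.
for scale in (0.6, 1.0, 2.0):
    worst = max_real_eig(scale)
    print(f"actuator scale {scale:.1f}: max Re(eig) = {worst:+.3f}, "
          f"stable = {worst < 0}")
```

This only probes one kind of mismatch on the linear model. Parameter errors in $A$, delay, and saturation need their own tests.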
The common mistake in State-space control, LQR is to celebrate the component score before checking the closed-loop handoff. The failure usually appears at the boundary: stale state, wrong frame, delayed action, saturated actuator, or metric that ignores the real task cost.
A robotics team using State-space control, LQR should log not only final success, but intermediate observations, chosen actions, controller status, and recovery events. The logs reveal whether the method is solving the task or merely passing the easiest episodes.
When State-space control, LQR feels abstract, ask what would be different in the next frame of video, the next robot state, or the next safety margin.
For State-space control, LQR, treat frontier claims as hypotheses until they expose enough detail to reproduce the result: data boundary, embodiment, controller interface, evaluation panel, and failure cases.
Can you name the observation, state estimate, action, success metric, and most likely failure mode for State-space control, LQR? If not, the system boundary is still too vague.
Production Pattern
State-space control, LQR sits inside the Part II robotics contract: geometry defines where things are, kinematics defines what motion is possible, dynamics defines what motion costs, control defines how errors are corrected, and sensing defines what the agent can know on time.
Connect LQR costs to units and behavior, since the matrix entries are design choices rather than unexplained constants. This makes the section useful to students, builders, and researchers at the same time: the idea has an intuitive role, a formal interface, a runnable check, and a failure mode that can be reproduced.
For State-space control, LQR, control closes the loop between estimated state and action. Keep reference, measured state, error signal, control law, actuator limits, and safety fallback separate in the evidence record.
| Tool or Library | What It Handles | Verification Check |
|---|---|---|
| python-control | analyzes linear systems, transfer functions, state-space models, and feedback loops | Verify units, sample time, poles, stability margin, and reference scaling. |
| CasADi | formulates optimization-based controllers with constraints and horizons | Verify constraints, warm start, solver status, and deadline behavior. |
| Drake | models dynamical systems, multibody plants, optimization, and controllers | Verify scalar type, plant finalization, frame convention, and solver status. |
| do-mpc | formulates optimization-based controllers with constraints and horizons | Verify constraints, warm start, solver status, and deadline behavior. |
| ROS 2 control | runs controllers against hardware interfaces in a fixed-rate update loop | Verify update rate, command and state interfaces, and agreement with the hand-built baseline on one small case. |
Use this recipe when turning State-space control, LQR into code, a simulator experiment, or a robot diagnostic. The point is not to use every library. The point is to keep the hand-built baseline and the maintained-tool path comparable.
- Write the control objective, measured state, actuator command, update rate, and saturation policy.
- Run a step-response test before adding learning, with overshoot, settling time, and steady-state error logged.
- Compare the hand controller with python-control, CasADi, Drake, do-mpc, or ROS 2 control on the same plant model.
- Record latency, missed deadlines, saturation events, constraint violations, and recovery actions.
- Only compare controllers and policies when they share sensors, action limits, disturbance tests, and safety checks.
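The step-response item in the recipe above can be sketched on a very small plant. The example assumes a double integrator (state is position and velocity, input is acceleration), a hard-clip saturation policy, and a 2% settling band. The plant, the limits, and the helper names `step_response` and `step_metrics` are all illustrative assumptions rather than part of this section's robot.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical plant: double integrator, state [position, velocity].
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.diag([10.0, 1.0]), np.array([[1.0]])
K = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))

def step_response(ref=1.0, u_max=2.0, dt=0.01, t_end=15.0):
    """Euler rollout of a position step with a clipped LQR command."""
    x, x_ref = np.zeros(2), np.array([ref, 0.0])
    positions, saturation_events = [], 0
    for _ in range(int(round(t_end / dt))):
        u_raw = (-K @ (x - x_ref)).item()
        u = min(max(u_raw, -u_max), u_max)  # saturation policy: hard clip
        saturation_events += int(u != u_raw)
        x = x + dt * (A @ x + B[:, 0] * u)
        positions.append(x[0])
    return np.array(positions), saturation_events

def step_metrics(positions, ref, dt, band=0.02):
    """Overshoot as a fraction of ref, time of the last sample outside the
    settling band, and the absolute error at the end of the rollout."""
    outside = np.nonzero(np.abs(positions - ref) > band * abs(ref))[0]
    return {
        "overshoot": max(0.0, float(positions.max() - ref) / abs(ref)),
        "settling_time_s": 0.0 if outside.size == 0
                           else float((outside[-1] + 1) * dt),
        "steady_state_error": float(abs(positions[-1] - ref)),
    }

positions, saturation_events = step_response()
metrics = step_metrics(positions, ref=1.0, dt=0.01)
print(metrics, "saturation events:", saturation_events)
```

Logging the saturation count next to the three response metrics makes it visible when a good-looking trace was obtained only because the command limit was never reached.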
For State-space control, LQR, compare methods only through one saved artifact that preserves the inputs, outputs, units, timestamps, latency budget, configuration, seed, metric definition, and failure labels relevant to this section. The comparison is meaningful only when the same script evaluates the same panel.
Extend the section exercise by adding one perturbation specific to State-space control, LQR and one latency or uncertainty check. Save the result in the EvidenceRecord schema, then explain which library output you trust and why.
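The EvidenceRecord schema itself is not shown in this section, so the sketch below is a hypothetical stand-in, not the book's definition. Its field names are assumptions drawn from the items listed above (units, timestamps, latency budget, configuration, seed, metric definition, failure labels), and the class name `EvidenceRecordSketch` is chosen to keep it distinct from the real schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

@dataclass
class EvidenceRecordSketch:
    """Hypothetical stand-in for the EvidenceRecord schema (fields assumed)."""
    method: str                      # e.g. hand-built baseline or library path
    perturbation: str                # which perturbation was applied
    units: dict                      # unit of every logged signal
    config: dict                     # weights, sample time, limits
    seed: int
    latency_budget_s: float
    metric_name: str
    metric_value: Optional[float] = None       # fill in from a real run
    timestamps_s: list = field(default_factory=list)
    failure_labels: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize with sorted keys so two records diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)

record = EvidenceRecordSketch(
    method="hand-built LQR",
    perturbation="actuator scale 0.6",
    units={"pos": "m", "angle": "rad", "force": "N"},
    config={"Q_diag": [1.0, 1.0, 10.0, 1.0], "R": 0.1, "dt_s": 0.01},
    seed=0,
    latency_budget_s=0.01,
    metric_name="final_abs_angle_rad",
)
print(record.to_json())
```

Writing the baseline and the library run into the same record type is what lets one script evaluate both.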
A learned policy can hide a poor LQR linearization until the robot leaves the neighborhood where $A$ and $B$ were valid. Check equilibrium choice, state ordering, units, controllability, sample time, saturation, and the closed-loop eigenvalues before scaling training. For this section, first reproduce one tiny state-space case by hand, then rerun it through python-control or Drake. If the two disagree, inspect matrix convention, discrete versus continuous time, and cost scaling before changing the model.
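A tiny case that can be solved by hand is the double integrator with diagonal weights. Writing $P=\begin{bmatrix}p_1 & p_2\\ p_2 & p_3\end{bmatrix}$ in the continuous-time ARE gives $p_2=\sqrt{q_1 r}$ and $p_3=\sqrt{r(q_2+2p_2)}$, so $K=[p_2,\ p_3]/r$. The sketch below checks controllability and compares that hand result with a solver. It uses SciPy in place of python-control or Drake, and the weight values are assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator: state [position, velocity], input = acceleration.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
q1, q2, r = 10.0, 1.0, 1.0  # assumed weights for the comparison

# Controllability matrix [B, AB] must have full rank (2 for this plant).
ctrb = np.hstack([B, A @ B])
rank = int(np.linalg.matrix_rank(ctrb))

# Hand solution of the continuous-time ARE for this plant.
p2 = np.sqrt(q1 * r)
p3 = np.sqrt(r * (q2 + 2.0 * p2))
K_hand = np.array([[p2 / r, p3 / r]])

# Library solution on the same matrices.
P = solve_continuous_are(A, B, np.diag([q1, q2]), np.array([[r]]))
K_lib = np.linalg.solve(np.array([[r]]), B.T @ P)

cl_eig = np.linalg.eigvals(A - B @ K_lib)
print("controllability rank:", rank)
print("K by hand:", np.round(K_hand.ravel(), 4))
print("K by solver:", np.round(K_lib.ravel(), 4))
print("gains agree:", bool(np.allclose(K_hand, K_lib)))
print("stable:", bool(np.all(cl_eig.real < 0)))
```

If the two gains disagree on a plant this small, the cause is almost always a convention issue such as state ordering or a discrete solver applied to continuous matrices.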
Technical Core
State-space control, LQR needs a topic-native core: variables, equations or system contracts, an algorithmic procedure, an expected output, and a failure diagnosis. Figure 7.4.T summarizes the chain this section must preserve when moving from a teaching example to a real embodied system.
Discrete LQR assumes $x_{t+1}=Ax_t+Bu_t$ and minimizes $\sum_{t=0}^{\infty}(x_t^\top Qx_t+u_t^\top Ru_t)$. The controller applies $u_t=-Kx_t$, where $K$ is chosen from the Riccati solution under the assumption that the linear model is valid near the operating point and the pair $(A,B)$ is controllable in the directions being regulated.
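For the discrete problem the gain follows from the discrete-time ARE solution $P$ as $K=(R+B^\top PB)^{-1}B^\top PA$. The sketch below applies this to a double integrator discretized with a zero-order hold. The plant, the sample time `dt`, and the weights are assumptions chosen to keep the example small.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

dt = 0.05  # assumed sample time in seconds

# Zero-order-hold discretization of a double integrator
# (state: position, velocity; input: acceleration).
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.diag([10.0, 1.0])
R = np.array([[1.0]])

P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Riccati fixed point: P = Q + A^T P A - A^T P B K should hold.
residual = Q + A.T @ P @ A - A.T @ P @ B @ K - P
rel_residual = np.linalg.norm(residual) / np.linalg.norm(P)

# Discrete-time stability: all closed-loop eigenvalues inside the unit circle.
spectral_radius = float(np.max(np.abs(np.linalg.eigvals(A - B @ K))))

print("K =", np.round(K.ravel(), 4))
print("relative Riccati residual:", rel_residual)
print("spectral radius:", round(spectral_radius, 4))
```

The stability test changes with the time convention: continuous designs need eigenvalues with negative real parts, while discrete designs need a spectral radius below one. The same $Q$ and $R$ also do not mean the same thing in the two settings, because the discrete cost sums per step rather than integrating over time.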
- Define the reference, measured state, error signal, actuator command, update rate, and saturation policy.
- Run a step or disturbance response before adding learning.
- Log overshoot, settling time, steady-state error, latency, saturation, and recovery behavior.
- Compare PID, LQR, or MPC only under the same plant, sensors, limits, disturbance panel, and metric code.
| Contract Field | What To Specify | Why It Matters |
|---|---|---|
| State and observation | Variables, units, timestamps, frames, and uncertainty. | Prevents a model score from being mistaken for robot capability. |
| Action interface | Command type, limits, update rate, and safety fallback. | Makes the learned or planned output executable. |
| Evidence artifact | Trace, metric, configuration, seed, and failure label. | Allows baseline and library path to be compared in one pass. |
| Tool path | python-control, CasADi, do-mpc, Drake, ROS 2 control, MuJoCo | Shows the practical library route after the mechanism is understood. |
For State-space control, LQR, expected output is a trace where the relevant error decreases, overshoot stays within the design bound, and actuator commands remain within limits under the stated timing budget.
State-space control, LQR should be stress-tested under delay, integral windup, actuator saturation, unmodeled friction, and reference-frame mismatch before the nominal trace is trusted.
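Two of these stresses, input delay and actuator saturation, can be sketched on the linearized cart-pole from the worked example. The delay lengths, force limit, and angle threshold below are assumptions, and `stress_rollout` is a hypothetical helper. The rollout stops once the angle leaves a region where the linear model is plausible, so the output reports which combinations fail rather than assuming that they pass.

```python
from collections import deque

import numpy as np
from scipy.linalg import solve_continuous_are

# Cart-pole model and weights from the worked example.
g, M, m, l = 9.81, 1.0, 0.1, 0.5
A = np.array([[0, 1, 0, 0],
              [0, 0, (m * g) / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]], dtype=float)
Q = np.diag([1.0, 1.0, 10.0, 1.0])
R = np.array([[0.1]])
K = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))

def stress_rollout(delay_steps=0, u_max=np.inf, dt=0.01, t_end=10.0,
                   angle_limit=1.0):
    """Euler rollout from a 0.2 rad tilt with a delayed, clipped command.

    Returns (final_state, peak_abs_angle, failed). failed is True when the
    angle exceeds angle_limit, where the linear model is no longer credible.
    """
    x = np.array([0.0, 0.0, 0.2, 0.0])
    pending = deque([0.0] * delay_steps)  # commands computed but not yet applied
    peak = abs(float(x[2]))
    for _ in range(int(round(t_end / dt))):
        pending.append((-K @ x).item())
        u = float(np.clip(pending.popleft(), -u_max, u_max))
        x = x + dt * (A @ x + B[:, 0] * u)
        peak = max(peak, abs(float(x[2])))
        if peak > angle_limit:
            return x, peak, True
    return x, peak, False

# Hypothetical stress panel: (delay in steps, force limit in N).
for delay_steps, u_max in [(0, np.inf), (5, np.inf), (0, 5.0), (10, 5.0)]:
    x_final, peak, failed = stress_rollout(delay_steps, u_max)
    print(f"delay={delay_steps:2d} steps, u_max={u_max}: "
          f"peak |angle|={peak:.3f} rad, failed={failed}")
```

Integral windup, unmodeled friction, and frame mismatch are not covered by this sketch and would need a plant model that includes them.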
Section References
Core references for State-space control, LQR: Modern Robotics; Murray, Li, and Sastry; Siciliano et al.; LaValle; and official documentation for Drake, MuJoCo, Pinocchio, CasADi, python-control, GTSAM, ROS 2, and OpenCV as applicable.
Use these references to check notation, frame conventions, units, solver assumptions, and maintained-library behavior.
State-space control, LQR is useful when it makes the perception-action loop more reliable, not when it merely adds a more impressive model name.
Design a method-matched experiment for State-space control, LQR. Specify the environment, observations, actions, metric, one perturbation, and the library output you would compare against the hand-built baseline.