We’re Building Autonomous Systems Without Really Understanding Them

Rigorous thinking about the AI and robots we're actually building

May 24, 2026

A language model hallucinates a safety procedure that doesn’t exist. A robot learns a policy that works flawlessly in simulation but fails catastrophically in the real world. An autonomous vehicle makes a split-second decision that kills someone, and months later, nobody can explain why. A reinforcement learning agent discovers an exploit in its reward function that turns a benign task into something dangerous. These aren’t hypotheticals. They’re happening now.

We’ve made incredible progress on AI. Language models scale in ways nobody predicted. Robotics teams are building systems that can grasp, manipulate, and collaborate in physical space. Computer vision works. Reinforcement learning agents beat humans at games that were supposed to be unsolvable. The technical breakthroughs are real, and they’re coming fast.

But there’s a problem hiding beneath the excitement: we’re deploying these systems faster than we understand them. We’re shipping robots and autonomous agents into the world with safety frameworks borrowed from software engineering, governance structures built for slower technologies, and explanations that don’t hold up under scrutiny. We have breakthroughs in capability, but we’re running blind on consequences.

This Substack exists to contribute to the conversation about closing the gap between what we deploy and what we understand.

We’ll explore what it takes to build safe, interpretable, trustworthy AI systems in the physical world. We’re talking about mechanistic interpretability: digging into what neural networks are actually doing. We’re talking about robotics safety evaluation and verification: how do you test a system before you let it loose? We’re talking about governance frameworks that keep pace with technology instead of chasing it. We’re talking about the gap between what we think our AI systems are doing and what they’re really doing.

We’ll pull from research, industry practice, policy, and real-world failures. We’ll ask hard questions. We’ll avoid both the hype cycle and the doom-saying. We’ll treat AI, robotics, and safety not as separate conversations but as one integrated problem that needs rigorous thinking from people who actually understand the technical depth.

The systems we’re building are too powerful to stay broken. Let’s fix them.

The Closed Loop
