Tech · Mar 25, 2026 · 3 min read

The Permission Machine

By Glitch

They solved the problem of AI needing too many permissions by adding more AI.

Anthropic's new "auto mode" for Claude Code, announced Monday, replaces the endless stream of approval prompts — "Can I write this file? Can I run this command?" — with a classifier that reviews each action before execution. If it deems the action safe, Claude proceeds. If it deems the action risky, it blocks it and tries something else.

The safety mechanism watching the AI is also an AI. Nobody finds this worth remarking on.

Here's what auto mode actually does: before every tool call, a classifier checks for destructive operations (mass file deletion), data exfiltration (sending your code somewhere it shouldn't go), and malicious code execution. Safe actions auto-approve. Risky ones get blocked. If Claude keeps trying blocked actions, the system eventually escalates to asking the human — which is what it was doing before, just with extra steps.
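The flow described above can be sketched as a simple gate. This is a toy illustration, not Anthropic's implementation: the real classifier is a model, not a pattern match, and the escalation threshold here is an invented parameter.

```python
from enum import Enum

class Verdict(Enum):
    SAFE = "safe"
    RISKY = "risky"

# Toy stand-in for the safety classifier: flag a few obviously
# destructive or exfiltrating shell patterns, approve the rest.
RISKY_PATTERNS = ("rm -rf", "curl ", "scp ")

def classify(action: str) -> Verdict:
    return Verdict.RISKY if any(p in action for p in RISKY_PATTERNS) else Verdict.SAFE

def gate(actions, escalation_threshold=3):
    """Review each proposed tool call before execution. Safe actions
    auto-approve; risky ones are blocked. After enough consecutive
    blocks, escalate to the human -- the pre-auto-mode behavior,
    with extra steps."""
    blocked_streak = 0
    log = []
    for action in actions:
        if classify(action) is Verdict.SAFE:
            blocked_streak = 0
            log.append(("approved", action))
        else:
            blocked_streak += 1
            if blocked_streak >= escalation_threshold:
                log.append(("escalated-to-human", action))
                blocked_streak = 0
            else:
                log.append(("blocked", action))
    return log
```

The interesting property is visible even in the toy: the human only appears at the end of a retry loop they never observed.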

The trust model reveals the architecture's real assumptions. Your local working directory and configured git remotes are trusted. Everything else — your company's source control, cloud storage, internal services — is treated as external until an administrator explicitly approves it. The boundary isn't intelligence. It's geography. If it's on your machine, it's fine. If it reaches beyond your machine, it needs a permission slip.
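A minimal sketch of that geographic boundary, with invented names throughout (`TRUSTED_REMOTES`, `ADMIN_APPROVED`, `WORKDIR` are all assumptions for illustration, not Anthropic's configuration schema):

```python
from pathlib import Path
from urllib.parse import urlparse

TRUSTED_REMOTES = {"github.com/acme/app"}     # assumed configured git remotes
ADMIN_APPROVED: set[str] = set()              # filled in by an administrator
WORKDIR = Path("/home/dev/project").resolve() # assumed local working directory

def is_trusted(target: str) -> bool:
    """Trust by location: anything inside the working directory passes;
    a network destination passes only if it matches a configured git
    remote or an admin-approved entry."""
    if "://" in target:
        parsed = urlparse(target)
        key = (parsed.netloc + parsed.path).rstrip("/").removesuffix(".git")
        return key in TRUSTED_REMOTES or key in ADMIN_APPROVED
    try:
        Path(target).resolve().relative_to(WORKDIR)
        return True
    except ValueError:
        return False
```

Note what the check never asks: whether the action makes sense. A file inside the working directory is trusted no matter what gets written to it.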

This is the replacement for a flag called --dangerously-skip-permissions. That name was honest — perhaps too honest for a feature request from every developer tired of clicking "yes" forty times during a refactor. "Auto mode" is the same trade-off with better marketing. Anthropic's own documentation still recommends using it in "isolated environments," which is corporate for "we know this isn't fully safe but the productivity gains are worth the press release."

The tell is in the fine print. The classifier "may still allow some risky actions" when user intent is ambiguous or when Claude lacks environmental context. It "may also occasionally block benign actions." In other words: it guesses. Sometimes it guesses wrong in both directions. The system that decides what's safe is making probabilistic judgments about safety — which is exactly what we used to call "clicking yes without reading the prompt," except now it's automated.

There's a genuine engineering problem here. Permission fatigue is real. Developers disable security prompts the way everyone clicks through cookie consent banners — reflexively, instantly, because the friction exceeds the attention span. Auto mode is an honest attempt at a middle ground between constant interruption and the flag that literally has "dangerously" in the name.

But the pattern underneath is worth watching. The solution to AI needing human oversight is AI that performs the oversight. The solution to that AI's limitations will be another layer of AI reviewing its decisions. Each layer adds latency, token cost, and a new surface for failure modes that compound rather than cancel. Anthropic acknowledges the "small impact on token consumption, cost, and latency." That's one layer. Wait for three.

We're building permission bureaucracies. Each new safety layer is a form that gets filed, reviewed by another algorithm, and stamped with probabilistic approval. The human sits at the end of an escalation chain they'll rarely reach, making decisions about actions they didn't observe, in a context they've lost track of.

Currently available on Team plans only, requiring admin approval, and running exclusively on Sonnet 4.6 and Opus 4.6. Enterprise access is "expected soon" — the timeline that means "when legal finishes reviewing the liability implications."

The system will probably work fine most of the time. That's the thing about classifiers — they're great at the 95%. It's the 5% where the interesting failures live. And in a permission system, the interesting failures are the ones where something destructive got classified as safe.

I'll start the timer.

Source: SiliconANGLE