The jumps in AI's autonomy

13 Mar 2026

If you’ve raised kids, you know: they don’t develop in a smooth upward line. For weeks nothing seems to change — and then overnight they’re walking, or talking in full sentences, or suddenly grasping abstract concepts. Developmental psychologists call these growth spurts: periods where the brain reorganizes itself and a whole new level of capability clicks into place. The in-between stretches aren’t wasted: that’s when the groundwork is laid, but the visible breakthroughs come in bursts.

AI is following the same pattern. Not biologically, of course, but the rhythm is strikingly similar: years of steady research and incremental benchmark improvements that only specialists notice, and then a sudden jump that changes how everyone interacts with the technology. We’ve seen three of these jumps in just over three years, each one fundamentally shifting what AI can do and, more importantly, how much autonomy we’re willing to give it.

Jump 1: The Chat Interface (ChatGPT, Nov 2022)

“Talk to the AI and it talks back.”

What was novel

For the first time, anyone with a browser could have a natural language conversation with an AI. No API keys, no Python scripts, no ML expertise required. You typed a question, it answered. The interface was disarmingly simple — just a chat box — but the capability behind it was unprecedented. It felt like talking to a very knowledgeable colleague who never slept.

What it enabled

Writing: drafting emails, blog posts, cover letters, marketing copy
Learning: explaining concepts at any level, from kindergarten to PhD
Coding help: “how do I do X in Python?” with working code snippets
Translation, summarization, brainstorming — all through conversation

The key skill became prompt engineering: learning how to ask the right question, provide the right context, perhaps add a few worked examples (few-shot prompting), and coax the best answer out of the model.
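To make the few-shot idea concrete, here is a minimal sketch in Python. It only assembles the prompt text and assumes no particular model API; the function name build_few_shot_prompt and the sample reviews are made up for illustration.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, worked examples, then the new input."""
    parts = [instruction.strip(), ""]
    for text, label in examples:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {label}")
        parts.append("")
    # End with the unanswered case so the model continues the pattern.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical example data, purely for illustration.
prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [
        ("Loved it, would buy again.", "positive"),
        ("Broke after two days.", "negative"),
    ],
    "Arrived late but works fine.",
)
print(prompt)
```

The worked examples do the heavy lifting here: instead of describing the output format in words, you show it twice and let the model continue the pattern.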

How it evolved

GPT-4 (Mar 2023): dramatically better reasoning, fewer hallucinations
Multimodal input (GPT-4V, Oct 2023): upload images, ask questions about them
Custom GPTs (Nov 2023): build specialized chatbots with custom instructions and knowledge
Voice mode (GPT-4o, May 2024): real-time spoken conversation
Deep Research (2025): multi-step web research with cited reports
Competition: Claude, Gemini, Llama, Mistral, Perplexity … → the chat interface became the standard

OpenAI pioneered this jump, and the first-mover advantage was massive. ChatGPT reached an estimated 100 million users in two months, a record for a consumer app at the time. Every major tech company scrambled to respond: Google rushed out Bard (later rebranded to Gemini), Meta open-sourced Llama, Anthropic launched Claude, Mistral emerged in Europe. But they were all playing catch-up to OpenAI’s paradigm. The chat interface wasn’t just a product: it became the default way humans interact with AI, and everyone else had to build their version of it.

But no matter how smart the model got, the paradigm remained the same: human asks, AI answers, one turn at a time. The AI had no memory between sessions, no access to your files, no ability to take action. It was a brilliant oracle trapped in a text box.

Jump 2: The Agentic Coder (Claude Code, Feb 2025)

“AI works alongside you.”

What was novel

The AI moved from the browser to the terminal — and got hands. Claude Code could read your files, understand your codebase, run commands, edit code, and execute multi-step plans. Instead of copy-pasting code snippets from a chat window, you could say “fix the bug in the authentication module” and watch it read the relevant files, reason about the problem, and apply the fix directly. The AI became a collaborator, not just an advisor.

What it enabled

Codebase-aware development: the AI understood your project structure, not just isolated snippets
Multi-file edits: refactoring across dozens of files in one go
Test-driven workflows: write code, run tests, fix failures — in a loop
DevOps tasks: git operations, deployments, CI/CD debugging
Documentation: generating docs from actual code, not hallucinated APIs

The key concept shifted from prompt engineering to context management: CLAUDE.md files to give the AI project knowledge, well-structured codebases so the AI could navigate them, and clear task descriptions.

How it evolved

Early Claude Code (Feb 2025): file access, terminal commands, basic tool use
Subagents (mid 2025): delegate subtasks to specialized parallel agents (Explore, Plan, test runners)
MCP (Model Context Protocol): connect to external services — databases, APIs, Slack, GitHub — through a standard protocol
Skills: reusable prompt-based capabilities (commit workflows, SEO audits, PR creation)
Memory: persistent cross-session recall of user preferences, project context, and past decisions
AGENTS.md: define specialized agent behaviors and configurations
Hooks: trigger shell commands on tool calls for custom workflows

Anthropic pioneered this jump with Claude Code, and again the first-mover advantage proved decisive. By defining the terminal-based agentic coding paradigm (complete with CLAUDE.md project context, MCP integrations, and subagent architecture), Anthropic set the template everyone else adopted. Claude Code’s creator, Boris Cherny, became something of a trailblazer and poster child for agentic software development.

GitHub Copilot then added its own CLI agent mode, OpenAI released Codex CLI, Google launched Gemini CLI, and a wave of startups and established vendors (Cursor, Windsurf, Aider, JetBrains, Google Antigravity) built their own coding agents, most of them IDE-integrated. But they all converged on the pattern Anthropic established: give the AI file access, terminal access, and let it work in a loop until the task is done.
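That loop is simple enough to sketch in a few lines of Python. This is a toy illustration, not how Claude Code or any other product is actually implemented: the "model" is a hard-coded function (scripted_model), the "file system" is a dictionary, and the tool names are invented for the example.

```python
# In-memory stand-in for a project on disk (hypothetical content).
files = {"auth.py": "def check(pw): return pw == 'secret '"}


def read_file(path):
    return files[path]


def write_fix(path):
    # Pretend the agent worked out the fix: remove the stray trailing space.
    files[path] = files[path].replace("'secret '", "'secret'")
    return "patched"


def run_tests(_):
    return "pass" if "'secret'" in files["auth.py"] else "fail"


TOOLS = {"read_file": read_file, "write_fix": write_fix, "run_tests": run_tests}


def scripted_model(history):
    """Stand-in for a real model call: choose the next action from the last result."""
    last_tool, last_result = history[-1]
    if last_tool == "task":
        return {"tool": "read_file", "arg": "auth.py"}
    if last_tool == "read_file":
        return {"tool": "write_fix", "arg": "auth.py"}
    if last_tool == "write_fix":
        return {"tool": "run_tests", "arg": ""}
    if last_tool == "run_tests" and last_result == "pass":
        return {"tool": "done", "arg": "bug fixed, tests pass"}
    return {"tool": "write_fix", "arg": "auth.py"}


def run_agent(task, propose_action, tools, max_steps=10):
    """Ask for an action, run the tool, feed the result back, repeat until done."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = propose_action(history)
        if action["tool"] == "done":
            return action["arg"], history
        result = tools[action["tool"]](action["arg"])
        history.append((action["tool"], result))
    return "step limit reached", history


summary, history = run_agent("fix the bug in the authentication module",
                             scripted_model, TOOLS)
print(summary)
```

The max_steps cap is the important design detail: a loop that runs "until the task is done" needs a hard stop for the cases where the model never decides it is done.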

The terminal agent went from “AI that can edit files” to a full development environment with its own ecosystem. But it still needed a human at the keyboard, approving actions and steering the work.

Jump 3: The Autonomous Agent (OpenClaw, 2026)

“AI works independently.”

What was novel

The AI no longer waits for your next message. OpenClaw introduced persistent, autonomous agents that maintain their own goals, communicate through channels, and work continuously. They have short-term memory (current task context) and long-term memory (learned patterns, preferences, past decisions). They can even “dream” — processing and consolidating information in the background. The shift: from AI as a tool you wield, to AI as a teammate you delegate to.

What it enabled

App creation from description: describe what you want, the agent builds it end-to-end — architecture, code, tests, deployment
Continuous monitoring: agents that watch your systems, detect issues, and fix them without waking you up
Multi-agent collaboration: agents that specialize (frontend, backend, testing, DevOps) and coordinate through messaging channels
Self-improving workflows: agents that learn from past failures and adapt their approach

The key concept shifted again, from context management to goal setting and oversight: defining clear objectives, setting boundaries, and reviewing output rather than directing each step.

How it’s evolving

Channel-based communication: agents talk to each other (and to you) through structured messaging, like a team Slack
Memory hierarchies: working memory, episodic memory, and semantic memory — mirroring how humans organize knowledge
Dreaming/consolidation: background processing to extract patterns and improve future performance
Trust calibration: agents learn when to act autonomously vs. when to ask for human approval
Composability: chain specialized agents into complex workflows that would take a human team days

OpenClaw pioneered this jump, and the pattern is repeating. By open-sourcing the autonomous agent framework — with its channel-based communication, memory hierarchies, and dreaming capabilities — OpenClaw defined what a “real” autonomous agent looks like. Now Anthropic, OpenAI, Google, Microsoft and even Nvidia are all building their own persistent agent platforms, but they’re converging on OpenClaw’s architecture: long-running agents with structured communication, layered memory, and trust calibration. The first mover doesn’t just get a head start — they get to define the vocabulary, the expectations, and the mental model that everyone else inherits.

We’re still in the early days of this jump. The agents are capable but sometimes overeager, occasionally confused, and still learning when to ask for help. Sound familiar? It’s exactly how a promising junior developer behaves — which is why the analogy of “AI growing up” feels so apt.

Jump 4: ???

If the pattern holds, the next jump won’t be a faster version of what we have — it’ll be a qualitative shift that makes today’s autonomous agents look as quaint as ChatGPT looks to us now. Here are some candidates:

The AI Organization

Instead of individual agents that collaborate, the AI becomes a self-organizing entity — spinning up and retiring its own specialized sub-agents as needed, managing its own resources, and operating more like a company than a tool. You don’t assign tasks to agents; you define a mission, and the AI figures out the org chart. Think: “grow my SaaS to 10K users” and the AI assembles its own marketing team, engineering team, and support team — all synthetic.

I interviewed a guy who gave his OpenClaw an X account, a Stripe account, and a bank account. He told it to build a million-dollar business with zero human employees. It made $300K+ in a month.

The Embodied Agent

AI breaks out of the screen entirely. Robotics has been advancing on a parallel track, and the jump happens when autonomous agents get physical presence: navigating the real world, manipulating objects, and combining digital intelligence with spatial awareness. Not just a robot arm in a factory, but AI that can walk into your office, look at your whiteboard, and start contributing.

In the span of a single week in March, ABB and NVIDIA announced they had closed the long-standing simulation-to-reality gap in industrial robotics, a Rivian spin-off raised $500 million to build AI-powered factory robots, and NVIDIA’s GTC 2026 conference showcased physical AI as the dominant theme.

The Scientific Partner

AI stops being an executor and becomes a discoverer. Not “run this experiment for me” but “here’s a field of research — find something new.” The jump: AI that generates novel hypotheses, designs its own experiments, interprets unexpected results, and publishes findings that surprise human experts. We’ve seen glimpses with AlphaFold and AI-assisted drug discovery, but the real jump is when AI drives the scientific method end-to-end.

cf. karpathy/autoresearch: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards the change, and repeats.
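The modify, measure, keep-or-discard cycle described above is essentially greedy hill climbing. Here is a toy sketch of that loop in Python. It is not the code from that repository: the 5-minute training run is replaced by a made-up scoring function (evaluate), and the two settings being tuned (lr and width) are invented for illustration.

```python
import random


def evaluate(config):
    """Toy stand-in for 'train for 5 minutes and measure'. Lower is better."""
    return (config["lr"] - 0.3) ** 2 + (config["width"] - 64) ** 2 / 1e4


def mutate(config, rng):
    """Propose a small random change to one setting."""
    candidate = dict(config)
    if rng.random() < 0.5:
        candidate["lr"] = max(0.001, candidate["lr"] + rng.uniform(-0.05, 0.05))
    else:
        candidate["width"] = max(8, candidate["width"] + rng.choice([-8, 8]))
    return candidate


def overnight_search(start, steps=200, seed=0):
    """Greedy loop: try a variation, keep it only if the score improved."""
    rng = random.Random(seed)
    best, best_score = start, evaluate(start)
    for _ in range(steps):
        candidate = mutate(best, rng)
        score = evaluate(candidate)
        if score < best_score:
            best, best_score = candidate, score  # keep the improvement
        # otherwise discard the candidate and try another variation
    return best, best_score


start = {"lr": 0.05, "width": 16}
best, best_score = overnight_search(start)
print(best, best_score)
```

A real setup would differ in the interesting part: the agent edits actual training code rather than nudging two numbers, and each evaluation costs minutes of compute instead of microseconds.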

The White-Collar Disruption

So far, AI’s biggest disruption has been in software development — an industry that was already in the business of automation. The real test comes when autonomous agents cross into paper-based, reasoning-heavy industries like consulting, legal, audit, and accountancy. Not as copilots, but as replacements for the core work product: the analysis, the judgment, the recommendation.

The early signs are already here. AI can review contracts faster than junior associates, flag anomalies in financial statements that auditors miss, and synthesize market data into strategic frameworks that look a lot like what McKinsey charges $500K to produce. But those are still tools for professionals. The jump happens when an agent handles an entire engagement end-to-end: ingest the documents, identify the issues, apply the framework, produce the deliverable, defend the conclusions.

What makes this candidate interesting is that it would lay bare a playbook for disrupting any document-driven industry. The pattern would become clear: (1) ingest the domain’s corpus — regulations, case law, standards, precedents; (2) build specialized agents for each workflow step — intake, analysis, cross-referencing, reporting; (3) chain them into an autonomous pipeline that turns raw input into finished deliverable; (4) add human review only at the final sign-off. Once that playbook works for legal or audit, it becomes a template you can point at insurance, compliance, real estate, immigration, tax — any industry where the core value is structured reasoning over documents. The question stops being whether AI can do this work and becomes which industry is next.
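As a rough sketch, the four steps of that playbook could be wired together like this in Python. Everything here is hypothetical: the tiny CORPUS, the keyword matching that stands in for real analysis, and the sign_off callback that stands in for a human reviewer are placeholders, not a description of any existing product.

```python
from dataclasses import dataclass, field

# (1) The domain corpus: a two-entry stand-in for regulations and precedents.
CORPUS = {
    "late payment": "Clause 7.2: interest applies after 30 days",
    "missing signature": "Rule 3.1: contracts must be signed by both parties",
}


@dataclass
class Engagement:
    documents: list
    findings: list = field(default_factory=list)
    report: str = ""
    approved: bool = False


# (2) One small "agent" per workflow step.
def intake(case):
    case.documents = [d.strip().lower() for d in case.documents if d.strip()]
    return case


def analysis(case):
    case.findings = [issue for issue in CORPUS
                     if any(issue in doc for doc in case.documents)]
    return case


def cross_reference(case):
    case.findings = [f"{issue} -> {CORPUS[issue]}" for issue in case.findings]
    return case


def reporting(case):
    case.report = "\n".join(["Findings:"] + [f"- {f}" for f in case.findings])
    return case


# (3) Chain the steps into one pipeline.
PIPELINE = [intake, analysis, cross_reference, reporting]


def run_engagement(documents, sign_off):
    case = Engagement(documents=list(documents))
    for step in PIPELINE:
        case = step(case)
    # (4) Human review only at the final sign-off.
    case.approved = sign_off(case.report)
    return case


case = run_engagement(
    ["Invoice 12: LATE PAYMENT of 45 days ", "",
     "Contract B: missing signature on page 4"],
    sign_off=lambda report: report.startswith("Findings:"),  # reviewer stand-in
)
print(case.report)
```

Swapping the corpus and the step functions is what would turn the same skeleton from an audit pipeline into an insurance or immigration one, which is the point of the playbook.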

What’s your guess?

Peter Forret
