Blog
Thoughts, updates, and insights from the Superagent team.
The March of Nines
The gap between a working demo and a reliable product is vast. Andrej Karpathy calls this the 'march of nines': each additional nine of reliability takes as much work as all the nines that came before it combined. This is the hidden engineering challenge behind every production AI system.
The case for small language models
Most agents today rely on large, general-purpose models built to do everything. If your agent has a single, well-defined job, it should also have a model designed for that job. This is the case for small language models: models that handle one task, run locally, and can be retrained as your data evolves.
Three years later: AI can (now) defend AI
In 2022, Simon Willison argued that 'adding more AI' was the wrong fix for prompt injection and related failures. He was mostly right at the time: the defenses people tried then were brittle, either over-blocking legitimate requests or proving easy to trick. This post explains what has changed since, what has not, and why builders can now use AI to meaningfully defend their agents in production.
Vibex: Rebuilding OpenAI Codex with VibeKit
Vibex is our open-source attempt to understand and rebuild OpenAI Codex using modern developer tools. It's a real coding agent that takes plain-language tasks, runs them in secure E2B containers via VibeKit, and produces working GitHub pull requests. No demo shell or fake eval—just structured coding workflows that install packages, write code, run tests, and push changes.
ReAG: Reasoning-Augmented Generation
Until now, systems that combine language models with external knowledge have relied on a two-step process: first, retrieve relevant documents using semantic...
Agents that write their own tools
For an agent to be really useful, it needs tools — specialized pieces of code that help complete specific tasks, like browsing the web. Today, these tools are...