Blog
Thoughts, updates, and insights from the Superagent team.
Your RAG Pipeline Is One Prompt Away From a Jailbreak
RAG is marketed as a safety feature, but connect it to agents that browse, call APIs, or touch databases, and every document becomes a potential jailbreak payload. Learn how malicious files, knowledge base poisoning, and indirect prompt injection turn RAG into an attack surface—and how to defend against it.
A Practical Guide to Building Safe and Secure AI Agents
System prompts aren't enough to secure AI agents. As agents move from chatbots to systems that read files, hit APIs, and touch production, we need real runtime protection. Learn how to defend against prompt injection, poisoned tool results, and the 'lethal trifecta' with practical guardrails.
AI Is Getting Better at Everything—Including Being Exploited
As AI models become more capable and obedient, safety improvements struggle to keep pace. The GPT-5.1 safety score drop reveals a structural problem: capability and attack surface scale faster than safety.
Are AI Models Getting Safer? A Data-Driven Look at GPT vs Claude Over Time
Are frontier models actually getting safer to deploy—or just smarter at getting around guardrails? We analyze 18 months of Lamb-Bench safety scores for GPT and Claude models.
Introducing Lamb-Bench: How Safe Are the Models Powering Your Product?
We built Lamb-Bench to solve a problem every founder faces when selling to enterprise: proving AI safety without a standard way to measure it. It's an adversarial testing framework that gives buyers and sellers a common measurement standard.
VibeSec: The Current State of AI-Agent Security and Compliance
Over the past few weeks, we've spoken with dozens of developers building AI agents and LLM-powered products. The notes below come directly from those conversations and transcripts.