The Agent Trusts the Answer

In 2003, the most common class of web application vulnerability was not exotic. It was user input concatenated into a SQL string and executed. The fix was not a new security discipline. It was a single principle applied consistently: treat user input as untrusted, and validate it before execution. Two decades of security education later, that principle is well understood for user input. Developers are now relearning it for AI output.

This week, two separate AI agent frameworks disclosed CVSS 9.8 vulnerabilities with identical root causes. PraisonAI’s CVE-2026-47391 exposes an A2A server that registers a calculate() tool implemented with Python eval(). The LLM receives a user request, generates an expression, and the application executes that expression without any sandboxing. Langroid’s CVE-2026-25879 connects an AI agent to a database by passing LLM-generated SQL directly to a query engine without a parameterized query layer between them. In both cases, an attacker does not need to compromise the AI model or craft a sophisticated prompt injection. They send a crafted request to an application that trusts LLM output as safe to execute. The interpreter runs it. The attacker gets code execution.
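To make the eval() problem concrete, here is a minimal sketch of a calculator tool that parses the model's expression and evaluates only allowlisted arithmetic nodes. It is not PraisonAI's code or its patch; the names safe_calculate and MAX_EXPRESSION_LENGTH are hypothetical, and the operator set is an assumption about what a calculator tool needs. Exponentiation is left out on purpose, since a single expression such as 9**9**9 can exhaust CPU and memory.

```python
import ast
import operator

# Allowlisted operators: anything not listed here is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

MAX_EXPRESSION_LENGTH = 200  # hypothetical limit, chosen for illustration


def safe_calculate(expression: str):
    """Evaluate plain arithmetic without ever calling eval()."""
    if len(expression) > MAX_EXPRESSION_LENGTH:
        raise ValueError("expression too long")
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("not a valid expression") from exc
    return _evaluate(tree.body)


def _evaluate(node):
    # Numeric literals only; booleans, strings and None are rejected.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Names, calls, attribute access, subscripts and everything else end here.
    raise ValueError(f"disallowed expression element: {type(node).__name__}")
```

With this shape, safe_calculate("2 + 3 * 4") returns 14, while an expression such as __import__('os').system('id') raises ValueError because function calls are not on the allowlist. The string is never handed to the interpreter, whoever wrote it.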

Neither CVE requires deep AI expertise to exploit. PraisonAI’s A2A server ships with a default configuration that binds to 0.0.0.0 and requires no authentication token, so the endpoint is accessible to anyone who can reach the host over the network. The CVSS 9.8 scores reflect not sophistication but straightforwardness: reliable, reproducible, unauthenticated code execution in widely deployed AI development frameworks.

The Same Mistake, in Two Different Execution Contexts

The parallel to SQL injection is not rhetorical. In both cases, the vulnerability exists because the application treats content from a source it trusts as inherently safe to pass to an interpreter. In the SQL injection era, the trusted sources were user registration forms, search boxes, and login fields. Developers reasoned, implicitly, that their own application components were generating or validating the input before it reached the database. The reasoning was wrong: the execution context does not care about the source of the input. The database engine evaluates whatever string it receives.

AI agent frameworks are making the same mistake with a different trusted source. A developer builds a tool that calls an LLM, receives structured output, and passes that output to an execution context: Python’s eval(), a SQL engine, a shell command, a subprocess call. The reasoning is that the LLM generated the content, so it is system-generated output rather than user-supplied input. The reasoning is wrong. From the perspective of the execution context, LLM output is a string from an external source. If an attacker can influence what prompt the LLM receives, they can influence what string the execution context receives. The attack surface is not the model. It is the gap between model output and execution without validation.

A third structural parallel arrived this week in a different form. Starlette’s CVE-2026-48710, the “BadHost” host-header authentication bypass disclosed by X41 D-Sec and Persistent Security Industries, exploits a trust failure in the ASGI layer rather than in an AI tool: Starlette constructs request.url by concatenating the HTTP Host header with the request path, then uses that URL for middleware security decisions. Any middleware that uses request.url.path for access control can be bypassed by appending a single character to the Host header. A protected /admin endpoint returns 200 OK on a request that should return 403. The fix shipped in Starlette 1.0.1 one day before public disclosure, giving operators near-zero lead time. Starlette sees 325 million weekly downloads and underlies FastAPI, vLLM, LiteLLM, and the majority of Python MCP server deployments. That makes three trust failures in AI infrastructure in a single week, with three different technical mechanisms and one cognitive pattern: the system trusted its input source instead of validating the input itself.
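The same principle can be sketched at the ASGI layer. The middleware below is written against the plain ASGI interface and is only an illustration, not the Starlette patch and not a substitute for upgrading. It assumes a hypothetical host allowlist (ALLOWED_HOSTS), a hypothetical set of protected prefixes (PROTECTED_PREFIXES), and a hypothetical is_admin callback. It rejects unexpected Host values outright and bases its path decision on scope["path"], which the ASGI specification defines as the path from the request target rather than anything derived from headers.

```python
ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist, exact match only
PROTECTED_PREFIXES = ("/admin",)     # hypothetical protected paths


class HostAndPathGuard:
    """Plain ASGI middleware sketch: validate the Host header, then check the raw path."""

    def __init__(self, app, is_admin=lambda scope: False):
        self.app = app
        self.is_admin = is_admin  # hypothetical authorization callback

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        host = headers.get(b"host", b"").decode("latin-1").lower()
        # Validate the header itself instead of trusting where it came from.
        if host not in ALLOWED_HOSTS:
            await self._reject(send, 400)
            return
        # Decide on the path from the request target, not on a URL rebuilt from headers.
        if scope["path"].startswith(PROTECTED_PREFIXES) and not self.is_admin(scope):
            await self._reject(send, 403)
            return
        await self.app(scope, receive, send)

    @staticmethod
    async def _reject(send, status):
        await send({"type": "http.response.start", "status": status, "headers": []})
        await send({"type": "http.response.body", "body": b""})
```

Exact matching means a Host value carrying a port or any trailing character is rejected, so a real deployment would list every host and port combination it actually serves.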

Why the “AI Security” Framing Obscures the Fix

Most of the attention in AI security research focuses on prompt injection, model poisoning, and data exfiltration. These are real risks. But they did not produce the two CVSS 9.8 vulnerabilities this week. What produced those vulnerabilities was a failure to apply a principle that has been in the OWASP Top Ten since the list was first published in 2003: validate input before passing it to an interpreter, regardless of origin.

The AI security framing implies that novel defenses are needed. That implication delays remediation. Organizations waiting for a dedicated AI security framework or AI-specific SAST rules before acting are deferring fixes that existing tooling handles today. Parameterized queries prevent LLM-generated SQL injection by exactly the same mechanism that prevents user-generated SQL injection. Sandboxed execution environments prevent eval() exploitation whether the string came from a user or a model. The defense is not new. What is missing is its application to AI output.
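As a sketch of what that looks like for an agent with database access: instead of executing SQL the model wrote, the application can ask the model for a structured request (table, column, value), check the identifiers against an allowlist, and bind the value as a parameter. The example uses Python's standard sqlite3 module. The ALLOWED_FILTERS schema and the run_agent_query helper are hypothetical and are not Langroid's API.

```python
import sqlite3

# Hypothetical schema allowlist: table name -> columns the agent may filter on.
ALLOWED_FILTERS = {"orders": {"customer_id", "status"}}


def run_agent_query(conn, table, column, value):
    """Run a lookup described by model output without executing model-written SQL."""
    if table not in ALLOWED_FILTERS or column not in ALLOWED_FILTERS[table]:
        raise ValueError("table or column not in allowlist")
    # Identifiers passed the allowlist check above; the value is bound as a
    # parameter, so the engine treats it as data and never as SQL text.
    sql = f"SELECT * FROM {table} WHERE {column} = ?"
    return conn.execute(sql, (value,)).fetchall()
```

A value such as x' OR '1'='1 is then compared literally against the column and matches nothing, and a column name carrying extra SQL never reaches the engine at all. This is the control already used for form fields, pointed at a different source.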

PraisonAI’s full advisory covers six CVEs (CVE-2026-47391 through CVE-2026-47396), four at CVSS 9.8 or higher. The compound failure in the most severe item combines three independent decisions: ship without authentication, bind to all network interfaces, implement the calculation tool with eval(). Any one of those decisions, corrected, would have reduced the risk substantially. All three together produce a fully open RCE endpoint that ships as the framework’s default quickstart configuration. Developers who followed the documentation landed in production exposed.
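The other two decisions are just as cheap to reverse. The sketch below shows safer defaults for a tool server: bind to loopback unless the operator opts in to wider exposure, and require a bearer token that is compared in constant time. ToolServerConfig and is_authorized are hypothetical names used for illustration, not PraisonAI's actual configuration API.

```python
import hmac
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ToolServerConfig:
    """Hypothetical settings object with secure defaults."""

    host: str = "127.0.0.1"  # loopback by default; 0.0.0.0 must be an explicit choice
    port: int = 8000
    # A random token is generated unless the operator supplies one.
    auth_token: str = field(default_factory=lambda: secrets.token_urlsafe(32))


def is_authorized(config: ToolServerConfig, authorization_header: Optional[str]) -> bool:
    """Accept only 'Bearer <token>' headers that match the configured token."""
    if not authorization_header or not authorization_header.startswith("Bearer "):
        return False
    presented = authorization_header[len("Bearer "):]
    # Constant-time comparison avoids leaking the token through timing differences.
    return hmac.compare_digest(presented.encode(), config.auth_token.encode())
```

With defaults like these, a developer who copies the quickstart gets a server that is reachable only from the local machine and that rejects requests without the token.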

Where This Goes

The six consecutive weeks of AI framework CVE disclosures are not a coincidence of researcher scheduling. The security research community has consolidated attention on AI infrastructure, and that attention is producing systematic coverage rather than opportunistic one-offs. The progression has moved from specific AI tools in W18 to production inference platforms in W22 (NVIDIA Triton, seven CVEs) to the ASGI web framework layer and the MCP protocol layer in W23. Each layer produces its own vulnerability class, and each class tends to repeat across different implementations once the initial disclosure establishes the template.

The LLM output trust failure is now a documented template. PraisonAI and Langroid are the first two confirmed CVSS 9.8 instances against production AI agent frameworks. They will not be the last. Any AI agent framework that passes model output to an execution context without an explicit validation step carries this vulnerability in latent form, waiting for a researcher to confirm it.

The immediate audit is straightforward: find every place in AI-integrated code where LLM output reaches eval(), exec(), subprocess, raw SQL construction, or shell execution. For each item, apply the same control applied to user input: validate against an allowlist, use parameterized APIs where available, sandbox the execution context where not. Update Starlette to 1.0.1 across all ASGI deployments and verify that downstream framework pinning for FastAPI, vLLM, and LiteLLM pulls the patched version.
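A first pass at that audit can be automated with Python's standard ast module. The sketch below flags calls to a hypothetical starting list of risky functions (RISKY_CALLS) so they can be reviewed by hand. It cannot tell whether model output actually reaches a call, it misses aliased imports such as from subprocess import run, and it does not detect raw SQL string construction, so it complements rather than replaces existing SAST tooling.

```python
import ast

# Call names worth a manual review when model output can reach them.
RISKY_CALLS = {
    "eval", "exec", "os.system", "os.popen",
    "subprocess.run", "subprocess.Popen", "subprocess.call",
    "subprocess.check_call", "subprocess.check_output",
}


def _dotted_name(node):
    """Return 'a.b.c' for Name/Attribute chains, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_risky_calls(source, filename="<string>"):
    """Return sorted (line number, call name) pairs for risky calls in the source."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name in RISKY_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)
```

Run over each file in an agent codebase, the findings become the review list: every hit is either shown to be unreachable from model output or placed behind an allowlist, a parameterized API, or a sandbox.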

The underlying cognitive error is not unique to any framework. Developers building AI integrations categorize LLM output as system-generated content rather than external untrusted input. That categorization is wrong in the same way that treating user-submitted form data as trusted was wrong in 2003. The industry spent roughly five years discovering that lesson the hard way across web applications before the consensus shifted. The AI infrastructure exploitation arc is moving considerably faster. This week’s two CVSS 9.8 disclosures are early confirmations. The question is whether the lesson from 2003 transfers faster than the attack tooling does.

Security Unlocked publishes weekly threat intelligence and strategic analysis. This post is based on intelligence collected May 26 - June 1, 2026.
