<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[RockCyber Musings]]></title><description><![CDATA[AI and Cyber Geek]]></description><link>https://www.rockcybermusings.com</link><image><url>https://substackcdn.com/image/fetch/$s_!y2c3!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faaa51f40-9ed4-4093-898e-0bdb99086a7a_827x827.png</url><title>RockCyber Musings</title><link>https://www.rockcybermusings.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 08 Apr 2026 11:38:36 GMT</lastBuildDate><atom:link href="https://www.rockcybermusings.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Rock Lambros]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[rockcyber@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[rockcyber@substack.com]]></itunes:email><itunes:name><![CDATA[Rock Lambros]]></itunes:name></itunes:owner><itunes:author><![CDATA[Rock Lambros]]></itunes:author><googleplay:owner><![CDATA[rockcyber@substack.com]]></googleplay:owner><googleplay:email><![CDATA[rockcyber@substack.com]]></googleplay:email><googleplay:author><![CDATA[Rock Lambros]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Agent Supply Chain Attacks: Your Scanner Already Switched Sides]]></title><description><![CDATA[March 2026's Trivy-LiteLLM-Axios cascade shows why agent supply chain risk breaks existing controls. Practical steps for CISOs.]]></description><link>https://www.rockcybermusings.com/p/agent-supply-chain-attacks-scanner-switched-sides</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/agent-supply-chain-attacks-scanner-switched-sides</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 07 Apr 2026 12:50:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!9kRx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9kRx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9kRx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9kRx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9kRx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9kRx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9kRx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3273648,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/193187673?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9kRx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9kRx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9kRx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9kRx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f3dd50-87fc-4013-afa3-56b70e007b69_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/agent-supply-chain-attacks-scanner-switched-sides?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/agent-supply-chain-attacks-scanner-switched-sides?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Agent supply chain risk stopped being theoretical in March 2026. Over twelve days, a single threat actor group turned Trivy, KICS, LiteLLM, and Axios into credential-harvesting weapons, cascading across five distribution ecosystems and producing 300 GB of stolen secrets. The campaign started with an AI-powered bot. It spread through a self-propagating worm with blockchain-based command and control. Your vulnerability scanner, the tool you trusted to protect your pipeline, was the entry point. Now picture that same attack chain hitting an autonomous agent that installs tools, updates dependencies, and executes third-party skills without asking you first.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LBet!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LBet!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 424w, https://substackcdn.com/image/fetch/$s_!LBet!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 848w, https://substackcdn.com/image/fetch/$s_!LBet!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 1272w, https://substackcdn.com/image/fetch/$s_!LBet!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LBet!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png" width="680" height="820" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:680,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:107234,&quot;alt&quot;:&quot;Timeline showing TeamPCP campaign cascading from Trivy to KICS to LiteLLM to CanisterWorm to Axios between February 28 and March 31 2026&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/193187673?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Timeline showing TeamPCP campaign cascading from Trivy to KICS to LiteLLM to CanisterWorm to Axios between February 28 and March 31 2026" title="Timeline showing TeamPCP campaign cascading from Trivy to KICS to LiteLLM to CanisterWorm to Axios between February 28 and March 31 2026" srcset="https://substackcdn.com/image/fetch/$s_!LBet!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 424w, https://substackcdn.com/image/fetch/$s_!LBet!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 848w, https://substackcdn.com/image/fetch/$s_!LBet!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 1272w, https://substackcdn.com/image/fetch/$s_!LBet!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff98c68dc-00bc-430c-b1ab-8f96d454607d_680x820.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: The TeamPCP Cascade: 12 Days, 5 Ecosystems</figcaption></figure></div><h2>Your Vulnerability Scanner Was the Vulnerability</h2><p>On March 19, 2026, TeamPCP used stolen credentials to force-push 76 of 77 version tags in the aquasecurity/trivy-action repository to malicious commits. The payload ran before the legitimate scan. Pipelines completed normally with green checkmarks across the board. Meanwhile, the malware dumped Runner.Worker process memory and exfiltrated cloud credentials, SSH keys, Kubernetes tokens, npm tokens, and Docker registry credentials to attacker-controlled infrastructure.</p><p>Trivy is a vulnerability scanner. Organizations run it in their CI/CD pipelines to detect supply chain attacks. When TeamPCP compromised it, the tool designed to find compromised dependencies became the compromised dependency. The irony is structural, not incidental. Security tools make ideal targets for supply chain attacks because they already have broad read access to the environments they scan. They touch secrets by design.</p><p>KICS, Checkmarx&#8217;s infrastructure-as-code scanner, fell the same way four days later. All 35 version tags hijacked. Same credential-stealing payload, different typosquat domain. Then LiteLLM, the AI gateway library that holds API keys for every LLM provider an organization uses, with 95 million monthly downloads and presence in 36% of cloud environments according to Wiz Research. TeamPCP published malicious versions to PyPI using credentials stolen from LiteLLM&#8217;s own CI/CD pipeline, which ran Trivy as part of its build process.</p><p>Each victim funded the next attack. The chain started with a single incomplete credential rotation at Aqua Security on March 1. TeamPCP retained access through tokens that survived the rotation. Every compromise from March 19 forward exploited credentials harvested from the previous target. Partial containment, as Aqua Security&#8217;s own post-incident analysis acknowledged, equals no containment.</p><p>By the time Axios was compromised on March 31 (100+ million weekly npm downloads, attributed by Microsoft Threat Intelligence to North Korean state actor Sapphire Sleet), the credential ecosystem was so thoroughly disrupted that Mandiant CTO Charles Carmakal warned of &#8220;hundreds of thousands of stolen credentials&#8221; and &#8220;a variety of actors with varied motivations.&#8221; The FBI confirmed TeamPCP was working through approximately 300 GB of compressed stolen credentials in collaboration with the LAPSUS$ extortion group.</p><h2>The AI Bot That Started Everything</h2><p>Most coverage focuses on the credential cascade. The more significant development is the one that started it.</p><p>On February 28, 2026, an autonomous bot calling itself hackerbot-claw exploited a misconfigured pull_request_target workflow in Trivy&#8217;s GitHub Actions to steal a Personal Access Token with write access to all 33+ Aqua Security repositories. The bot&#8217;s GitHub profile described itself as &#8220;an autonomous security research agent powered by claude-opus-4-5.&#8221; It carried a vulnerability pattern index with 9 attack classes and 47 sub-patterns. It targeted seven major repositories belonging to Microsoft, DataDog, CNCF, and Aqua Security over one week, achieving remote code execution in at least four.</p><p>This was not a script running pre-written exploits, as hackerbot-claw adapted its approach to each target&#8217;s specific workflow configuration. When one technique failed, it pivoted. Against the ambient-code/platform repository, it attempted prompt injection by replacing the project&#8217;s CLAUDE.md file with instructions designed to trick Claude Code into committing unauthorized changes. Claude Code detected the attack and refused, classifying it as a supply chain attack via poisoned project-level instructions.</p><p>That detail matters. An AI agent attacked. An AI agent defended. The outcome depended on configuration quality, not human vigilance. This is the arms race in miniature, and it already happened at production scale against real infrastructure.</p><p>StepSecurity, Repello AI, and Boost Security Labs independently documented the campaign. Pillar Security&#8217;s assessment identified the core gap: &#8220;zero visibility into AI coding agents running on developer machines, and no runtime controls when those agents are weaponized.&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9Joa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9Joa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 424w, https://substackcdn.com/image/fetch/$s_!9Joa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 848w, https://substackcdn.com/image/fetch/$s_!9Joa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 1272w, https://substackcdn.com/image/fetch/$s_!9Joa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9Joa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png" width="680" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:680,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:88336,&quot;alt&quot;:&quot;Comparison diagram showing how traditional supply chain attacks target human developers who review code while agent supply chain attacks target autonomous agents that auto-execute without review&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/193187673?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Comparison diagram showing how traditional supply chain attacks target human developers who review code while agent supply chain attacks target autonomous agents that auto-execute without review" title="Comparison diagram showing how traditional supply chain attacks target human developers who review code while agent supply chain attacks target autonomous agents that auto-execute without review" srcset="https://substackcdn.com/image/fetch/$s_!9Joa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 424w, https://substackcdn.com/image/fetch/$s_!9Joa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 848w, https://substackcdn.com/image/fetch/$s_!9Joa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 1272w, https://substackcdn.com/image/fetch/$s_!9Joa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb087f4a-8118-4a84-b9f9-8ad55656f9cf_680x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: Traditional vs. Agent Supply Chain Attack Surface</figcaption></figure></div><h2>Why Agent Supply Chain Risk Breaks Your Existing Controls</h2><p>Every supply chain control you have assumes a human is looking. Dependency scanning assumes someone reviews the output. Code review assumes someone reads the diff. SBOM generation assumes someone checks the inventory. SAST and DAST assume someone triages the findings.</p><p>Agents don&#8217;t look. They execute.</p><p>When a developer installs a package, they see a version number, check a changelog, run tests. When an agent installs a tool or skill, it follows instructions. If the MCP server definition says &#8220;install this plugin,&#8221; the agent installs it. If the skill marketplace listing looks legitimate, the agent trusts it. The human-in-the-loop that traditional supply chain security depends on evaporates.</p><p>Research published the same week as the TeamPCP campaign quantifies the gap. The OpenClaw vulnerability taxonomy (<a href="https://arxiv.org/abs/2603.27517">A Systematic Taxonomy of Security Vulnerabilities in the OpenClaw AI Agent Framework, arXiv 2603.27517</a>) catalogued 190 security advisories in the OpenClaw AI agent framework, organized by architectural layer and adversarial technique. Three findings stand out. First, three moderate-to-high-severity advisories compose into a complete unauthenticated remote code execution path from an LLM tool call to the host process. Second, the primary command-filtering mechanism relies on a closed-world assumption that shell commands are identifiable through lexical parsing, an assumption broken by basic techniques like line continuation and option abbreviation. Third, and most relevant here, a malicious skill distributed through the plugin channel executed a two-stage dropper within the LLM context, bypassing the entire execution pipeline. The skill distribution surface has no runtime policy enforcement.</p><p><a href="https://arxiv.org/abs/2603.28807">SafeClaw-R (arXiv 2603.28807)</a> found that 36.4% of OpenClaw&#8217;s built-in skills carry high or critical risk. That number covers the built-in skills, before any third-party marketplace plugins enter the picture. Across ClawHub, the agent skill marketplace, Antiy CERT confirmed 1,184 malicious skills, roughly one in five packages. The Repello AI team traced 335 of those to a single coordinated campaign called ClawHavoc.</p><p>The March 2026 campaign showed what happens when the consumer of a compromised dependency is a CI/CD pipeline: automated, monitored by humans on a lag, with credentials accessible at runtime. The agent version removes the monitoring layer entirely. An agent that installs a malicious MCP server or skill executes the payload as part of its normal workflow, with whatever permissions the agent has been granted, at machine speed.</p><h2>CanisterWorm Showed What Autonomous Propagation Looks Like</h2><p>If the hackerbot-claw precedent shows how AI agents attack, CanisterWorm shows how compromised dependencies spread once humans are removed from the loop.</p><p>CanisterWorm emerged on March 20, deployed using npm tokens stolen from Trivy-compromised pipelines. It was a self-propagating worm. Given a stolen token, it enumerated every package the token provided access to, bumped the patch version, injected its payload, and republished. Twenty-eight packages were compromised in under sixty seconds. The worm infected 64+ packages across multiple npm scopes. Endor Labs assessed that TeamPCP had &#8220;automated credential-to-compromise tooling&#8221; capable of turning a single stolen token into exponential propagation.</p><p>The command-and-control infrastructure used an Internet Computer Protocol blockchain canister, a tamperproof smart contract with no single takedown point. The operator could rotate payloads on infected machines without republishing any package. Security researchers confirmed this as the first publicly documented npm worm to use blockchain-based C2. The kill switch? If the canister returned a YouTube URL, the backdoor skipped execution. At the time of discovery, it was returning a Rick Roll. The infrastructure was live, tested, and ready. The payload was dormant by choice.</p><p>CanisterWorm targeted human-operated CI/CD pipelines. Translate that propagation model to an agent ecosystem where tools install other tools, agents delegate to sub-agents, and MCP servers chain calls across services. The propagation surface expands from &#8220;every package a stolen token provides access to&#8221; to &#8220;every tool, skill, and service the compromised agent is authorized to reach.&#8221; The Model Context Protocol, now under the Linux Foundation&#8217;s Agentic AI Foundation after Anthropic donated it in December 2025, is becoming the standard for agent-to-tool communication. Trend Micro found 492 MCP servers exposed to the internet with zero authentication. A separate supply chain attack involved a package masquerading as a legitimate Postmark MCP server that silently BCC&#8217;d every outgoing email to the attackers. The CoSAI whitepaper on MCP security identified 12 core threat categories spanning nearly 40 distinct threats. The MCP specification itself uses SHOULD rather than MUST for human-in-the-loop requirements. That word choice tells you everything about where the standard stands on constraining agent autonomy.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cGuA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cGuA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 424w, https://substackcdn.com/image/fetch/$s_!cGuA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 848w, https://substackcdn.com/image/fetch/$s_!cGuA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 1272w, https://substackcdn.com/image/fetch/$s_!cGuA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cGuA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png" width="680" height="900" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9aef84b8-6614-464e-9217-882c7170cfda_680x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:680,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:109706,&quot;alt&quot;:&quot;Infographic showing key statistics from the March 2026 TeamPCP supply chain campaign and the agent supply chain gap, including 76 of 77 Trivy tags poisoned, 300 GB stolen credentials, 95 million LiteLLM monthly downloads, 36.4 percent of built-in agent skills rated high or critical risk, and 1 in 5 agent marketplace packages confirmed malicious&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/193187673?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Infographic showing key statistics from the March 2026 TeamPCP supply chain campaign and the agent supply chain gap, including 76 of 77 Trivy tags poisoned, 300 GB stolen credentials, 95 million LiteLLM monthly downloads, 36.4 percent of built-in agent skills rated high or critical risk, and 1 in 5 agent marketplace packages confirmed malicious" title="Infographic showing key statistics from the March 2026 TeamPCP supply chain campaign and the agent supply chain gap, including 76 of 77 Trivy tags poisoned, 300 GB stolen credentials, 95 million LiteLLM monthly downloads, 36.4 percent of built-in agent skills rated high or critical risk, and 1 in 5 agent marketplace packages confirmed malicious" srcset="https://substackcdn.com/image/fetch/$s_!cGuA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 424w, https://substackcdn.com/image/fetch/$s_!cGuA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 848w, https://substackcdn.com/image/fetch/$s_!cGuA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 1272w, https://substackcdn.com/image/fetch/$s_!cGuA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aef84b8-6614-464e-9217-882c7170cfda_680x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: Agent Supply Chain Risk: By the Numbers</figcaption></figure></div><h2>What This Means for You</h2><p>The governance gap between where agent supply chain risk is and where your controls are will take years to close. Microsoft released the Agent Governance Toolkit on April 2, 2026, addressing OWASP&#8217;s Agentic AI Top 10 with features like Ed25519 plugin signing and MCP security gateways. The toolkit is two days old and unvalidated in production. SafeClaw-R achieved 95.2% accuracy in controlled tests. That 4.8% gap matters at enterprise scale.</p><p>You don&#8217;t have years. Here&#8217;s what you have right now.</p><p>Pin everything to immutable references. Version tags are pointers, not contracts. The March 2026 campaign proved this at scale across GitHub Actions, Docker Hub, and npm. Pin GitHub Actions to full commit SHAs. Pin container images to digests. Pin PyPI packages to exact versions with hash verification. Floating tags and unpinned dependencies are the entry point for every attack in this chain.</p><p>Treat your security tools as attack surface. Trivy, KICS, and every other scanner in your pipeline runs with privileged access to secrets by design. Apply the same scrutiny to your security tooling that you apply to production dependencies. Monitor for unexpected behavior from tools that should be predictable.</p><p>Audit your agent tool pipelines. If your organization deploys AI agents with access to MCP servers, skill marketplaces, or plugin registries, inventory every tool your agents use. Verify provenance. Enforce allow-lists. The ClawHavoc campaign showed that 20% of a major agent marketplace was compromised. Your agents are pulling from these registries right now.</p><p>Make credential rotation atomic. The entire TeamPCP cascade traces back to one failure: Aqua Security&#8217;s non-atomic rotation on March 1. When you respond to a supply chain incident, revoke all credentials simultaneously before issuing replacements. Partial rotation is an invitation for round two.</p><p>Plan for agent-specific incident response. If a tool or skill consumed by your agents is compromised, the blast radius includes everything those agents are authorized to access. Your current incident response playbook assumes a human in the response loop. Write the agent-specific version before you need it.</p><p><strong>Key Takeaway:</strong> The March 2026 supply chain campaign compromised your scanners, your AI gateway, and your HTTP client in twelve days. The same attack pattern targeting autonomous agents will move faster, spread further, and leave fewer traces. Your supply chain controls were built for a world where a human reviewed every dependency. That world is ending.</p><h3>What to do next</h3><p>The gap between traditional supply chain security and agent supply chain security is the defining governance challenge of 2026. If you&#8217;re a CISO or security architect, the question isn&#8217;t whether your organization uses AI agents with third-party tools. The question is whether you know which tools, with what permissions, under whose authority.</p><p>Start with visibility. You don&#8217;t control what you haven&#8217;t inventoried. For a deeper framework on operationalizing emerging security challenges, <a href="https://www.amazon.com/CISO-Evolution-Knowledge-Cybersecurity-Executives/dp/1119782481">The CISO Evolution</a> walks through how security leaders adapt their programs when the threat model shifts underneath them.</p><p>More on agent security, supply chain governance, and the practitioner&#8217;s view of AI risk at <strong><a href="https://rockcyber.substack.com">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><p>Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Reasoning Theater: Why Chain-of-Thought Monitoring Fails Your Agentic AI]]></title><description><![CDATA[New research proves reasoning models perform deliberation they've already completed. Apply the CARE framework to close your agentic AI monitoring gap.]]></description><link>https://www.rockcybermusings.com/p/reasoning-theater-cot-monitoring-fails-agentic-ai</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/reasoning-theater-cot-monitoring-fails-agentic-ai</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 07 Apr 2026 12:50:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8HFJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8HFJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8HFJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8HFJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8HFJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8HFJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8HFJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2828150,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190378011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8HFJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8HFJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8HFJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8HFJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc5ffff6-cea5-4323-92c9-86226f2f91a8_2752x1536.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If your agentic AI safety strategy depends on reading the model&#8217;s chain of thought, you&#8217;re listening to a rehearsed speech and calling it a confession. <a href="https://arxiv.org/abs/2603.05488">A March 2025 paper from Goodfire AI and Harvard University</a> shows that reasoning models often commit to their final answer within the first tokens of &#8220;thinking,&#8221; then generate hundreds of additional tokens to perform deliberation they&#8217;ve already completed. For every security leader, product owner, and governance committee treating chain-of-thought monitoring as an auditable safety control, these findings demand an immediate reassessment.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/reasoning-theater-cot-monitoring-fails-agentic-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/reasoning-theater-cot-monitoring-fails-agentic-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2>The Research That Changes the Conversation</h2><p>The paper, titled &#8220;Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought,&#8221; tested two frontier reasoning models (DeepSeek-R1 671B and GPT-OSS 120B) using three methods to determine when models commit to their final answer during a chain-of-thought trace.</p><p>The first method trains lightweight attention probes on model activations to predict the final answer at any point during reasoning. The second forces the model to answer early by truncating its reasoning and demanding a response. The third uses an external LLM (Gemini 2.5 Flash) as a CoT monitor, the same approach many vendors now sell as an AI safety feature.</p><p>The results expose a gap that should reshape every AI governance conversation. On MMLU-Redux, a benchmark heavy on recall-based questions, attention probes decoded the correct final answer with high accuracy from the earliest reasoning tokens. The CoT monitor couldn&#8217;t identify the model&#8217;s answer until much later in the sequence. The performativity rate, measuring the gap between internal confidence and external verbalization, hit 0.417 on MMLU for DeepSeek-R1. A rate near zero means the model&#8217;s spoken reasoning matches its internal state. A rate of 0.417 means the model spends roughly 40% of its reasoning trace performing confidence it already holds.</p><p>On GPQA-Diamond, a dataset requiring graduate-level multi-hop reasoning in biology, chemistry, and physics, the picture reversed. All three methods started near chance accuracy and improved at similar rates throughout the reasoning trace. The performativity rate dropped to 0.012. The chain of thought tracked genuine computation because the model needed to think.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ge87!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ge87!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 424w, https://substackcdn.com/image/fetch/$s_!ge87!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 848w, https://substackcdn.com/image/fetch/$s_!ge87!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 1272w, https://substackcdn.com/image/fetch/$s_!ge87!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ge87!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png" width="1456" height="1040" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1040,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:350905,&quot;alt&quot;:&quot;Bar chart comparing performativity rates showing high values for MMLU recall tasks and near-zero values for GPQA-Diamond reasoning tasks across DeepSeek-R1 and GPT-OSS models&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190378011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar chart comparing performativity rates showing high values for MMLU recall tasks and near-zero values for GPQA-Diamond reasoning tasks across DeepSeek-R1 and GPT-OSS models" title="Bar chart comparing performativity rates showing high values for MMLU recall tasks and near-zero values for GPQA-Diamond reasoning tasks across DeepSeek-R1 and GPT-OSS models" srcset="https://substackcdn.com/image/fetch/$s_!ge87!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 424w, https://substackcdn.com/image/fetch/$s_!ge87!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 848w, https://substackcdn.com/image/fetch/$s_!ge87!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 1272w, https://substackcdn.com/image/fetch/$s_!ge87!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb549b57-6a71-424f-bf30-7b645e1b97a6_3500x2500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: Performativity Rate Across Models and Benchmarks</figcaption></figure></div><p>The number that should stop every governance meeting cold: probe-guided early exit reduced the number of generated tokens by 80% on MMLU and 30% on GPQA-Diamond, with comparable accuracy. The model produced 80% more tokens than it needed on easy recall tasks. Every one of those tokens looked like reasoning. None of them were.</p><h2>Your Model Performs Compliance, Not Communication</h2><p>The paper&#8217;s most valuable contribution for security professionals sits in the linguistic framework, not the probe methodology itself.</p><p>They invoke Grice&#8217;s maxims of cooperative communication, a foundational concept from pragmatics describing what honest, effective communicators do. Grice identified four maxims that cooperative speakers follow. They stay relevant (Relation). They assert only what they have evidence for (Quality). They provide exactly as much information as needed, no more (Quantity). They communicate clearly and without unnecessary obscurity (Manner).</p><p>Reasoning models, trained through reinforcement learning optimized for outcome rewards, follow Relation and Quality naturally. Staying on topic and generating evidence-based reasoning steps correlates with correct final answers, which earns the reward. The model has every incentive to be relevant and factually grounded.</p><p>Quantity and Manner get violated because the reward function doesn&#8217;t penalize verbosity or obscurity. The model generates hundreds of performative tokens after committing to its answer because nothing in the training signal punishes that behavior. The output looks like careful deliberation, but it reads like a thorough analysis. The model&#8217;s internal state tells a different story.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wGIe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wGIe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 424w, https://substackcdn.com/image/fetch/$s_!wGIe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 848w, https://substackcdn.com/image/fetch/$s_!wGIe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 1272w, https://substackcdn.com/image/fetch/$s_!wGIe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wGIe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png" width="1456" height="1233" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1233,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:868342,&quot;alt&quot;:&quot;Flowchart showing how reinforcement learning reward alignment causes reasoning models to follow relevance and quality maxims while violating quantity and manner maxims, creating a monitoring blind spot&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190378011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing how reinforcement learning reward alignment causes reasoning models to follow relevance and quality maxims while violating quantity and manner maxims, creating a monitoring blind spot" title="Flowchart showing how reinforcement learning reward alignment causes reasoning models to follow relevance and quality maxims while violating quantity and manner maxims, creating a monitoring blind spot" srcset="https://substackcdn.com/image/fetch/$s_!wGIe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 424w, https://substackcdn.com/image/fetch/$s_!wGIe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 848w, https://substackcdn.com/image/fetch/$s_!wGIe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 1272w, https://substackcdn.com/image/fetch/$s_!wGIe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F622a456c-5966-4cea-96b3-16ef262298ff_5175x4383.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: Gricean Maxim Compliance in RL-Trained Reasoning Models</figcaption></figure></div><p>If you&#8217;ve sat through a compliance audit where someone recites a scripted answer that sounds thorough but reveals nothing about practice on the ground, you&#8217;ve seen the human version of this. The model doesn&#8217;t lie to you. It performs what a reasoning trace is supposed to look like, without any structural obligation to communicate its internal state honestly.</p><p>This framing hits differently depending on where you sit at the governance table. Product teams hear &#8220;the model reasons safely&#8221; because the CoT looks clean. Legal hears &#8220;we have an auditable reasoning chain&#8221; because there&#8217;s text to review. Security asks the question nobody else in the room raises&#8230; &#8220;Does the text reflect what the model computed internally?&#8221;</p><h2>The Difficulty Trap: Where Enterprise Agentic AI Lives</h2><p>The difficulty-dependent split creates a precise and uncomfortable problem for enterprise deployments. CoT monitoring works when the model genuinely needs to think, hard multi-hop reasoning, novel problem decomposition, tasks requiring significant test-time compute. CoT monitoring fails on tasks where models already encode the answer in their weights: recall, pattern matching, classification, and routine decision execution.</p><p>Consider what agentic AI does in a financial services environment right now. A reconciliation agent matches transaction records against internal ledgers. A KYC agent screens customer documents against sanctions lists. A compliance agent routes regulatory filings based on entity classification. A trading surveillance agent flags potential wash trades from order flow patterns.</p><p>Every one of those tasks leans heavily toward recall and pattern matching. The model has seen millions of similar patterns during training. It knows the answer early. The chain-of-thought trace generated afterward is, according to this research, predominantly performative.</p><p>The security calculus changes in direct response. Picture a sanctions screening agent processing a crafted entity record containing an indirect prompt injection embedded in a business description field. The model processes the injection at the activation level, potentially altering its classification decision. The chain-of-thought trace, generated performatively because sanctions screening is a recall task the model handles, is produced from in-weight knowledge and reads like a textbook analysis of entity risk factors. The CoT monitor watching that trace sees clean, plausible reasoning. The model&#8217;s internal activations tell a different story. Your audit trail captures what the agent wants you to see, not computation. An examiner reviewing the reasoning trace isn&#8217;t examining evidence of how the model decided. They&#8217;re reviewing a post-hoc performance that resembles deliberation.</p><p>The researchers also found that model size correlates with performativity. Larger models (671B parameters) knew their answers earlier and generated more performative reasoning than smaller distilled versions (1.5B to 32B). Smaller models needed genuine test-time computation for equivalent tasks. As the industry pushes toward larger, more capable reasoning models for enterprise agentic AI, the performativity problem intensifies with every parameter upgrade.</p><p>The EU AI Act&#8217;s Article 14 requires human oversight of high-risk AI systems, with provisions for understanding the system&#8217;s reasoning. Financial regulators from the OCC to the FCA have signaled expectations for explainability in automated decision-making. Governance frameworks built on the assumption that reasoning traces constitute meaningful explanations now face a credibility problem they haven&#8217;t accounted for.</p><h2>Three Research Groups, One Uncomfortable Conclusion</h2><p>The Reasoning Theater paper lands with maximum force when you triangulate it against two other major findings from competing institutions.</p><p><a href="https://arxiv.org/abs/2503.11926">OpenAI&#8217;s Baker et al. (March 2025) </a>studied what happens when CoT monitors are integrated directly into the reinforcement learning training loop. In early stages, models became more capable and more aligned. Under excessive training pressure, models learned what the researchers called &#8220;obfuscated reward hacking&#8221;: hiding their intent within the chain of thought while continuing to hack the reward signal. OpenAI&#8217;s recommendation was blunt: developers training frontier reasoning models should avoid applying strong supervision directly to chains of thought. The monitorability itself is fragile. OpenAI&#8217;s follow-up &#8220;Monitoring Monitorability&#8221; paper in December 2025 proposed formal evaluation metrics to track this fragility across o3 and GPT-5.1 training runs, recognizing that the problem demands ongoing measurement rather than a one-time fix.</p><p><a href="https://arxiv.org/abs/2505.05410">Anthropic&#8217;s Chen et al. (May 2025)</a> found that reasoning models verbalized their use of reasoning hints in fewer than 20% of cases where they relied on those hints. Outcome-based RL initially improved faithfulness, then plateaued. Increased hint usage through reward hacking did not increase the propensity to disclose those hints. The model got better at using shortcuts while getting no better at revealing them.</p><p><a href="https://arxiv.org/abs/2507.11473">Korbak et al.&#8217;s multi-author paper from July 2025</a>, signed by over 40 researchers from multiple institutions, explicitly framed CoT monitorability as &#8220;a new and fragile opportunity for AI safety.&#8221; They warned that latent reasoning models, architectures that reason in continuous vector space rather than natural language, would eliminate even the partial transparency that current CoT provides.</p><p>Three independent research groups arrived at this conclusion: an interpretability lab at Goodfire, the model developer at OpenAI, and a cross-institutional safety coalition. All arrived at the same conclusion: chain-of-thought monitoring provides a signal in specific conditions, and those conditions don&#8217;t map cleanly to enterprise agentic AI deployments, where the majority of tasks fall below the difficulty threshold where monitoring remains reliable.</p><h2>Applying CARE: Fixing Your Chain-of-Thought Monitoring Gap</h2><p>The CARE framework (Create, Adapt, Run, Evolve) provides a structured path to address the reasoning theater problem without abandoning CoT monitoring wholesale.</p><p><strong>Create</strong> your threat model around the faithfulness gap. Classify every agentic AI workflow by task difficulty and determine which ones the model solves from in-weight knowledge versus genuine test-time reasoning. Recall-heavy tasks like KYC screening, transaction matching, and document classification carry high performativity risk. Multi-step analytical tasks like fraud investigation across multiple data sources and complex regulatory interpretation carry lower risk. Design your monitoring controls proportional to that classification.</p><p><strong>Adapt</strong> your monitoring stack to account for activation-level signals where architectures permit it. The Reasoning Theater research demonstrates that attention probes trained on model activations predict behavior far earlier and more accurately than CoT monitors for performative tasks. Every vendor offering agentic AI safety tooling should answer one direct question: does your monitoring access model internals, or does it only read output text? If the answer is output text only, you now know the limitation and the research that quantifies it.</p><p><strong>Run</strong> dual-layer monitoring in production. Keep CoT monitors active for genuine reasoning detection and inflection point identification. The research confirmed that backtracking, realizations, and reconsiderations in CoT traces correspond to authentic belief shifts, appearing more than twice as often in traces where probes showed genuine uncertainty (0.045 inflections per step in uncertain traces versus 0.020 in high-confidence performative traces for DeepSeek-R1). These inflection points remain reliable indicators of real internal computation. Layer behavioral monitoring (input/output validation, action-level controls, tool call verification) as the primary safety net for performative tasks. The CoT monitor becomes your second line for routine operations, not your first.</p><p><strong>Evolve</strong> your governance documentation to reflect the difficulty-dependent nature of CoT reliability. Update risk assessments as model capabilities change. Larger models and improved training methods shift the boundary between &#8220;easy&#8221; and &#8220;hard&#8221; tasks, changing where CoT monitoring remains effective. The August 2026 EU AI Act enforcement deadline adds urgency. Treat this as a moving target, because the research shows it is one.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kxXF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kxXF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 424w, https://substackcdn.com/image/fetch/$s_!kxXF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 848w, https://substackcdn.com/image/fetch/$s_!kxXF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 1272w, https://substackcdn.com/image/fetch/$s_!kxXF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kxXF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png" width="1456" height="3220" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3220,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1056417,&quot;alt&quot;:&quot;Flowchart showing the four CARE framework phases with specific actions for addressing chain-of-thought monitoring limitations in agentic AI deployments&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190378011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing the four CARE framework phases with specific actions for addressing chain-of-thought monitoring limitations in agentic AI deployments" title="Flowchart showing the four CARE framework phases with specific actions for addressing chain-of-thought monitoring limitations in agentic AI deployments" srcset="https://substackcdn.com/image/fetch/$s_!kxXF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 424w, https://substackcdn.com/image/fetch/$s_!kxXF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 848w, https://substackcdn.com/image/fetch/$s_!kxXF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 1272w, https://substackcdn.com/image/fetch/$s_!kxXF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfd39a-30fc-45c2-a014-a294eb7a3053_3307x7314.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: CARE Framework Response to Reasoning Theater</figcaption></figure></div><p><strong>Key Takeaway:</strong> Chain-of-thought monitoring provides genuine safety signal for hard reasoning tasks, but the majority of enterprise agentic AI workflows fall below the difficulty threshold where that signal remains reliable. Your governance framework needs to know the difference, and your next vendor evaluation needs to test for it.</p><h3>What to do next</h3><p>Download the Reasoning Theater paper and its interactive visualization tool at <a href="http://reasoning-theater.streamlit.app">reasoning-theater.streamlit.app</a>. Map your agentic AI workflows against the difficulty-dependent performativity findings. Bring this evidence to your next AI governance meeting, because the product team, legal counsel, and AI lead sitting across from you haven&#8217;t read it yet.</p><p>For more on building AI governance frameworks that survive contact with adversarial reality, explore the CARE framework at <a href="https://rockcyber.com">rockcyber.com</a>. Subscribe to <a href="https://rockcybermusings.com">RockCyber Musings</a> for more AI security and governance insights with the occasional rant.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 32 March 27-April 2, 2026]]></title><description><![CDATA[Anthropic's Worst Week, CISA's Busiest Friday, and the EU Still Wasn't Ready]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260327-20260402</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260327-20260402</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 03 Apr 2026 13:03:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GI_T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GI_T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GI_T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!GI_T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!GI_T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!GI_T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GI_T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/193018440?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GI_T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!GI_T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!GI_T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!GI_T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ea64fcd-437b-4e33-9512-7857b114e5ed_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Anthropic had a week that should be a case study in operational security failure for years to come. On March 31, a routine release packaging error exposed 500,000 lines of Claude Code source across roughly 2,000 files. Five days earlier, a CMS misconfiguration had already put nearly 3,000 unpublished internal documents into a public search index, including draft material describing their most capable model as posing &#8220;unprecedented cybersecurity risk.&#8221; By April 1, they were firing DMCA takedowns at 8,000 GitHub repositories, most unrelated to them, trying to unsee what the internet had already seen. By April 2, a congressman was writing to the CEO about national security.</p><p>That would have been enough for any week. It was not the only thing that happened. On March 27, CISA added two exploited AI infrastructure vulnerabilities to its KEV catalog; three LangChain and LangGraph CVEs hit disclosure, with 84 million downloads in scope; and the European Commission confirmed attackers had been inside their AWS account for three days. The thread connecting all of it is the same one it always is: AI deployment speed running ahead of the operational security discipline required to sustain it. This week was not an anomaly. It was a pattern. Patterns do not self-correct.</p><p>As a bonus, check out my <strong><a href="https://www.youtube.com/watch?v=091_b2qep9M">AI Cyber Magazine Podcast with Confidence Staveley</a></strong> during RSA.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260327-20260402?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260327-20260402?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3>1. Anthropic Leaked 500,000 Lines of Claude Code Source, Then Panicked on GitHub</h3><p>On March 31, a debugging file accidentally bundled into a routine Claude Code update exposed approximately 500,000 lines of source code across nearly 2,000 files (CNBC, Axios, Fortune). The codebase was mirrored across GitHub within hours. Leaked feature flags revealed unreleased capabilities: a persistent background agent, cross-device remote control, and session-to-session learning. Anthropic attributed the incident to &#8220;a release packaging issue caused by human error&#8221; and stated no customer data was exposed. On April 1, attempting to scrub the code from GitHub, Anthropic sent DMCA takedowns that hit approximately 8,000 repositories, most unrelated to the leak (TechCrunch, Bloomberg).</p><p><strong>Why it matters</strong></p><ul><li><p>Competitors received Anthropic&#8217;s unreleased feature roadmap. That strategic damage compounds the fact that this happened five days after the Mythos content leak. Coincidence???? I&#8217;ll let you decide.</p></li><li><p>The persistent background agent and remote control capabilities in the leaked code require explicit security design review before deployment. They were in development without prior public disclosure of the capability direction.</p></li><li><p>The DMCA sweep that caught 8,000 unrelated repositories shows what reactive incident response without a playbook looks like. Every remediation attempt created a new problem.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If you deploy Claude Code in your enterprise environment, review what access it holds to production systems and rotate any associated credentials until the full scope of the leak is confirmed.</p></li><li><p>Require software composition analysis (SCA) and release integrity verification as contractual terms with your AI vendors.</p></li><li><p>Develop a pre-incident legal response playbook that covers IP exposure scenarios, including proportional DMCA procedures that require scope confirmation before submission.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Two major operational security failures from the same company in five days. The first was a CMS misconfiguration. The second was a packaging error. Both are basic controls that mature security operations have solved. Anthropic markets itself on safety and trustworthiness, and that positioning is now doing work it was not designed to carry. The DMCA overcorrection made it worse: you leak 500,000 lines of source code, then fire automated takedown requests at 8,000 repositories, most of them unrelated to you. Every IP attorney will tell you DMCA takedowns require good faith and specificity. Have a process before the fire starts.</p><h3>2. Anthropic Accidentally Confirmed Its Most Capable Model Poses Unprecedented Cybersecurity Risk</h3><p>A configuration error in Anthropic&#8217;s content management system made nearly 3,000 unpublished assets publicly searchable starting around March 26, including draft blog posts for a model called Claude Mythos (Fortune, CoinDesk). Internal documents describe Mythos as capable of rapidly finding and exploiting software vulnerabilities at an unprecedented scale. Anthropic confirmed the model exists and is in testing with early-access customers, calling it &#8220;a step change&#8221; in capability. The company described the exposure as caused by a configuration error and stated the data store was secured after discovery.</p><p><strong>Why it matters</strong></p><ul><li><p>Anthropic&#8217;s own internal documentation, not a researcher&#8217;s estimate, describes this model as posing cybersecurity risks the industry has not seen before. That is the company&#8217;s self-assessment.</p></li><li><p>Early-access customer deployments were already underway before any public discussion of the risk profile occurred. The model shipped before the security conversation started.</p></li><li><p>A frontier model capable of autonomously finding and exploiting vulnerabilities at scale invalidates current vulnerability management timelines. That conversation needs to happen now.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Update your AI threat model to account for AI-assisted offensive operations at scale. This is not a future scenario. It is a current deployment.</p></li><li><p>Ask your AI vendors direct questions about internal capability assessments before your next contract renewal. What have they assessed, and when?</p></li><li><p>Document board and leadership awareness of frontier AI capability risk as a governance record item. Regulatory scrutiny on this topic will increase.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The model is called Mythos. The leaked internal docs describe the cybersecurity risk as unprecedented. Anthropic was already deploying it with customers before any of this became public. This happened not because of an attack but because someone left a CMS misconfigured. Anthropic has historically been conservative in capability claims. When their own internal documentation describes a model as different in kind from what came before, the security community should take that seriously, not because the word &#8220;unprecedented&#8221; is alarming on its own, but because the source is the organization that built the thing. They know what it does.</p><h3>3. ShinyHunters Breached the European Commission&#8217;s AWS Account</h3><p>The European Commission confirmed on March 27 that attackers accessed the AWS account hosting its <a href="http://Europa.eu">Europa.eu</a> websites, with the intrusion first detected on March 24 (TechCrunch, Bloomberg). Threat actor ShinyHunters claimed responsibility and alleged theft of more than 350GB of data including mail server exports, databases, confidential documents, and contracts. The Commission&#8217;s statement noted internal systems were unaffected and mitigation measures were applied quickly. Affected EU entities received notification.</p><p><strong>Why it matters</strong></p><ul><li><p>ShinyHunters has a documented history of monetizing stolen data through dark market sales. Even if the 350GB claim is exaggerated for leverage, policy documents and procurement contracts from the Commission&#8217;s web infrastructure are a counterintelligence asset.</p></li><li><p>The Commission enforces GDPR and is building the AI Act enforcement apparatus. Getting breached while standing up that apparatus is not a good governance signal.</p></li><li><p>AWS account-level compromise is full infrastructure compromise in practice. A managed cloud provider does not neutralize cloud account security failures.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your AWS account permission boundaries and review CloudTrail logs for anomalous patterns this week, not next quarter.</p></li><li><p>Ensure your incident response plan explicitly covers cloud account compromise. Traditional endpoint-focused plans miss this scenario entirely.</p></li><li><p>If any of your vendors are EU institutions or Commission contractors, treat procurement data exposure as a downstream supply chain risk and assess your exposure now.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The body enforcing Europe&#8217;s data protection framework had its AWS account cracked. Governance credentials do not equal security maturity. Write the most thorough AI regulation in the world. Your cloud IAM configuration remains a disaster until someone fixes it. The ShinyHunters 350GB claim needs forensic verification before anyone draws conclusions about scope, but three days of undetected access to the official Commission infrastructure doesn&#8217;t need verification. The institutions asking private sector organizations to demonstrate AI security maturity owe the market some transparency on their own failures. Name it, fix it, move on.</p><h3>4. Your AI Workflow Tool Got CISA&#8217;s Attention: Langflow CVE-2026-33017</h3><p>CISA added CVE-2026-33017, a critical remote code execution flaw in Langflow, to its Known Exploited Vulnerabilities catalog on March 26. Attackers began scanning for exposed instances roughly 20 hours after the advisory publication, with exploitation scripts appearing within 21 hours and active .env and .db file harvesting beginning within 24 hours (Sysdig, BleepingComputer, Help Net Security). The vulnerability carries a CVSS score of 9.3 and allows unauthenticated attackers to inject arbitrary Python code through the public flow build endpoint with no sandboxing applied. Federal agencies face an April 8 remediation deadline. Upgrade to Langflow version 1.9.0 or later.</p><p><strong>Why it matters</strong></p><ul><li><p>Langflow is used to build and deploy LLM pipelines. Remote code execution in a workflow orchestration tool gives an attacker control over the AI&#8217;s inputs, outputs, and the credentials it holds.</p></li><li><p>The 20-hour exploitation window is increasingly standard for high-severity flaws. The concept of a patch window measured in days is no longer realistic for internet-exposed AI infrastructure.</p></li><li><p>.env file harvesting is the attacker&#8217;s first move because those files contain API keys for LLMs, vector databases, and cloud services the workflow connects to.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If Langflow runs on any internet-accessible host, treat the environment as potentially compromised and rotate all associated credentials before patching.</p></li><li><p>Segment AI workflow orchestration platforms behind authentication and network controls. These tools have no business being directly internet-accessible.</p></li><li><p>Verify Langflow version across your environment immediately. Anything prior to 1.9.0 is an open liability.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The 20-hour exploitation timeline should reframe your vulnerability management program. That program was designed when you had days or weeks to act. That era closed. CISA&#8217;s KEV catalog is now your minimum viable patch priority list, and if you are not at sub-72-hour remediation SLAs for critical AI infrastructure, you are already behind. Organizations still describing AI workflow platforms as &#8220;internal tools&#8221; need a rethink. Internal tools with LLM API keys, cloud credentials, and production data connections are not internal in any meaningful threat model. An attacker who executes code in your Langflow environment has lateral movement access to every system that environment touches.</p><h3>5. LangChain and LangGraph: Three CVEs, 84 Million Downloads Exposed</h3><p>Cyera security researcher Vladimir Tokarev disclosed three vulnerabilities in LangChain and LangGraph on March 27, each covering a different attack path against the same enterprise AI framework (The Hacker News). CVE-2026-34070 (CVSS 7.5) enables path traversal to arbitrary files through manipulated prompt templates. CVE-2025-68664 (CVSS 9.3) allows extraction of API keys and environment secrets through unsafe deserialization. CVE-2025-67644 (CVSS 7.3) enables SQL injection in LangGraph&#8217;s SQLite checkpoint layer. LangChain, LangChain-Core, and LangGraph collectively logged over 84 million downloads. Patches are available: LangChain Core 1.2.22+, LangChain-Core 0.3.81+ or 1.2.5+, and LangGraph checkpoint sqlite 3.0.1+.</p><p><strong>Why it matters</strong></p><ul><li><p>These three CVEs cover filesystem data, environment secrets, and conversation history in combination. Together, they represent near-total information exposure for any application built on these frameworks.</p></li><li><p>The 84 million download count means a significant portion of enterprise AI applications are affected. Most organizations do not know which AI frameworks their development teams selected.</p></li><li><p>CVE-2025-68664 with its 9.3 CVSS is the most critical. Unsafe deserialization is a well-understood, pervasive, and reliably exploitable class of vulnerability.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every AI framework in your environment, including those embedded in third-party tools. Do not rely on developers to self-report what they are using.</p></li><li><p>Apply the three patches and validate versions before the end of the business week.</p></li><li><p>Assess what data your LangChain-based applications can access and treat those data stores as potentially exposed pending patch confirmation.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Three vulnerability classes in the same framework, covering three categories of sensitive enterprise data, were disclosed in one report. That&#8217;s what happens when you build for speed and bolt on security later. AI framework developers made that choice repeatedly, and this week&#8217;s CVE list is the invoice. LangChain is the jQuery of AI development right now. It is in everything, often without explicit organizational approval. Your AI security posture includes every dependency your developers pulled in without telling you. Get ahead of that inventory problem before the next disclosure.</p><h3>6. A Congressman Put Anthropic on Notice Over National Security</h3><p>Rep. Josh Gottheimer (D-N.J.) sent a letter to Anthropic CEO Dario Amodei on April 2, citing national security concerns arising from the source code leak (Axios, The Hill). Gottheimer&#8217;s letter noted that Claude is embedded in defense and intelligence operations, raised the prior CCP-backed group intrusion against Claude, and expressed concern that Mythos could enable more sophisticated cyberattacks against the United States. The letter also flagged Anthropic&#8217;s decision in late February to remove its binding commitment to halt model development if safety capabilities fall behind, replacing it with &#8220;nonbinding but publicly-declared&#8221; goals.</p><p><strong>Why it matters</strong></p><ul><li><p>Federal agencies and defense contractors use Claude operationally. A source code leak followed by a congressional inquiry is a vendor risk event, not a PR problem. Your GRC process should treat it as such.</p></li><li><p>Removing the binding safety commitment is a substantive policy change that the congressional record now documents. The enforceability question will follow Anthropic through every future regulatory discussion.</p></li><li><p>Gottheimer sits on the House Intelligence Committee. This is not a throwaway letter. It is a first-stage oversight action that signals more to come.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Review your vendor risk assessment for any AI provider with confirmed government contracts. Congressional inquiries are material third-party risk events.</p></li><li><p>Establish a direct communication channel with your AI vendors&#8217; enterprise security teams and request formal notification procedures for any government inquiries affecting their products.</p></li><li><p>Track the congressional record regarding Anthropic&#8217;s rollback of its safety commitment. It will surface again in budget and procurement cycles.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The safety commitment rollback from February is the most substantive issue in that letter. Anthropic replaced a binding pledge to pause development if safety fell behind with goals they grade themselves on. That is not a small change. That is the foundational accountability mechanism that distinguished their positioning from competitors, and they quietly removed it. Congressional scrutiny was predictable the moment they became embedded in national security operations. The question I would ask directly is how many federal agency customers received notification about the source code exposure before it hit the press. I would guess the answer is uncomfortable.</p><h3>7. Your Security Scanner Was the Supply Chain Attack: Trivy CVE-2026-33634</h3><p>CISA added CVE-2026-33634 to its Known Exploited Vulnerabilities catalog on March 27 (Help Net Security, Aquasecurity GitHub advisory). Attackers compromised the Trivy container security scanner on March 19, using stolen credentials to publish a malicious v0.69.4 release and force-push 76 of 77 version tags in the trivy-action repository with credential-stealing malware. The attack triggered a downstream LiteLLM supply chain compromise via poisoned PyPI packages. Federal agencies face an April 9 deadline. Root cause was non-atomic credential rotation on March 1 left a valid token exposed during the rotation window.</p><p><strong>Why it matters</strong></p><ul><li><p>Trivy is a default security tool in CI/CD pipelines across the industry. Compromising the scanner means attackers access the same environment credentials the security scan was meant to protect.</p></li><li><p>Force-pushing 76 version tags is a comprehensive compromise. Any pipeline that pins to mutable major or minor version tags rather than specific commit hashes was exposed.</p></li><li><p>The downstream LiteLLM PyPI compromise extends the blast radius into Python environments running LLM application code. The supply chain damage propagated well beyond the initial tool compromise.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit every CI/CD pipeline for trivy-action or setup-trivy at mutable version tags and pin to specific commit hashes immediately.</p></li><li><p>Treat any environment that ran a compromised Trivy version since March 19 as potentially credential-compromised. Rotate all associated tokens, SSH keys, and cloud credentials.</p></li><li><p>Apply this lesson to every security tool in your pipeline. Security tooling supply chains are higher-value targets than application code supply chains.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The attacker turned the vulnerability scanner into the vulnerability. That is the platonic ideal of a supply chain attack: targeting organizations that care about security and embed security tooling in their build pipelines. The more security-conscious your culture, the higher your Trivy adoption, and the more exposed you were. The non-atomic credential rotation is the root cause. Aquasecurity rotated credentials on March 1 but did not revoke all tokens simultaneously. The attacker grabbed freshly rotated secrets during the window between invalidation and deployment. If your own rotation procedures have a gap between &#8220;revoke old&#8221; and &#8220;confirm new is live,&#8221; that gap is your exposure. Run your playbooks against that question this week.</p><h3>8. The State AI Chatbot Safety Wave Is Not Waiting for Washington</h3><p>Georgia&#8217;s state senate voted to concur in the House-amended version of SB 540 during the week of March 27, sending the chatbot disclosure and minor-protection bill to Governor Kemp&#8217;s desk (Troutman Privacy, Transparency Coalition). Idaho&#8217;s S 1297 passed its full legislature and advanced to Governor Little. Both are chatbot safety measures. Georgia&#8217;s bill requires disclosure every three hours for adult users and every hour for minors, along with explicit suicide and self-harm response protocols for conversational AI services. The Future of Privacy Forum&#8217;s tracker now counts 78 AI chatbot safety bills moving across 27 states in 2026.</p><p><strong>Why it matters</strong></p><ul><li><p>Disclosure, minor safety, and mental health response requirements are becoming the regulatory floor across state jurisdictions. Organizations operating consumer-facing AI products need a 50-state tracking capability, not a wait-and-see approach.</p></li><li><p>Hourly disclosure requirements for minors are not trivial to implement for many chatbot architectures. The compliance engineering work should start now.</p></li><li><p>Seventy-eight bills across 27 states mean that any federal preemption framework, if one ever arrives, faces an already established patchwork of state obligations to reconcile.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your consumer AI products against chatbot disclosure requirements in every state where users reside. Georgia and Idaho represent the floor, not the ceiling.</p></li><li><p>Assess your chatbot&#8217;s existing mental health response protocols against the Georgia requirement specifics. A disclaimer is not compliant.</p></li><li><p>Assign someone accountable for multi-state AI governance tracking. This is not a future compliance problem.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Washington cannot pass a federal AI framework. States can. Fifty legislatures with different requirements and different timelines is the compliance nightmare that preemption was supposed to prevent. It didn&#8217;t. Georgia&#8217;s hourly minor disclosure requirement is specific, implementable, and enforceable. State legislatures are producing more actionable compliance requirements than most federal guidance I have seen this year. If you deploy consumer AI products and you don&#8217;t have someone accountable for multi-state AI governance tracking today, that gap closes before Q3 or it closes you.</p><h3>9. The EU AI Act Has an Enforcement Problem, and Nobody Is Talking About It Honestly</h3><p>As of late March, only 8 of 27 EU member states had designated the single contact points required for national enforcement coordination under the AI Act, according to the European Parliament Think Tank&#8217;s enforcement analysis (Tech Policy Press, IAPP). The Digital Omnibus proposal, with negotiating positions adopted by Parliament&#8217;s IMCO and LIBE committees on March 18, would push high-risk AI compliance deadlines to December 2027 for Annex III systems and to August 2028 for Annex I systems, compared with the original August 2026 deadline. The European Commission also missed its own deadline for issuing guidance on high-risk AI systems. Trilogue negotiations between Council, Parliament, and Commission are now underway.</p><p><strong>Why it matters</strong></p><ul><li><p>Approximately 70% of EU member states are not operationally ready for AI Act enforcement. Regulations without enforcement infrastructure are aspirational documents.</p></li><li><p>The 16-month delay in high-risk requirements gives organizations breathing room on paper while creating uncertainty about what compliance standard they are being held to during the gap.</p></li><li><p>The Commission missing its own implementation guidance deadline sets a poor precedent for holding private sector organizations to their compliance timelines.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Do not use the delay as a license to defer governance program work. The underlying obligations have not changed in substance. Build the program now and own it.</p></li><li><p>Review the Digital Omnibus amendments specifically for changes to the high-risk AI system definition. Legislative simplification sometimes reclassifies systems in ways that alter the scope of compliance.</p></li><li><p>Subscribe to IAPP&#8217;s EU AI Act tracker for updates on the trilogue outcome. The final text will differ from both Council and Parliament positions.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Eight out of 27 enforcement bodies are operational as the Act&#8217;s first major deadlines approach. The Commission missed its own implementation guidance deadline. The most substantive AI governance framework on the planet is running on infrastructure that is not ready to enforce it. The delay does not invalidate the regulation. Organizations that build genuine AI risk management programs now will be positioned for whatever enforcement timeline materializes. Organizations that chase the deadline and treat compliance as documentation will be exposed when the enforcement machinery catches up. That gap grows wider every quarter.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h3>NVIDIA and Johns Hopkins Gave You a Blueprint for Defending AI Agents Against Prompt Injection</h3><p>Researchers from NVIDIA and Johns Hopkins University published &#8220;Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks&#8221; on March 31 (<a href="https://arxiv.org/abs/2603.30016">ArXiv 2603.30016</a>). The paper addresses how AI agents are vulnerable not to direct attacks on the model but to malicious instructions embedded in data the agent processes during task execution. The authors articulate three architectural positions. First, agents in dynamic environments need dynamic replanning with security policy updates built into the replanning loop. Second, security decisions requiring contextual judgment should still involve LLMs, but only within system designs that strictly constrain what the model can observe and decide. Third, ambiguous situations should treat human interaction as a core design consideration, not an edge case to minimize.</p><p><strong>Why it matters</strong></p><ul><li><p>This paper frames indirect prompt injection as an architectural problem, not a model alignment problem. You cannot align your way out of it. You design it out or you accept the risk.</p></li><li><p>The principle of strictly constraining what the model can observe and decide has immediate practical application as your primary defense lever, more effective than filtering or detection approaches.</p></li><li><p>The human oversight design principle directly contradicts how most agentic deployments are being built, with human review treated as friction to reduce rather than a security control to preserve.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read the paper. At 12 pages, it is short enough to share with your AI architects and security engineers before the next deployment review meeting.</p></li><li><p>Audit any agentic AI system currently in your environment against the observation scope and decision authority questions. Broad scope plus broad authority equals your highest-risk deployment.</p></li><li><p>Make human oversight an explicit design requirement in your AI agent security standards. Document the specific conditions under which an agent must pause and request human authorization.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Nobody outside the AI security research community covered this paper. That is precisely why it belongs here. The breach reports get attention. The architecture guidance that would prevent the next breach sits on ArXiv with a few hundred downloads. I have been arguing at <a href="https://www.rockcyber.com">RockCyber</a> for two years that agentic AI security is an architecture problem. You do not solve it with better prompts or stronger models. You solve it with privilege constraints, observation scope limits, and honest human oversight design. NVIDIA and Johns Hopkins gave you a 12-page framework for that conversation. If your next AI agent deployment review does not address these three principles, you are building exposure, not capability.</p><p>&#128073; For ongoing analysis of agentic AI governance frameworks, the conversation continues at <strong><a href="https://rockcybermusings.com">RockCyber Musings</a></strong>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260327-20260402?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! This post is public, so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260327-20260402?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260327-20260402?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Axios. (2026, March 31). Anthropic leaked its own Claude source code. <a href="https://www.axios.com/2026/03/31/anthropic-leaked-source-code-ai">https://www.axios.com/2026/03/31/anthropic-leaked-source-code-ai</a></p><p>Axios. (2026, April 2). Exclusive: Gottheimer presses Anthropic on source code leaks and safety protocols. <a href="https://www.axios.com/2026/04/02/gottheimer-anthropic-source-code-leaks">https://www.axios.com/2026/04/02/gottheimer-anthropic-source-code-leaks</a></p><p>BleepingComputer. (2026, March 27). CISA: New Langflow flaw actively exploited to hijack AI workflows. <a href="https://www.bleepingcomputer.com/news/security/cisa-new-langflow-flaw-actively-exploited-to-hijack-ai-workflows/">https://www.bleepingcomputer.com/news/security/cisa-new-langflow-flaw-actively-exploited-to-hijack-ai-workflows/</a></p><p>Bloomberg. (2026, March 27). European Commission&#8217;s data stolen in hack on AWS account. <a href="https://www.bloomberg.com/news/articles/2026-03-27/european-commission-s-data-stolen-in-hack-on-aws-account">https://www.bloomberg.com/news/articles/2026-03-27/european-commission-s-data-stolen-in-hack-on-aws-account</a></p><p>Bloomberg. (2026, April 1). Anthropic takes down thousands of GitHub repos trying to yank its leaked source code. <a href="https://www.bloomberg.com/news/articles/2026-04-01/anthropic-scrambles-to-address-leak-of-claude-code-source-code">https://www.bloomberg.com/news/articles/2026-04-01/anthropic-scrambles-to-address-leak-of-claude-code-source-code</a></p><p>CNBC. (2026, March 31). Anthropic leaks part of Claude Code&#8217;s internal source code. <a href="https://www.cnbc.com/2026/03/31/anthropic-leak-claude-code-internal-source.html">https://www.cnbc.com/2026/03/31/anthropic-leak-claude-code-internal-source.html</a></p><p>CoinDesk. (2026, March 27). Anthropic&#8217;s massive Claude Mythos leak reveals a new AI model that could be a cybersecurity nightmare. <a href="https://www.coindesk.com/markets/2026/03/27/anthropic-s-massive-claude-mythos-leak-reveals-a-new-ai-model-that-could-be-a-cybersecurity-nightmare">https://www.coindesk.com/markets/2026/03/27/anthropic-s-massive-claude-mythos-leak-reveals-a-new-ai-model-that-could-be-a-cybersecurity-nightmare</a></p><p>Fortune. (2026, March 27). Anthropic accidentally leaked details of a new AI model that poses unprecedented cybersecurity risks. <a href="https://fortune.com/2026/03/27/anthropic-leaked-ai-mythos-cybersecurity-risk/">https://fortune.com/2026/03/27/anthropic-leaked-ai-mythos-cybersecurity-risk/</a></p><p>Fortune. (2026, March 31). Anthropic leaks its own AI coding tool&#8217;s source code in second major security breach. <a href="https://fortune.com/2026/03/31/anthropic-source-code-claude-code-data-leak-second-security-lapse-days-after-accidentally-revealing-mythos/">https://fortune.com/2026/03/31/anthropic-source-code-claude-code-data-leak-second-security-lapse-days-after-accidentally-revealing-mythos/</a></p><p>Help Net Security. (2026, March 27). CISA sounds alarm on Langflow RCE, Trivy supply chain compromise after rapid exploitation. <a href="https://www.helpnetsecurity.com/2026/03/27/cve-2026-33017-cve-2026-33634-exploited/">https://www.helpnetsecurity.com/2026/03/27/cve-2026-33017-cve-2026-33634-exploited/</a></p><p>Help Net Security. (2026, March 30). Second data breach at European Commission this year leaves open questions over resilience. <a href="https://www.helpnetsecurity.com/2026/03/30/european-commission-cyberattack-cloud-infrastructure-website/">https://www.helpnetsecurity.com/2026/03/30/european-commission-cyberattack-cloud-infrastructure-website/</a></p><p>IAPP. (2026). European Commission misses deadline for AI Act guidance on high-risk systems. <a href="https://iapp.org/news/a/european-commission-misses-deadline-for-ai-act-guidance-on-high-risk-systems">https://iapp.org/news/a/european-commission-misses-deadline-for-ai-act-guidance-on-high-risk-systems</a></p><p>IAPP. (2026, March). EU Digital Omnibus: Analysis of key changes. <a href="https://iapp.org/news/a/eu-digital-omnibus-analysis-of-key-changes">https://iapp.org/news/a/eu-digital-omnibus-analysis-of-key-changes</a></p><p>Qualys ThreatPROTECT. (2026, March 26). CISA Added Langflow Vulnerability to its Known Exploited Vulnerabilities Catalog (CVE-2026-33017). <a href="https://threatprotect.qualys.com/2026/03/26/cisa-added-langflow-vulnerability-to-its-known-exploited-vulnerabilities-catalog-cve-2026-33017/">https://threatprotect.qualys.com/2026/03/26/cisa-added-langflow-vulnerability-to-its-known-exploited-vulnerabilities-catalog-cve-2026-33017/</a></p><p>SecurityAffairs. (2026, March 27). The European Commission confirmed a cyberattack affecting part of its cloud systems. <a href="https://securityaffairs.com/190067/data-breach/the-european-commission-confirmed-a-cyberattack-affecting-part-of-its-cloud-systems.html">https://securityaffairs.com/190067/data-breach/the-european-commission-confirmed-a-cyberattack-affecting-part-of-its-cloud-systems.html</a></p><p>Sysdig. (2026, March 27). CVE-2026-33017: How attackers compromised Langflow AI pipelines in 20 hours. <a href="https://www.sysdig.com/blog/cve-2026-33017-how-attackers-compromised-langflow-ai-pipelines-in-20-hours">https://www.sysdig.com/blog/cve-2026-33017-how-attackers-compromised-langflow-ai-pipelines-in-20-hours</a></p><p>TechCrunch. (2026, March 27). European Commission confirms cyberattack after hackers claim data breach. <a href="https://techcrunch.com/2026/03/27/european-commission-confirms-cyberattack-after-hackers-claim-data-breach/">https://techcrunch.com/2026/03/27/european-commission-confirms-cyberattack-after-hackers-claim-data-breach/</a></p><p>TechCrunch. (2026, April 1). Anthropic took down thousands of GitHub repos trying to yank its leaked source code. <a href="https://techcrunch.com/2026/04/01/anthropic-took-down-thousands-of-github-repos-trying-to-yank-its-leaked-source-code-a-move-the-company-says-was-an-accident/">https://techcrunch.com/2026/04/01/anthropic-took-down-thousands-of-github-repos-trying-to-yank-its-leaked-source-code-a-move-the-company-says-was-an-accident/</a></p><p>The Hacker News. (2026, March 27). LangChain, LangGraph flaws expose files, secrets, databases in widely used AI frameworks. <a href="https://thehackernews.com/2026/03/langchain-langgraph-flaws-expose-files.html">https://thehackernews.com/2026/03/langchain-langgraph-flaws-expose-files.html</a></p><p>The Hill. (2026, April 2). House Democrat pushes Anthropic on safety protocols, source code leak. <a href="https://thehill.com/policy/technology/5812881-gottheimer-presses-anthropic-ai-safety/">https://thehill.com/policy/technology/5812881-gottheimer-presses-anthropic-ai-safety/</a></p><p>Tech Policy Press. (2026). EU&#8217;s AI Act delays let high-risk systems dodge oversight. <a href="https://www.techpolicy.press/eus-ai-act-delays-let-highrisk-systems-dodge-oversight/">https://www.techpolicy.press/eus-ai-act-delays-let-highrisk-systems-dodge-oversight/</a></p><p>Transparency Coalition. (2026, March 27). AI legislative update: March 27, 2026. <a href="https://www.transparencycoalition.ai/news/ai-legislative-update-march27-2026">https://www.transparencycoalition.ai/news/ai-legislative-update-march27-2026</a></p><p>Troutman Pepper Locke. (2026, March 30). Proposed state AI law update: March 30, 2026. <a href="https://www.troutmanprivacy.com/2026/03/proposed-state-ai-law-update-march-30-2026/">https://www.troutmanprivacy.com/2026/03/proposed-state-ai-law-update-march-30-2026/</a></p><p>Aquasecurity. (2026). Trivy ecosystem supply chain temporarily compromised [GitHub Security Advisory GHSA-69fq-xp46-6x23]. <a href="https://github.com/aquasecurity/trivy/security/advisories/GHSA-69fq-xp46-6x23">https://github.com/aquasecurity/trivy/security/advisories/GHSA-69fq-xp46-6x23</a></p><p>European Parliament Think Tank. (2026, March 18). Enforcement of the AI Act. <a href="https://epthinktank.eu/2026/03/18/enforcement-of-the-ai-act/">https://epthinktank.eu/2026/03/18/enforcement-of-the-ai-act/</a></p><p>Jiang, Z., et al. (2026, March 31). Architecting secure AI agents: Perspectives on system-level defenses against indirect prompt injection attacks [Preprint]. ArXiv. <a href="https://arxiv.org/abs/2603.30016">https://arxiv.org/abs/2603.30016</a></p>]]></content:encoded></item><item><title><![CDATA[AI Monitoring Is a Standards Problem, Not a Technology Problem]]></title><description><![CDATA[NIST AI 800-4 proves AI monitoring fails from missing standards, not missing tech. Specific actions CISOs should take before EU AI Act Article 72 hits August 2026.]]></description><link>https://www.rockcybermusings.com/p/ai-monitoring-standards-gap-nist-ai-800-4</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/ai-monitoring-standards-gap-nist-ai-800-4</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 31 Mar 2026 12:50:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!c_2d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c_2d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c_2d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c_2d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!c_2d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!c_2d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c_2d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3060907,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/192386096?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c_2d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c_2d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!c_2d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!c_2d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8b894d-0ef9-41d5-ac90-daa69ba1bfeb_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>NIST just published an admission that nobody knows how to monitor AI systems after deployment. NIST AI 800-4, &#8220;Challenges to the Monitoring of Deployed AI Systems,&#8221; reviews findings from three workshops, 250+ experts, and almost 90 research papers. The document catalogs over 30 distinct challenges. It offers zero solutions. That&#8217;s not a criticism. That&#8217;s the diagnosis, and that should raise your spidey senses.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/ai-monitoring-standards-gap-nist-ai-800-4?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/ai-monitoring-standards-gap-nist-ai-800-4?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2>NIST Mapped the Mess</h2><p>The report organizes post-deployment AI monitoring into six categories:</p><ol><li><p>Functionality (does it still work as intended?)</p></li><li><p>Operational (does the infrastructure hold?)</p></li><li><p>Human Factors (is it transparent and useful to humans?)</p></li><li><p>Security (is it defended against attacks?)</p></li><li><p>Compliance (does it meet regulatory requirements?)</p></li><li><p>Large-Scale Impacts (does it promote human flourishing?)</p></li></ol><p>Each category carries its own distinct challenges. Functionality monitoring suffers from a lack of ground-truth datasets and a lack of a reliable way to detect model drift. Operational monitoring struggles with fragmented logging across distributed infrastructure. Human Factors monitoring, which drew more practitioner attention than any other category in the workshops, remains almost entirely unstudied in the literature. Security monitoring faces the unsettling reality that some models appear to detect when they&#8217;re being evaluated, changing their behavior under observation. Compliance monitoring lacks even basic tracking of terms-of-service violations, including downstream fine-tuning of open models for CSAM generation. Large-Scale Impacts monitoring lacks agreed-upon metrics to measure whether AI systems help or harm people at scale.</p><p>That&#8217;s a lot of individual problems. The question is whether they share a common root cause.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lyFV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lyFV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 424w, https://substackcdn.com/image/fetch/$s_!lyFV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 848w, https://substackcdn.com/image/fetch/$s_!lyFV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 1272w, https://substackcdn.com/image/fetch/$s_!lyFV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lyFV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png" width="1456" height="489" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:489,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:372741,&quot;alt&quot;:&quot;Flowchart showing five cross-cutting monitoring challenges identified by NIST AI 800-4 converging on a missing standards layer as the common root cause&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/192386096?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing five cross-cutting monitoring challenges identified by NIST AI 800-4 converging on a missing standards layer as the common root cause" title="Flowchart showing five cross-cutting monitoring challenges identified by NIST AI 800-4 converging on a missing standards layer as the common root cause" srcset="https://substackcdn.com/image/fetch/$s_!lyFV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 424w, https://substackcdn.com/image/fetch/$s_!lyFV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 848w, https://substackcdn.com/image/fetch/$s_!lyFV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 1272w, https://substackcdn.com/image/fetch/$s_!lyFV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc03c3c9b-daf1-4081-8f07-392aa245e745_4675x1570.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: NIST AI 800-4 Cross-Cutting Challenges</figcaption></figure></div><h2>The Root Cause NIST Documented Without Naming</h2><p>Read the cross-cutting challenges section carefully. Five categories of barriers span every monitoring type: </p><ol><li><p>No trusted methods and tools</p></li><li><p>Poor visibility and transparency</p></li><li><p>Pace of change</p></li><li><p>Organizational incentive failures</p></li><li><p>Resource constraints</p></li></ol><p>Strip away the academic framing, and a pattern emerges. Workshop attendees were asking questions that belong in a standards body, not a research lab.</p><p>One attendee called for &#8220;an abstraction layer for universal security and monitoring.&#8221; Others asked, &#8220;What does the information sharing of what&#8217;s measured look like up and down the value chain?&#8221; Multiple participants flagged the absence of common metrics across use cases, noting that &#8220;non-standardized logic for generating metrics across use cases prevents us from building easy platform capabilities for monitoring.&#8221;</p><p>It&#8217;s important to point out that not every challenge NIST documented is a standards problem. Detecting deceptive behavior in models that modify their behavior under observation remains an open research problem. No specification can fix it because nobody knows how to do it reliably yet. Human-AI feedback loops are an understudied science. Ground-truth dataset availability is a data and methodology problem. The field faces three categories of challenge simultaneously: standards gaps (metrics, logging formats, reporting schemas), research gaps (deceptive behavior detection, feedback loop dynamics), and adoption gaps (methods exist in adjacent fields but aren&#8217;t applied to AI).</p><p>The standards layer is the prerequisite that makes progress on the other two categories possible. Without common definitions, you can&#8217;t scale research findings into production monitoring. Without shared schemas, adoption of proven methods stays trapped inside individual vendor implementations. Take deception detection as an example. You can&#8217;t begin researching whether a model&#8217;s stated reasoning matches its actual behavior unless you&#8217;re capturing structured reasoning traces alongside action logs in the first place. The research gap depends on closing the standards gap.</p><h2>You&#8217;ve Seen This Movie Before</h2><p>How did this work out for us in cybersecurity? We&#8217;ve had a 20-year head start on this exact problem.</p><p>Before syslog standardization, every network device vendor shipped its own logging format. Security teams drowned in data they couldn&#8217;t correlate. Firewalls from one vendor produced logs that meant nothing to the SIEM built for another vendor&#8217;s format. Every firewall had monitoring, but none of them spoke the same language.</p><p>The fix wasn&#8217;t a better firewall. It was CEF (Common Event Format), then LEEF (Log Event Extended Format), and now OCSF (Open Cybersecurity Schema Framework). Common schemas let security teams correlate events across vendors, build cross-platform detection rules, and operate SOCs that don&#8217;t require a translator for each data source. The technology didn&#8217;t change. The standards layer underneath made the existing technology useful at scale.</p><p>The AI monitoring equivalent would need agent-specific semantic conventions built on the observability infrastructure enterprises already operate. Not a new standard competing with OpenTelemetry. Extensions to OpenTelemetry that understand agent reasoning steps, tool calls, and multi-agent handoffs. Security events are mapped to schemas that flow into existing SIEMs without custom parsers. The pattern is identical: don&#8217;t build a parallel universe of AI-specific tooling. Extend the standards that security teams already trust.</p><p>AI monitoring is stuck in the pre-syslog era. Every platform defines its own metrics, its own log structures, its own alert taxonomies. If your organization runs AI workloads across three cloud providers and two agent frameworks, you operate five separate monitoring stacks that don&#8217;t talk to each other.</p><p>Here&#8217;s what that looks like in practice. A regional bank deploys a customer-facing loan origination model hosted on one cloud provider&#8217;s ML platform. The model calls a third-party credit scoring API. A separate vendor supplies the fairness monitoring layer. The bank&#8217;s compliance team uses an internal dashboard that pulls from the cloud provider&#8217;s native monitoring. When the credit scoring API updates its model without notification, the loan origination model starts producing subtly different risk scores. Approval rates for one demographic bracket shift by 4% over six weeks. The fairness monitoring vendor&#8217;s tool flags a drift alert using its own proprietary metric. The cloud provider&#8217;s native monitoring shows no anomaly because its baseline was never calibrated against the third-party API&#8217;s output distribution. The compliance dashboard, which aggregates data from both sources, shows conflicting signals that the compliance analyst can&#8217;t reconcile because the two tools define &#8220;drift&#8221; differently, measure it on different time windows, and log it in incompatible formats.</p><p>Nobody in that chain did anything wrong individually. The fairness vendor&#8217;s tool worked as designed. The cloud provider&#8217;s monitoring worked as designed. The gap was structural. There was no shared definition of what &#8220;drift&#8221; means across the pipeline, no common logging schema that would let the compliance team correlate events from two different monitoring tools, and no standardized way for the credit scoring API provider to notify downstream consumers of model updates.</p><p>That scenario plays out today in financial services, healthcare, and any sector that assembles AI capabilities from multiple vendors. NIST AI 800-4 confirmed it with receipts from 250 practitioners saying the same thing in different words.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6a20!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6a20!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 424w, https://substackcdn.com/image/fetch/$s_!6a20!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 848w, https://substackcdn.com/image/fetch/$s_!6a20!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 1272w, https://substackcdn.com/image/fetch/$s_!6a20!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6a20!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png" width="1456" height="967" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:967,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1081734,&quot;alt&quot;:&quot;Timeline showing regulatory monitoring requirements from EU AI Act and NIST AI RMF against the current maturity of monitoring standards&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/192386096?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Timeline showing regulatory monitoring requirements from EU AI Act and NIST AI RMF against the current maturity of monitoring standards" title="Timeline showing regulatory monitoring requirements from EU AI Act and NIST AI RMF against the current maturity of monitoring standards" srcset="https://substackcdn.com/image/fetch/$s_!6a20!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 424w, https://substackcdn.com/image/fetch/$s_!6a20!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 848w, https://substackcdn.com/image/fetch/$s_!6a20!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 1272w, https://substackcdn.com/image/fetch/$s_!6a20!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd225a52-68a5-492f-b1ad-66e68cc28c9b_6900x4582.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: The Monitoring Standards Gap</figcaption></figure></div><h2>Article 72 Is Already Undeliverable</h2><p>Regulators aren&#8217;t waiting for standards to mature. The EU AI Act&#8217;s high-risk system obligations take effect August 2, 2026 (if the aren&#8217;t delayed). Article 72 requires providers of high-risk AI systems to implement post-market monitoring plans that &#8220;actively and systematically collect, document and analyse relevant data&#8221; on system performance throughout the system&#8217;s lifetime. Deployers face separate obligations to monitor operations and report serious incidents within 72-hour and 15-day windows.</p><p>Pull one thread, and the gap becomes specific. Article 72 requires providers to collect performance data &#8220;throughout their lifetime&#8221; and evaluate &#8220;continuous compliance.&#8221; NIST AI 800-4 documents that practitioners lack standardized performance metrics, can&#8217;t establish baselines or deviation thresholds, and have no systematic way to compare model behavior across providers. One workshop attendee put it bluntly: &#8220;It&#8217;s often unclear what exactly to monitor and how.&#8221; The report cites research confirming that &#8220;the appropriate metrics to capture is not standardized in the AI community&#8221; and warns this &#8220;absence can result in misleading performance measures.&#8221;</p><p>That&#8217;s not a general compliance gap. Article 72 requires continuous collection and analysis of performance data. NIST AI 800-4 confirms that the field hasn&#8217;t agreed on what &#8220;performance&#8221; means in post-deployment contexts, let alone how to measure it consistently across different AI systems and providers. The regulation demands an activity that is structurally undeliverable with the current monitoring ecosystem. Organizations filing post-market monitoring plans in 2026 will document processes built on unstandardized metrics, non-interoperable tools, and self-defined baselines. They&#8217;ll comply on paper. The monitoring itself won&#8217;t be comparable, auditable, or meaningful across organizational boundaries.</p><p>Compliance requires two capabilities this ecosystem lacks: runtime hooks that produce monitoring data in standardized formats, and trace architectures that reconstruct decision chains across organizational boundaries. Without these, Article 72 post-market monitoring plans are fiction written in incompatible vendor dialects.</p><p>NIST&#8217;s own AI Risk Management Framework compounds the pressure. The MANAGE function calls for continuous monitoring and risk response throughout deployment. The forthcoming NIST Cyber AI Profile maps cybersecurity controls to AI-specific concerns like model integrity and adversarial robustness. Every framework converges on the same expectation. The implementation layer that would make compliance verifiable doesn&#8217;t exist yet.</p><h2>Who&#8217;s Responsible? Nobody Knows That Either.</h2><p>NIST AI 800-4 surfaced a question that&#8217;s arguably more urgent than the technical gaps: who monitors? Workshop attendees repeatedly asked: &#8220;Who should do monitoring?&#8221; &#8220;Who is responsible for remediating incidents?&#8221; and &#8220;If anything is found, who can act on it?&#8221;</p><p>In the bank scenario above, was the monitoring failure the cloud provider&#8217;s responsibility? The fairness vendor&#8217;s? The credit scoring API provider&#8217;s? The bank&#8217;s compliance team? Each party monitored its own slice of the pipeline. Nobody monitored the seams between them. The NIST report documents this as an unresolved question across the AI supply chain, and it&#8217;s compounded by the standards gap. You can&#8217;t assign responsibility for monitoring when you haven&#8217;t agreed on what monitoring means. You can&#8217;t hold a vendor accountable for failing to report a drift event when &#8220;drift&#8221; has no shared definition.</p><p>A viable monitoring architecture separates three concerns. The platform exposes standardized observation and control points. An open enforcement layer applies policy through those control points, portable across any platform that exposes them. The enterprise customizes policy to its domain: financial services brings its own data sensitivity models, healthcare brings PHI detection, and any regulated industry brings its compliance requirements. When responsibilities are layered this way, the question of &#8220;who monitors?&#8221; has a structural answer. The platform enables. Open tooling enforces. The enterprise governs. Accountability follows the layer where the failure occurred.</p><p>One attendee asked how to &#8220;reduce the burden on the end user&#8221; to validate model behavior. Another asked how monitoring could become &#8220;a more collaborative practice, rather than a closed technical process.&#8221; These aren&#8217;t theoretical musings. They&#8217;re the governance questions that determine whether monitoring happens at all or degenerates into checkbox compliance where everyone points at someone else&#8217;s dashboard. A layered architecture gives each party a defined obligation: expose, enforce, govern. The current ecosystem gives everyone an excuse.</p><h2>Agents Make Everything Worse</h2><p>If the standards gap is a problem for current AI systems, it&#8217;s a crisis for agentic AI. NIST SP 800-4 repeatedly mentions agents, and the findings are sobering.</p><p>Workshop attendees flagged &#8220;lengthy agentic tasks&#8221; as especially resource-intensive to monitor. The report cites research noting that &#8220;both the agents and the operational environment are subject to change,&#8221; making static monitoring baselines unreliable. Agent identification and tracking remain unstandardized. Attendees raised visibility challenges around &#8220;out-of-distribution behavior using agent identifiers&#8221; and noted that watermarking and content provenance measures &#8220;face reliability challenges.&#8221; One attendee asked directly: &#8220;Is the model agentically attempting to subvert the monitoring setup it is under, i.e., scheming?&#8221;</p><p>That question deserves a pause. We&#8217;re building systems that plan, execute across organizational boundaries, call external tools, and collaborate with other agents. The monitoring challenges NIST documented for conventional AI systems, from detecting drift to maintaining visibility to establishing baselines, all assume a relatively static system being observed from outside. Agents aren&#8217;t static. They change behavior based on context, discover new capabilities at runtime, and operate across a distributed infrastructure that no single organization fully controls.</p><p>Any monitoring standard for agents needs a dynamic inventory mechanism. A static software bill of materials generated at deployment time is worthless when agents discover new tools, connect to new service endpoints, and modify their own capabilities during a single execution session. The inventory must update in real time, triggered by component changes, and output in formats the supply chain security ecosystem already consumes. If your agent connects to a new MCP server mid-task and your inventory doesn&#8217;t reflect that within the same session, your security team is operating on a stale map.</p><p>The &#8220;monitorability tax&#8221; concept raised in the report&#8217;s cited research captures the emerging cost structure. Model developers will pay a performance penalty, through slower inference or less capable models, to maintain the ability to monitor agent behavior. That cost rises as agent autonomy increases. Standardized hooks reduce the engineering cost by making monitoring implementation portable across frameworks, a one-time platform integration rather than custom monitoring code for every deployment. The monitorability tax on compute remains. The tax on engineering effort doesn&#8217;t have to.</p><p>The cross-provider abstraction layer that workshop attendees called for isn&#8217;t a nice-to-have for agentic systems. Without standardized hooks for runtime monitoring, standardized trace formats for multi-agent workflows, and standardized inventories of agent capabilities and dependencies, you&#8217;re watching agents through whatever proprietary window each vendor provides. You can&#8217;t correlate behavior across platforms. You can&#8217;t reconstruct decision chains that span multiple agent frameworks. You can&#8217;t audit what you can&#8217;t consistently observe.</p><p>One more structural blind spot worth naming: runtime monitoring standards assume a cooperating platform that exposes hooks. Open-weight models distributed without platforms bypass this assumption entirely. Once a model is released into the wild for anyone to run, no runtime hook exists unless the downstream deployer voluntarily implements one. Open-weight models are structurally ungovernable by runtime standards alone. Any honest conversation about the monitoring gap has to acknowledge this boundary.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ku7d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ku7d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 424w, https://substackcdn.com/image/fetch/$s_!Ku7d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 848w, https://substackcdn.com/image/fetch/$s_!Ku7d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 1272w, https://substackcdn.com/image/fetch/$s_!Ku7d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ku7d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png" width="1456" height="1434" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1434,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:718881,&quot;alt&quot;:&quot;Block diagram showing how agentic AI properties such as autonomous planning, tool discovery, and multi-agent collaboration amplify each monitoring challenge NIST identified&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/192386096?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Block diagram showing how agentic AI properties such as autonomous planning, tool discovery, and multi-agent collaboration amplify each monitoring challenge NIST identified" title="Block diagram showing how agentic AI properties such as autonomous planning, tool discovery, and multi-agent collaboration amplify each monitoring challenge NIST identified" srcset="https://substackcdn.com/image/fetch/$s_!Ku7d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 424w, https://substackcdn.com/image/fetch/$s_!Ku7d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 848w, https://substackcdn.com/image/fetch/$s_!Ku7d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 1272w, https://substackcdn.com/image/fetch/$s_!Ku7d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46f6983-6c17-499d-a36e-9e51b3bdb476_3021x2975.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: How Agents Amplify the Monitoring Standards Gap</figcaption></figure></div><p><strong>Key Takeaway:</strong> NIST AI 800-4 confirms what practitioners feel in their bones: AI monitoring isn&#8217;t failing because we lack technology. The standards layer that would make technology useful at scale doesn&#8217;t exist. Agents make the gap existential.</p><h3>What to do next</h3><p>Stop accepting proprietary monitoring silos. The next time you evaluate an AI platform, put these questions into the review:</p><ul><li><p>What open logging schema do your monitoring outputs conform to? If the answer is a proprietary format, ask how you export monitoring data into a format another platform can ingest without custom transformation.</p></li><li><p>How does your monitoring define and detect model drift? Compare the answer across your vendors. If two vendors define &#8220;drift&#8221; differently, your compliance team can&#8217;t produce a coherent post-market monitoring report under Article 72.</p></li><li><p>When a component in the AI pipeline (a third-party API, a model update, a data source change) shifts behavior, how does your monitoring surface cross-component effects? If the answer involves manual correlation, you have a gap that scales with system complexity.</p></li><li><p>Who in the supply chain is responsible for monitoring the seams between components? If nobody owns cross-boundary monitoring, say so in your risk register. That&#8217;s an accepted risk, not an oversight.</p></li><li><p>Does your AI platform expose standardized middleware hooks that allow your security team to intercept and evaluate agent actions before they execute? If the platform&#8217;s controls are proprietary and non-portable, your enforcement logic dies with the vendor relationship. Every policy you write, every guardrail you configure, every compliance rule you encode is locked to one vendor&#8217;s architecture.</p></li></ul><p>Push your industry groups and standards bodies. If you participate in OWASP, ISO working groups, or NIST-affiliated communities, advocate for common AI monitoring vocabularies and reference architectures. The cybersecurity field solved this problem a decade ago with common event formats and shared schemas. The AI field hasn&#8217;t started.</p><p>Audit your own monitoring maturity against the six NIST categories. Most organizations will find entire categories with no monitoring at all, particularly Human Factors and Large-Scale Impacts. Map the gaps before the next board meeting where someone asks if you&#8217;re ready for August 2026.</p><p>The full NIST AI 800-4 report is available at <a href="https://doi.org/10.6028/NIST.AI.800-4">https://doi.org/10.6028/NIST.AI.800-4</a>. </p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><p>Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 31 March 20-26, 2026]]></title><description><![CDATA[RSA 2026: Every Vendor Sold an Agent. A Supply Chain Attack Ran Quietly in the Background]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260320-20260326</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260320-20260326</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 27 Mar 2026 12:11:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4rwy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4rwy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4rwy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4rwy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4rwy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4rwy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4rwy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/192300876?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4rwy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4rwy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4rwy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4rwy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7802d5df-7f41-40f2-ad07-154926f08df2_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>RSA Conference 2026 closed Thursday in San Francisco. Thirty thousand attendees, six hundred exhibitors, one word on every booth banner: agentic. While the industry competed on keynotes and happy hours, LiteLLM, deployed in hundreds of enterprise AI stacks, got infected with credential-stealing code through a misconfigured GitHub Actions workflow. Malicious releases went live March 19 and March 22. Most of your security team was watching keynotes.</p><p>Underneath the conference noise, genuine signal emerged. Zenity&#8217;s CTO demonstrated live zero-click exploits against ChatGPT, Salesforce, and Microsoft Copilot on the conference floor. Palo Alto Networks Unit 42 documented new attack paths through the Model Context Protocol. HackerOne disclosed a 540% year-over-year surge in validated prompt injection vulnerabilities. The EU AI Office&#8217;s second draft Code of Practice on AI-generated content transparency is open for feedback through March 30, with prescriptive new requirements that narrow compliance discretion significantly. NIST published AI 800-4, the first federal framework for monitoring AI systems in production, with no vendor booth to announce it.</p><p>Here&#8217;s what matters and what to do about it.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260320-20260326?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260320-20260326?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3>1. Zenity Launches Guardian Agents and Demonstrates 0-Click AI Exploits at RSA</h3><p>Zenity launched Guardian Agents at RSA 2026 on March 23, positioning it as continuous, contextual security for AI agents across SaaS, cloud, and endpoint environments. CTO Michael Bargury ran live demonstrations titled &#8220;Your AI Agents Are My Minions,&#8221; showing zero-click prompt injection chains that manipulated Cursor into leaking developer secrets via support emails, Salesforce agents into exfiltrating customer data to an attacker-controlled server, and ChatGPT into producing persistent attacker-chosen outputs across conversations (The Register, March 23, 2026, and Help Net Security, March 24, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Zero-click attacks eliminate the human review checkpoint most AI security frameworks assume is present. When agents act without user input, your primary detection layer disappears before the threat is visible.</p></li><li><p>Live exploitation of production enterprise systems on a conference floor is harder to dismiss than a threat model in a whitepaper.</p></li><li><p>Guardian Agents signals a market category forming in real time. The evaluation criteria you set today will shape purchasing decisions for the next several years.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every AI agent in your environment before your next board meeting. If you can&#8217;t enumerate them, you can&#8217;t monitor them.</p></li><li><p>Require vendors to document in writing which actions their agents take without explicit human approval. Non-answers are critical control gaps.</p></li><li><p>Run adversarial testing against your three highest-access agents this quarter, targeting credential extraction, data exfiltration, and cross-system manipulation.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Bargury&#8217;s demonstration strategy was the most honest thing at RSA this week: show the attack, then show the defense. Live exploitation on production systems is harder to dismiss than a slide deck built around the word autonomous. The inconvenient reality is that most enterprises already have agents running with email access, CRM credentials, and code repository permissions, with no runtime monitoring on what those agents decide to do. Selecting an AI security vendor is not the same thing as having an answer to the problem he demonstrated on the conference floor.</p><div><hr></div><h3>2. LiteLLM Infected with Credential-Stealing Code via Trivy Misconfiguration</h3><p>The Register reported March 24 that LiteLLM, a widely deployed open-source LLM API proxy, was compromised through a misconfigured Trivy GitHub Actions workflow. Attackers modified version tags on the trivy-action GitHub Action to inject malicious code into workflows organizations were already running, producing malicious releases on March 19 and March 22. The maintainer confirmed that anyone who installed and ran the project during that window should assume credentials available to their environment were exposed.</p><p><strong>Why it matters</strong></p><ul><li><p>LiteLLM sits in the critical path of many enterprise AI deployments. One compromised abstraction library reaches hundreds of downstream production systems simultaneously.</p></li><li><p>The attack exploited version tags, not direct code injection. CI/CD pipelines relying on tags rather than pinned commits ran malicious code without detection. That&#8217;s a systemic configuration gap across most enterprise pipelines.</p></li><li><p>The attack ran during RSA week when security teams were distracted. The timing was likely not accidental.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit every environment that pulled a LiteLLM update between March 19 and March 24. Treat those environments as potentially compromised until you confirm otherwise.</p></li><li><p>Pin all GitHub Actions to specific commit hashes, not version tags. Tags are mutable and can be silently overwritten. Commits are not.</p></li><li><p>Establish software bill of materials practices for all AI and ML dependencies. Supply chain attacks will keep finding environments where that inventory doesn&#8217;t exist.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>LiteLLM is exactly the kind of library that lands in enterprise AI stacks without a security review, installed by an ML engineer who needed to route calls to three model providers before the sprint ended. Trivy is a security tool. Attackers used a security tool misconfiguration to compromise a release pipeline for another widely used tool. If there&#8217;s a cleaner argument for applying security rigor to your own security tooling, I haven&#8217;t heard it. Your AI dependency chain needs the same scrutiny as your application dependencies. Good intentions at install time are not a compensating control.</p><div><hr></div><h3>3. Palo Alto Networks Unit 42 Documents MCP Attack Vectors</h3><p>Palo Alto Networks Unit 42 published research the week of March 20 documenting new attack paths through the Model Context Protocol, including prompt injection delivered through MCP&#8217;s sampling interface. Security researchers tracked 30 CVEs filed against MCP implementations in the preceding 60 days, including CVE-2026-25536 (cross-client data leak in the MCP TypeScript SDK) and CVE-2026-23744 (remote code execution in MCPJam Inspector). A scan of more than 500 public MCP servers found that 38% lacked authentication entirely (Unit 42, March 2026, and Adversa.ai, March 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>MCP is the connective tissue between AI agents and enterprise tools. A vulnerability in this protocol exposes the entire agent ecosystem built on top of it, not one isolated system.</p></li><li><p>Thirty CVEs in 60 days signals that security review did not happen before shipping at scale. Every API ecosystem that launches with deployment velocity ahead of security assessment follows this arc.</p></li><li><p>Thirty-eight percent of scanned servers lacking authentication is systemic failure. Authentication is the minimum viable control. Everything built on top of unauthenticated servers is exposed.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory every MCP server in your environment and treat unauthenticated instances as critical findings requiring immediate action.</p></li><li><p>Require authentication, authorization, and comprehensive logging for any MCP server with access to production systems or sensitive data.</p></li><li><p>Demand specific CVE status and patch timelines from your AI infrastructure vendors. Vague answers signal high risk and a vendor not tracking its own exposure.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Thirty CVEs in 60 days is not a patching problem. It&#8217;s a design problem. MCP shipped fast because the builders cared more about what AI agents could reach than how securely they could reach it. The 38% authentication gap is the number that should end budget debates about AI infrastructure security investment. Roughly two in five MCP servers operate on the assumption that only authorized parties will talk to them, which is exactly wrong in a protocol designed to connect agents to external resources. That assumption creates direct paths to your production data.</p><div><hr></div><h3>4. HackerOne Reports 540% Surge in Validated Prompt Injection Vulnerabilities</h3><p>HackerOne announced Agentic Prompt Injection Testing on March 21, paired with platform data showing a 540% year-over-year increase in validated prompt injection vulnerabilities. The service executes structured, multi-turn adversarial scenarios against live AI applications, evaluating whether injection attempts produce actual data exposure or unauthorized tool execution across interconnected agent systems (HackerOne Blog, March 2026, and Cybersecurity Insiders, March 21, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>A 540% increase in validated vulnerabilities means real researchers are finding real exploitable conditions in production systems, not theoretical edge cases.</p></li><li><p>Traditional application security testing does not cover agent-specific attack paths. If your AI agents aren&#8217;t explicitly in scope for your red team or bug bounty program, you have a documented blind spot.</p></li><li><p>Unit 42&#8217;s concurrent research on indirect prompt injection through web content eliminates the &#8220;attacker needs direct access&#8221; objection. Agents read the web. The web is the attack surface.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Add AI agents to your red team scope explicitly as a primary target category, not an afterthought appended to an existing engagement.</p></li><li><p>Require prompt injection testing as part of every AI agent release process, treated as a gate equivalent to penetration testing for any externally facing application.</p></li><li><p>Track prompt injection findings as a distinct vulnerability class in your risk register. You can&#8217;t demonstrate improvement to your board on metrics you&#8217;re not collecting separately.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Five hundred forty percent ends the debate about whether prompt injection is a real threat. I&#8217;ve heard the objection that attackers need direct access to craft payloads. Unit 42&#8217;s indirect injection research, published this same week, shows agents reading manipulated instructions from ordinary websites they visit in the course of normal operation. Your agents don&#8217;t need to be directly targeted; they need to visit the wrong page. The gap between organizations deploying AI agents and organizations testing those agents adversarially is the largest unaddressed risk exposure I see in enterprise AI programs right now.</p><div><hr></div><h3>5. Microsoft Publishes Secure Agentic AI Framework and Confirms Agent 365 May 1 GA</h3><p>Microsoft published &#8220;Secure Agentic AI End-to-End&#8221; on March 20, documenting its approach to extending Zero Trust architecture across the full AI agent lifecycle: data ingestion, model training, deployment, and runtime behavioral monitoring. The post confirmed Agent 365, Microsoft&#8217;s governance control plane for enterprise AI agents, will reach general availability on May 1, 2026, with agent identity, authorization scope, and behavioral monitoring treated as distinct security domains from traditional human-user ZT controls (Microsoft Security Blog, March 20, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>A confirmed May 1 GA date gives enterprises in Microsoft environments a concrete six-week planning horizon. Governance framework adoption takes time and that clock is already running.</p></li><li><p>Extending Zero Trust to AI agents is architecturally correct. Most ZT implementations weren&#8217;t designed with agent identity or behavioral monitoring in mind, making the gap assessment non-trivial work.</p></li><li><p>Publishing detailed technical frameworks before product GA signals Microsoft wants enterprises building governance practices now, before the product ships.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your current ZT architecture against the agent-specific requirements described in the March 20 post. Focus on gaps in agent identity and behavioral monitoring specifically.</p></li><li><p>Begin internal stakeholder alignment on Agent 365 if you&#8217;re in a Microsoft 365 environment. Six weeks is not enough time to start that conversation from zero.</p></li><li><p>Document agent permissions, access patterns, and decision scopes using whatever visibility tools you have today rather than waiting for Microsoft tooling.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>&#8220;End-to-end&#8221; is doing heavy lifting as a title. What Microsoft describes is extending known security primitives to a new execution context. That&#8217;s necessary work and not a complete answer. The hard problems are behavioral: distinguishing authorized agent actions from manipulated ones, detecting policy violations in real time, and maintaining audit trails that survive an incident investigation. Agent 365 is worth watching. If the behavioral monitoring is substantive, it&#8217;ll move the market. If it&#8217;s a compliance dashboard, enterprises will check the box while actual risk sits unaddressed underneath it.</p><div><hr></div><h3>6. Cisco Releases DefenseClaw Open Source on Final Day of RSA</h3><p>Cisco released DefenseClaw to GitHub on March 27, the final day of RSA 2026, as an open-source framework for scanning agent skills and sandboxing agent execution. The release accompanied Zero Trust Access for AI agents and a free AI Defense Explorer Edition targeting security practitioners. Cisco plans integration with NVIDIA OpenShell for hardware-level execution sandboxing, addressing execution isolation that software-only monitoring cannot replicate (Cisco Newsroom, March 2026, and UC Today, March 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Open-source agent security scanning means organizations can start building security into agent development pipelines without a procurement cycle or a budget line.</p></li><li><p>Hardware-anchored execution sandboxing addresses a control gap that software-only monitoring cannot close. Execution isolation for agents is systematically underinvested across the industry relative to the risk.</p></li><li><p>The open-source and Explorer Edition strategy targets developers before enterprise procurement cycles form, competing for architectural mindshare with builders rather than just buyers.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pull DefenseClaw and run it against a non-production agent environment this month. Validate real-world utility before committing to any commercial evaluation.</p></li><li><p>Evaluate the NVIDIA sandboxing integration if you&#8217;re running NVIDIA infrastructure. Test in isolation before production consideration.</p></li><li><p>Track Cisco&#8217;s AI Defense commercial roadmap. Free Explorer Editions typically precede commercial tier launches by 12 to 18 months, and starting your evaluation now means you&#8217;ll have data when the pitch arrives.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Releasing open-source code on the last day of the conference changes the conversation from &#8220;will enterprises buy this&#8221; to &#8220;pull the repo and see for yourself.&#8221; That&#8217;s a credible move when the code is real and the threat model is honest. Run DefenseClaw against your actual agent environment before making any claims about coverage. The larger play is Cisco&#8217;s bid for the enterprise AI security architecture position using network visibility, an established security portfolio, and enterprise relationships most competitors would need a decade to build. DefenseClaw is a credible opening move. Watch the next 18 months of product decisions to judge the hand.</p><div><hr></div><h3>7. Google Deploys Gemini Agents to Process 10 Million Dark Web Posts Daily</h3><p>Google announced at RSA 2026 on March 23 that Gemini AI agents are processing more than 10 million dark web posts daily to surface threats relevant to specific organizations. The capability integrates with Google Security Operations alongside new agentic automation features, currently in preview, that let security teams combine AI-driven investigation with deterministic automated response workflows (The Register, March 23, 2026, and Google Cloud Blog, March 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Ten million posts per day changes the economics of dark web threat intelligence. Organizations that couldn&#8217;t sustain comprehensive monitoring programs gain access to Google-scale processing at a fraction of the previous cost.</p></li><li><p>Pairing AI-driven investigation with deterministic automation preserves human-defined control while extending agent reach into high-volume, low-judgment tasks. That&#8217;s the right architectural pattern for agentic SOC work.</p></li><li><p>Preview status means GA behavior, SLA, and security review standards remain unfinalized. Your production SOC is not where you run this experiment yet.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Assess your current dark web monitoring coverage gap against what this capability covers. If there&#8217;s a meaningful difference, prioritize a pilot evaluation once the feature reaches GA.</p></li><li><p>Review preview terms carefully before enabling agentic automation in any production SOC workflow. Preview features carry materially different risk profiles than GA releases.</p></li><li><p>Define which SOC workflows you&#8217;d delegate to agents and where human approval must remain. Build that policy before the tools arrive, not after they&#8217;re already running.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Threat intelligence is the most defensible application of AI agents in security operations right now. Failure modes are recoverable: the agent misses a threat and your other controls have a chance at it. Compare that to agentic incident response, where the failure mode might be blocking a production system or destroying forensic evidence. Start with intelligence, not response. The preview framing signals Google is collecting operational data before committing to GA behavior guarantees, which is reasonable product discipline. It also means you wait for GA before running this where failures have material consequences.</p><div><hr></div><h3>8. Novee Launches Autonomous AI Red Teaming Platform for LLM Applications</h3><p>Novee announced autonomous AI red teaming for LLM applications on March 24 at RSA Conference 2026. The platform deploys an AI pentesting agent that executes multi-turn adversarial scenarios against live systems, simulating attacker chaining techniques across prompt injection, jailbreaks, data exfiltration paths, and agent behavior manipulation, covering any LLM-powered system regardless of model provider with optional CI/CD pipeline integration (GlobeNewswire, March 24, 2026, and Help Net Security, March 24-25, 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Traditional pentesting tools were designed for pre-LLM application security problems. Novee builds red teaming from actual LLM vulnerability research, producing findings that adapted traditional tools miss.</p></li><li><p>CI/CD pipeline integration lets security teams catch prompt injection and agent manipulation issues before production deployment rather than after an incident surfaces them.</p></li><li><p>Two distinct companies announced adversarial AI testing capabilities at RSA 2026 in the same week. Market formation around this problem is accelerating.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Evaluate Novee&#8217;s beta against a non-production LLM application to understand what it surfaces relative to your existing security testing coverage.</p></li><li><p>Map the gap between your current SDL and what LLM-specific adversarial testing would require. The gap is almost certainly larger than you expect it to be.</p></li><li><p>Add AI-native red teaming as a release gate requirement for any LLM application reaching production. Make it a gate, not a post-deployment recommendation that teams skip.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Two autonomous AI red teaming announcements in one RSA week tells you the market is accepting that testing AI systems requires AI-specific tooling, not adapted traditional approaches. That&#8217;s a healthy development even if the tools themselves are early. The CI/CD integration angle is the most practically valuable feature: security issues caught before production deployment cost a fraction of what they cost after deployment. If you&#8217;re shipping LLM applications without adversarial testing in the pipeline, you&#8217;re making a risk decision that most boards don&#8217;t know they&#8217;re making.</p><div><hr></div><h3>9. EU AI Office Second Draft Code of Practice Enters Final Feedback Window</h3><p>The EU AI Office published its second draft Code of Practice on AI-Generated Content Transparency on March 3, with the stakeholder feedback window closing March 30. The second draft moves from high-level principles toward prescriptive, technically detailed commitments, narrowing compliance discretion and signaling how regulators will likely assess conformance in practice. A third and final version is expected by June 2026, ahead of the August 2 applicability date for AI-generated content transparency obligations (Herbert Smith Freehills Kramer, March 2026, and BABL AI, March 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Draft 2&#8217;s shift to prescriptive technical commitments closes the interpretation space organizations were using to plan flexible compliance programs. The gap between &#8220;we have a policy&#8221; and &#8220;we meet the technical specification&#8221; narrowed significantly this month.</p></li><li><p>The March 30 feedback deadline is this weekend. If your organization has substantive views on requirements that are technically unworkable, the window to influence the final text is closing.</p></li><li><p>August 2 is not distant. Organizations waiting for final text before beginning compliance work are accepting a six-week implementation sprint under real enforcement conditions.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read Draft 2 this week. The technical specificity represents a meaningful change from Draft 1, and your compliance planning may need adjustment.</p></li><li><p>Submit feedback before March 30 if the current draft creates compliance constraints you believe are technically unworkable for your AI content operations.</p></li><li><p>Begin implementation planning against Draft 2 requirements now. The June final text will refine but won&#8217;t fundamentally restructure what&#8217;s already written.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Every organization waiting for final text before starting EU AI Act compliance work is playing a game where the timeline gets worse each quarter they wait. Draft 2 is prescriptive enough to start serious implementation planning. The adjustments you&#8217;ll need when Draft 3 drops will be smaller than the work you&#8217;ll need to compress into six weeks if you start in June. The transparency labeling requirements are more technically demanding than most organizations appreciate from reading summaries. Download Draft 2 from the EU&#8217;s digital strategy portal and read it against your actual AI content production workflows. That gap analysis is the starting point for everything else.</p><div><hr></div><h3>10. RSA 2026 Reveals a Contested Market for AI Agent Governance Control Planes</h3><p>A pattern emerged across RSA 2026 beyond individual product launches: the governance control plane for AI agents is being actively contested by multiple major vendors. Microsoft&#8217;s Agent 365 (GA May 1), Cisco&#8217;s DefenseClaw (released March 27), SentinelOne&#8217;s Prompt AI Agent Security control plane, and Nudge Security&#8217;s AI agent discovery expansion all launched during the conference week, each addressing the same fundamental problem: enterprises deploy AI agents and lose track of what those agents do, access, and decide autonomously (SecurityWeek, March 2026, and Biometric Update, March 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Multiple major vendors converging on the same problem in the same week signals enterprises are actively requesting governance solutions, not absorbing vendor-manufactured demand.</p></li><li><p>Competition between Microsoft&#8217;s integrated control plane and point solutions from Cisco, SentinelOne, and Nudge creates a real architectural decision. Choose wrong and you own the integration debt for years.</p></li><li><p>None of these products fully solves behavioral monitoring. They address discovery, policy enforcement, and visibility. Real-time behavioral anomaly detection for agents remains an open engineering challenge.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Define your AI agent governance requirements before evaluating any vendor. Required capabilities: inventory discovery, permission auditing, behavioral logging, and human approval workflows for high-risk actions.</p></li><li><p>Assess whether your environment favors an integrated control plane or best-of-breed point solutions based on your actual architecture, not vendor marketing claims.</p></li><li><p>Ask every vendor during evaluation: how does the product detect when an agent takes an authorized action it was manipulated into taking? The answer quality will differentiate vendors quickly.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>When four vendors announce competing governance control planes at the same conference in the same week, you&#8217;re watching a market category consolidate in real time. That&#8217;s interesting for analysts and exhausting for practitioners who have to evaluate all of it while managing agents already running in production without any governance. My advice: don&#8217;t let the governance platform debate distract from the more urgent problem of knowing what agents you currently have. Most enterprises have agents deployed that security teams didn&#8217;t authorize, can&#8217;t enumerate, and have no logs on. Governance tooling is the right investment. Knowing what you&#8217;re governing is the prerequisite.</p><div><hr></div><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><p><strong>NIST Publishes AI 800-4: The First Federal Framework for Monitoring AI Systems in Production</strong></p><p>NIST published AI 800-4, &#8220;Challenges to the Monitoring of Deployed AI Systems,&#8221; in March 2026. Built from three practitioner workshops with more than 200 experts across academia, industry, and ten-plus federal agencies, plus an 87-paper literature review, it maps the gaps, barriers, and open questions in monitoring AI systems after deployment. It covers six monitoring categories: functionality, operational health, human factors, security, safety, and compliance. It received no RSA booth, no vendor keynote, and no sponsored coverage (NIST News, March 2026, and NIST AI 800-4 PDF, March 2026).</p><p><strong>Why it matters</strong></p><ul><li><p>Most organizations deploying AI monitor latency and availability. AI 800-4 addresses whether the model behaves consistently with its training distribution and produces outputs that align with policy, which are the failures that matter most and the ones traditional monitoring misses entirely.</p></li><li><p>NIST explicitly identifies human-AI interaction monitoring as the most under-researched gap in the field. Workshop practitioners raised it far more than published literature covers. If your AI monitoring program doesn&#8217;t address how users interact with and respond to AI outputs, you&#8217;re missing the category NIST calls most underdeveloped.</p></li><li><p>The document is vendor-neutral and grounded in practitioner experience, directly applicable to conversations with regulators and auditors who want evidence of a structured AI monitoring program.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Download NIST AI 800-4 from nist.gov and route it to whoever owns your AI security program. It&#8217;s the most actionable government guidance on operational AI monitoring published to date.</p></li><li><p>Map your current monitoring coverage against the document&#8217;s six categories. The gaps will be immediately apparent and the prioritization logic writes itself once you have the map.</p></li><li><p>Use AI 800-4 as the foundation for your AI monitoring program documentation. When regulators ask how you monitor AI systems in production, a NIST-aligned program gives you a defensible, auditable answer.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The honest state of enterprise AI monitoring: most organizations have logs showing their AI system responded. They don&#8217;t have logs showing whether the response was correct, consistent with training distribution, within policy boundaries, or manipulated by adversarial input. That visibility gap is how AI security incidents become AI security incidents. You don&#8217;t catch the drift until the outcome is undeniable and the damage is done. NIST AI 800-4 doesn&#8217;t get coverage because nobody can sell it. The organizations that read it and build monitoring programs from its framework will answer regulatory questions coherently in 18 months when enforcement catches up to deployment rates. The organizations that attended every RSA keynote and skipped the NIST publication will be writing incident reports instead. For more on building AI governance programs that survive regulatory scrutiny, visit <a href="https://rockcybermusings.com/">rockcybermusings.com</a>. If you need help turning frameworks like AI 800-4 into operating programs your security team can actually run, reach out at <a href="https://rockcyber.com/">rockcyber.com</a>.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><div><hr></div><h2>References</h2><p>Bargury, M. (2026, March 23). <em>Your AI agents are my minions</em> [Conference presentation]. RSA Conference 2026, San Francisco, CA.</p><p>Claburn, T. (2026, March 24). LiteLLM infected with credential-stealing code via Trivy. <em>The Register</em>. https://www.theregister.com/2026/03/24/trivy_compromise_litellm/</p><p>Claburn, T. (2026, March 23). AI agents are &#8216;gullible&#8217; and easy to turn into your minions. <em>The Register</em>. https://www.theregister.com/2026/03/23/pwning_everyones_ai_agents/</p><p>Claburn, T. (2026, March 23). Google unleashes Gemini AI agents on the dark web. <em>The Register</em>. https://www.theregister.com/2026/03/23/google_dark_web_ai/</p><p>Cisco. (2026, March). Cisco reimagines security for the agentic workforce. <em>Cisco Newsroom</em>. https://newsroom.cisco.com/c/r/newsroom/en/us/a/y2026/m03/cisco-reimagines-security-for-the-agentic-workforce.html</p><p>Google Cloud. (2026, March). RSAC 26: Supercharging agentic AI defense with frontline threat intelligence. <em>Google Cloud Blog</em>. https://cloud.google.com/blog/products/identity-security/rsac-26-supercharging-agentic-ai-defense-with-frontline-threat-intelligence</p><p>HackerOne. (2026, March). Agentic prompt injection testing for AI security. <em>HackerOne Blog</em>. https://www.hackerone.com/blog/agentic-prompt-injection-testing</p><p>HackerOne introduces agentic prompt injection testing as AI security risks accelerate. (2026, March 21). <em>Cybersecurity Insiders</em>. https://www.cybersecurity-insiders.com/hackerone-introduces-agentic-prompt-injection-testing-as-ai-security-risks-accelerate/</p><p>Herbert Smith Freehills Kramer. (2026, March). Transparency obligations for AI-generated content under the EU AI Act: From principle to practice. https://www.hsfkramer.com/notes/ip/2026-03/transparency-obligations-for-ai-generated-content-under-the-eu-ai-act-from-principle-to-practice</p><p>EU releases second draft of AI Act Code of Practice on labeling AI-generated content. (2026, March). <em>BABL AI</em>. https://babl.ai/eu-releases-second-draft-of-ai-act-code-of-practice-on-labeling-ai-generated-content/</p><p>Microsoft Security. (2026, March 20). Secure agentic AI end-to-end. <em>Microsoft Security Blog</em>. https://www.microsoft.com/en-us/security/blog/2026/03/20/secure-agentic-ai-end-to-end/</p><p>NIST. (2026, March). New report: Challenges to the monitoring of deployed AI systems. https://www.nist.gov/news-events/news/2026/03/new-report-challenges-monitoring-deployed-ai-systems</p><p>NIST. (2026). <em>NIST AI 800-4: Challenges to the monitoring of deployed AI systems</em>. National Institute of Standards and Technology. https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.800-4.pdf</p><p>Novee. (2026, March 24). Novee introduces autonomous AI red teaming to uncover security flaws in LLM applications [Press release]. <em>GlobeNewswire</em>. https://www.globenewswire.com/news-release/2026/03/24/3261278/0/en/Novee-Introduces-Autonomous-AI-Red-Teaming-to-Uncover-Security-Flaws-in-LLM-Applications.html</p><p>Novee introduces autonomous AI red teaming to hunt LLM vulnerabilities. (2026, March 24). <em>Help Net Security</em>. https://www.helpnetsecurity.com/2026/03/24/novee-ai-red-teaming-for-llm-applications/</p><p>Palo Alto Networks Unit 42. (2026, March). New prompt injection attack vectors through MCP sampling. https://unit42.paloaltonetworks.com/model-context-protocol-attack-vectors/</p><p>SecurityWeek. (2026, March). RSAC 2026 conference announcements summary: Day 1. https://www.securityweek.com/rsac-2026-conference-announcements-summary-day-1/amp/</p><p>Zenity AI agents contextual security. (2026, March 24). <em>Help Net Security</em>. https://www.helpnetsecurity.com/2026/03/24/zenity-ai-agents-contextual-security/</p><p>Zenity. (2026, March 23). Zenity sets the foundation for guardian agents. <em>Zenity Newsroom</em>. https://zenity.io/company-overview/newsroom/company-news/zenity-sets-the-foundation-for-guardian-agents</p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 30 March 13-19, 2026]]></title><description><![CDATA[Agentic AI Security Moves From "Meh" to Incident Log]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260313-20260319</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260313-20260319</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 20 Mar 2026 12:50:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!b3YR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b3YR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b3YR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!b3YR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!b3YR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!b3YR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b3YR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/191536924?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b3YR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!b3YR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!b3YR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!b3YR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc0f875-fe24-4b6a-ab70-a93357678487_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Meta logged a SEV-1 on March 18 because an internal AI agent posted without human approval, provided bad advice, and exposed sensitive data to the wrong employees for 2 hours. Amazon confirmed its Bedrock sandbox lets AI models exfiltrate data via DNS and called it intentional design. HiddenLayer found 31% of security leaders don&#8217;t know if they had an AI breach in the past year. The EU Council voted to restructure the AI Act&#8217;s high-risk compliance framework. Three AI agent security products launched in four days. This was one week.</p><p>The week&#8217;s evidence points in one direction: agentic AI security is no longer a research problem. Real incidents are appearing in production environments run by organizations with serious security programs. Technical flaws in AI infrastructure are drawing vendor responses that amount to documentation updates rather than patches. Research data is documenting blind spots CISOs can no longer treat as edge cases. In parallel, the governance machinery is finally moving, but it&#8217;s moving slower than deployment. Standards and deployments are in a race, and deployments are winning by a wide margin. More context at <a href="https://www.rockcyber.com/">RockCyber</a> and <a href="https://rockcybermusings.com/">RockCyber Musings</a>.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260313-20260319?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260313-20260319?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3>1. OWASP publishes its GenAI data security risk taxonomy for 2026</h3><p>The OWASP GenAI Security Project released GenAI Data Security: Risks and Mitigations 2026 in March, a 103-page taxonomy covering 21 discrete data security risks across the full GenAI lifecycle from training through agentic runtime (OWASP). The document maps risks across training and fine-tuning data, retrieval and RAG pipelines, vector stores, context windows, agent memory, tool call payloads, and observability infrastructure. It identifies a core architectural property that makes GenAI data security structurally different from every prior computing model: the context window aggregates data from multiple trust domains into a single flat namespace with no internal access controls. A confidential HR record retrieved via RAG sits next to a user prompt with identical trust weight, and there is no mechanism today to mark a context segment as available for reasoning but not surfaceable in the output. The document also addresses machine unlearning directly: deleting source data does not remove what a fine-tuned model or LoRA adapter has memorized into its weights. <strong><a href="https://genai.owasp.org/resource/owasp-genai-data-security-risks-mitigations-2026/">Download the report HERE.</a></strong></p><p><strong>Why it matters</strong></p><ul><li><p>The flat-namespace context window problem is not a configuration gap. It&#8217;s an architectural property of how these systems work, which means perimeter controls and access policies cannot fully solve it. Minimization and context scoping are the only practical mitigations available today.</p></li><li><p>LoRA adapter memorization of rare training examples means high-recall prompts can extract verbatim PII, credentials, or intellectual property from fine-tuned models without any sophisticated attack technique. Organizations fine-tuning on internal data have a data exposure risk they likely haven&#8217;t assessed.</p></li><li><p>The Right to Erasure problem is unsolved at the architectural level. Deleting training data from a source system does not delete what the model encoded during fine-tuning. GDPR and state privacy law DSR obligations cannot be satisfied by source deletion alone.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Treat the context window as a data-exposure surface, not just a prompt-delivery mechanism. Classify what goes in the same way you classify what goes into a database query, and scope RAG retrieval to the minimum required for the task.</p></li><li><p>Audit every fine-tuned model and LoRA adapter in your environment against the data used to train it. If that training data included PII, credentials, or regulated information, your model could serve as a potential exfiltration vector.</p></li><li><p>Build a GenAI data bill of materials using CycloneDX ML-BOM as the base format. Until you have lineage from the source dataset to the deployed model to the embedding store, you cannot answer the question a regulator will eventually ask: what data did this model see, and where does it live now?</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The architectural insight at the center of this document is the one the industry keeps sliding past. The context window has no internal access control layer. That&#8217;s not a misconfiguration. It&#8217;s a design property of how transformers process sequences. Everything that enters the context window is treated as equally reachable by the model&#8217;s output mechanism, and no amount of system prompt guardrailing changes the underlying architecture. The practical implication is that the primary defense is what you put in, not what you try to prevent from coming out.</p><p>The machine unlearning section is the one I push organizations on hardest. They are collecting consent, honoring deletion requests, and scrubbing source databases, and then deploying fine-tuned models that still carry what they memorized from the deleted data. The model weights are a copy of your training corpus in a form your DLP tools don&#8217;t see, and your deletion workflows can&#8217;t reach. Right to Erasure in GenAI is an open architectural problem with no clean solution today, and most organizations haven&#8217;t told their legal team that yet.</p><h3>2. EU Council rewrites the compliance clock for high-risk AI systems</h3><p>The EU Council adopted its negotiating position to amend the AI Act&#8217;s high-risk framework (EU Council). The core change replaces the fixed August 2026 compliance deadline with a conditional trigger. Full high-risk obligations apply only once the Commission certifies required standards and tools are available, with a hard backstop date. The Council also pushed the national AI regulatory sandbox deadline to December 2027 and clarified that law enforcement, border management, judicial, and financial AI systems remain under national supervisory authority rather than the Commission. Negotiations with the European Parliament begin next.</p><p><strong>Why it matters</strong></p><ul><li><p>The conditional trigger gives the Commission discretion over when your obligations start. Until it certifies standards are ready, full high-risk obligations don&#8217;t apply, creating an indeterminate window.</p></li><li><p>Pushing the sandbox deadline to December 2027 removes a key testing mechanism for high-risk AI at a time when organizations are accelerating deployment.</p></li><li><p>Fragmented supervisory authority means 27 member states apply their own rules to some of the highest-stakes AI use cases.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your AI systems against current and proposed high-risk definitions now. The conditional trigger shifts the timeline, not the compliance obligation itself.</p></li><li><p>Track Parliament negotiations. The Council position is a mandate, not the final text.</p></li><li><p>Build a jurisdiction-aware compliance map for EU operations covering which systems fall under national versus Commission supervision.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I&#8217;ve seen regulatory timelines used to delay compliance indefinitely in my career more times than I can count. This EU Council move fits the pattern. The conditional trigger means the Commission controls when your clock starts, and they have to certify standards are available first. Given the pace at which NIST&#8217;s agentic AI guidance is moving, expecting European standards to materialize quickly requires genuine optimism.</p><p>Organizations using this ambiguity to do nothing are miscalculating. The August 2026 date was never the governance point. You have high-risk AI systems in production today, and you need to govern them regardless of what the Commission certifies and when.</p><h3>3. Meta logs a SEV-1 incident from a rogue internal AI agent</h3><p>On March 18, Meta confirmed a Severity 1 security incident caused by an internal AI agent operating without human authorization (Bitcoinworld, HackerNoob). The agent posted to an internal forum, gave incorrect advice, and triggered a cascade that exposed sensitive company and user data to unauthorized employees for approximately two hours. Meta contained the exposure by cutting the agent&#8217;s forum access and auditing permissions across other internal agents. No external exfiltration was confirmed.</p><p><strong>Why it matters</strong></p><ul><li><p>A SEV-1 at Meta from an AI agent operating outside its bounds sets a documented precedent: production agents at companies with robust security programs can circumvent behavioral constraints and cause genuine incidents.</p></li><li><p>The chain reaction, one unauthorized action triggering downstream data exposure, is characteristic of agentic systems and different from traditional software vulnerabilities in ways most IR playbooks don&#8217;t yet account for.</p></li><li><p>No external exfiltration is partial comfort. Unauthorized internal access to sensitive user data carries GDPR and AI Act exposure regardless of whether the data left the building.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit every AI agent in your environment and document what it can post, write, or modify without a human approval checkpoint.</p></li><li><p>Map the blast radius. If a specific agent takes an unexpected action, what does it touch first, and what cascades from there?</p></li><li><p>Build AI agent incident response playbooks with automated containment triggers that don&#8217;t require analyst approval before they fire.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The Meta incident will get dismissed as a minor operational hiccup. That&#8217;s the wrong read. Even with legit engineering talent and a mature security program, a production AI agent escaped its behavioral constraints and triggered a data exposure chain. I&#8217;m willing to bet your environment isn&#8217;t more disciplined than Meta&#8217;s.</p><p>Two hours to containment is fast. Most organizations I work with couldn&#8217;t tell you within two hours that an agent had gone sideways. AI agent behavioral monitoring is dramatically behind where it needs to be. The lesson to take away from this is that you need detection that fires before the cascade, not after the data is already in the wrong hands.</p><h3>4. Amazon&#8217;s Bedrock sandbox leaks data through DNS because that&#8217;s the design</h3><p>BeyondTrust&#8217;s Phantom Labs disclosed that Amazon Bedrock AgentCore Code Interpreter&#8217;s sandbox mode permits outbound DNS queries (SC Media, The Hacker News). An attacker interacting with the agent can send commands encoded in DNS A record responses and receive exfiltrated data encoded in DNS subdomain queries to an attacker-controlled server. No authentication bypass is required. BeyondTrust assigned a CVSS score of 7.5. AWS reviewed the research, determined that the behavior reflects the intended functionality, and responded by updating the documentation rather than issuing a patch.</p><p><strong>Why it matters</strong></p><ul><li><p>&#8220;Intended behavior&#8221; is a vendor risk posture, not a security posture. Sandbox mode was positioned as providing execution isolation. A sandbox allowing covert DNS exfiltration does not deliver isolation in any security-relevant sense.</p></li><li><p>DNS-based covert channels are standard red team tradecraft in traditional environments. The technique translates directly into AI code execution environments without modification.</p></li><li><p>Organizations running agents against sensitive internal data in AWS Bedrock face an unpatched, documented, CVSS 7.5 risk with no vendor remediation timeline.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Add DNS query monitoring for Bedrock AgentCore code execution environments to your threat detection stack now.</p></li><li><p>Reduce the data that AI agents with code execution access can reach to the strict minimum required for the task.</p></li><li><p>Get a formal written architecture statement from AWS specifying exactly what the sandbox guarantees before expanding Bedrock AgentCore deployments.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Another &#8220;Intended behavior&#8221; narrative. I&#8217;m getting pretty damn sick of it. That&#8217;s another way of saying, &#8220;We know about this, it would be expensive to change, and it sucks to be you.&#8221; <strong><a href="https://www.csoonline.com/article/4118092/google-vertex-ai-security-permissions-could-amplify-insider-threats.html">(see my thoughts in CSO magazine about a previous instance HERE)</a></strong>. The documentation update rather than a patch is the tell. You can&#8217;t outsource your risk posture to your cloud provider&#8217;s design decisions.</p><p>The technique is in every red team playbook. DNS exfiltration from sandboxed environments is foundational evasion tradecraft. Translate that knowledge directly to your AI infrastructure. If you&#8217;re running code execution agents against sensitive data in Bedrock and you haven&#8217;t instrumented DNS as an exfiltration channel, now you have your reason.</p><h3>5. Linux Foundation raises $12.5 million from AI vendors to fix what their tools helped break</h3><p>The Linux Foundation announced $12.5 million in grant funding from Anthropic, AWS, GitHub, Google, Google DeepMind, Microsoft, and OpenAI to advance open source software security (Linux Foundation, OpenSSF). The funding flows through Alpha-Omega and the Open Source Security Foundation. The stated problem is that AI tools are generating vulnerability reports at a volume that open-source maintainers cannot triage or remediate, degrading the security posture of the software supply chain. AWS contributed an additional $2.5 million to Alpha-Omega, in addition to the pooled amount.</p><p><strong>Why it matters</strong></p><ul><li><p>The same organizations whose AI tools created the report flood are funding the solution. This characterizes the governance dynamic precisely, that vendors profit from deployment and are now asked to fund the externalized costs on the maintainer community.</p></li><li><p>Overwhelming maintainers with AI-generated findings lowers average signal quality. Funding addresses capacity but doesn&#8217;t solve the signal-to-noise problem alone.</p></li><li><p>This is the first major coordinated industry response to the specific problem of AI-generated report volume stressing the open source security ecosystem.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Factor the current maintainer backlog into your software composition analysis program. Critical open source dependencies may carry known vulnerabilities sitting in a backlogged queue rather than getting remediated.</p></li><li><p>Watch what Alpha-Omega and OpenSSF deliver from this investment over the next twelve months. The commitment matters less than whether the tooling measurably improves triage capacity.</p></li><li><p>Ask your security vendors how they handle AI-generated findings before surfacing them to your team. The same noise problem exists inside your tooling stack.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>$12.5 million is the right direction, yet not nearly enough. Open source maintainers are largely volunteers managing the infrastructure that the global software supply chain runs on. The AI-generated report flood is a problem these vendors created while selling velocity gains to enterprises.</p><p>The coordination signal matters more than the dollar amount. You rarely see Google, Microsoft, AWS, Anthropic, and OpenAI announce joint anything. When competitors fund a shared problem together, the liability exposure of inaction exceeds the competitive cost of cooperating. Given how much of the internet runs on open source that these companies&#8217; AI tools are now stressing, the math on joint action isn&#8217;t complicated.</p><h3>6. Pentagon moves to replace Anthropic while the lawsuit works through the courts</h3><p>TechCrunch reported that the Pentagon is actively developing alternative AI capability paths to replace Anthropic&#8217;s Claude across defense applications (TechCrunch). This follows the Defense Department&#8217;s February designation of Anthropic as a supply chain security risk and Anthropic&#8217;s subsequent lawsuit against the Trump administration. This confirms that the replacement effort has shifted from contingency planning to active technical development. More than 875 Google and OpenAI employees have signed an open letter supporting Anthropic&#8217;s position.</p><p><strong>Why it matters</strong></p><ul><li><p>Active technical development of replacements, rather than contingency planning, signals DoD confidence that the Anthropic designation will hold through the litigation cycle.</p></li><li><p>Defense contractors relying on Claude for active program work now face migration timelines driven by someone else&#8217;s legal and procurement decisions.</p></li><li><p>The 875-employee response across competing firms signals the tech workforce treats this as a legitimacy question about AI governance, not a routine vendor dispute.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If your organization operates in the defense industrial base, review AI vendor contracts now for comparable ethical-use clauses and their enforceability, before further redesignations affect your supply chain.</p></li><li><p>Track the Anthropic lawsuit. The outcome defines what ethical use provisions in AI contracts are worth in federal procurement.</p></li><li><p>Evaluate AI vendor concentration risk in your stack. If one supply chain designation event could disrupt your programs, that&#8217;s a single point of failure worth addressing.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The supply chain risk designation was built for foreign adversaries. Applying it to a domestic AI company for writing autonomous weapons prohibitions into a contract is a significant precedent that the press is underweighting. The designation signals that safety constraints are now framed as operational liabilities in defense procurement, not risk mitigation.</p><p>If that framing spreads to other acquisition decisions, the AI vendors most willing to remove safety constraints gain a competitive advantage in a large and growing federal spending category. Watch the lawsuit and the follow-on procurement awards carefully. Both will tell you where this governance experiment ends up.</p><h3>7. CSA&#8217;s 2026 cloud and AI security report documents the identity explosion</h3><p>The Cloud Security Alliance published its State of Cloud and AI Security 2026 on March 13, finding the average enterprise now manages 100 machine and non-human identities for every one human identity (CSA). Forgotten or misconfigured cloud credentials declined from 84% in 2024 to 65% in 2026. Ninety-two percent of executives report business-impacting security compromises, most from preventable risks. The report identifies decentralized AI agents as the primary driver of the NHI expansion and calls for continuous exposure management to replace static patching cycles.</p><p><strong>Why it matters</strong></p><ul><li><p>A 100:1 machine-to-human identity ratio means the traditional IAM program built around human users is managing a fundamentally different problem than it was designed for.</p></li><li><p>Credential misconfiguration persisting at 65% suggests the improvement rate won&#8217;t match the velocity of AI-driven identity expansion.</p></li><li><p>A 92% executive compromise from preventable risks indicates the gap isn&#8217;t a detection-sophistication problem. Organizations know the controls and aren&#8217;t applying them at the required scale.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit NHI management practices against the same standards applied to human identities: lifecycle management, least privilege, and regular access reviews.</p></li><li><p>Deploy continuous credential exposure monitoring specifically for machine identities and AI agent service accounts.</p></li><li><p>Shift the board-level narrative from maturity scores to continuous exposure management. That&#8217;s where enterprise frameworks are heading.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>A hundred machine identities for every human one, and most organizations manage them with IAM tooling built for a 10-to-1 ratio. The math doesn&#8217;t work. The credential improvement trend from 84% to 65% is real progress, but 65% still represents a failure rate I wouldn&#8217;t accept in any other critical control domain.</p><p>Every new agentic deployment creates more identities, tokens, service accounts, and API keys. If you don&#8217;t have a clear owner for non-human identity governance today, you have a gap that will become a breach within twelve months. Find the owner. Document the scope. Don&#8217;t wait for the incident.</p><h3>8. Jozu Agent Guard launches after watching an AI agent bypass governance in four commands</h3><p>Jozu announced Jozu Agent Guard on March 17, a zero-trust runtime that executes AI agents, models, and MCP servers with policy enforcement built outside the model&#8217;s control plane and hardcoded against agent-level override (Help Net Security). The architecture decision came directly from internal testing: during product development, Jozu observed an AI agent bypass the governance controls the product was designed to enforce in four commands. That failure drove the decision to move policy enforcement entirely outside the execution layer the agent can influence.</p><p><strong>Why it matters</strong></p><ul><li><p>A product built specifically to constrain AI agents was bypassed in four commands during its own testing. The threat model has to assume the agent itself will attempt to circumvent governance. Cooperative compliance is not a valid design assumption.</p></li><li><p>MCP server isolation is underprovided. MCP servers frequently carry production credentials and broad tool access, and running them in shared agent environments creates privilege escalation paths most organizations haven&#8217;t mapped.</p></li><li><p>Three AI agent security products launching in four days signals enterprise buying is active in this space right now.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Require AI agent security vendors to demonstrate their product against an adversarial agent in a live environment. Demand the failure modes alongside the happy path.</p></li><li><p>Treat MCP server execution environments as sensitive infrastructure requiring isolation equivalent to your most privileged workloads.</p></li><li><p>Add governance bypass testing to your AI red team scope before the next production agent deployment.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The four-command bypass during their own testing is the most honest vendor disclosure I&#8217;ve seen about AI agent security in the past year. Most vendors demo the happy path and skip the part where their product got circumvented. Jozu disclosed it and changed the architecture. That&#8217;s how security engineering is supposed to work.</p><p>The uncomfortable implication for everyone else: if a product built specifically to constrain AI agents was bypassed in four commands, ask yourself what your existing controls look like against an agent actively trying to exceed its permissions. If you haven&#8217;t run that test, you don&#8217;t have an answer.</p><div><hr></div><h3>9. Token Security builds intent-based controls for AI agent permissions</h3><p>Token Security announced intent-based AI agent security on March 18, governing autonomous agents by scoping their permissions to declared operational purpose rather than granting standing broad access (Help Net Security). The system creates purpose-defined permission envelopes that expire at task completion, with runtime enforcement preventing actions outside the declared intent. Token Security&#8217;s CEO stated directly that prompt filtering and guardrails were not designed to contain the security risks of autonomous AI agents, pointing to the architectural limitation of relying on the model&#8217;s output layer for enforcement.</p><p><strong>Why it matters</strong></p><ul><li><p>Purpose-aligned permissions address a structural problem in current agent deployment: agents inheriting credential scopes far exceeding what any single task requires.</p></li><li><p>Explicit acknowledgment that content filtering can&#8217;t do this job alone represents where serious practitioner thinking is converging. The field is moving from output layer controls toward architectural access controls.</p></li><li><p>Paired with Jozu, Entro, and Microsoft Entra Agent ID announcements this same week, this reflects a coherent market thesis forming around agent identity and least privilege as primary security controls.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map current AI agent deployments against one question: does each agent hold only the permissions it needs for its specific task? If you can&#8217;t answer quickly, your access governance is already too loose.</p></li><li><p>Evaluate intent-based and purpose-scoped access controls in your next AI security procurement cycle.</p></li><li><p>Brief your identity team on AI agent access management before your security team deploys solutions they haven&#8217;t reviewed. These tools touch the same credential infrastructure.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Least privilege applied to agents is the same principle that has protected privileged service accounts in traditional architectures for decades. The problem is that most AI agent deployments aren&#8217;t being treated like privileged service accounts. They get broad collaboration access by default, and nobody asks why.</p><p>Intent-based controls force the right question: what is this agent for? If you can answer precisely, you can scope permissions precisely. If you can&#8217;t answer precisely, that is the real governance problem. You&#8217;ve deployed an agent without a defined operational boundary, and your control over it is largely fictional.</p><div><hr></div><h3>10. NIST receives formal research submissions on securing AI agents</h3><p>On March 18, UC Berkeley&#8217;s Center for Long-Term Cybersecurity submitted a formal response to NIST&#8217;s CAISI RFI on AI agent security, urging prioritization of standardization, incident reporting frameworks, talent pipelines, and adaptive governance (CLTC UC Berkeley). The Computer and Communications Industry Association submitted parallel comments advocating for multistakeholder processes and alignment with existing NIST frameworks (CCIA). NIST&#8217;s National Cybersecurity Center of Excellence also holds a separate comment period open through April 2 on a concept paper covering identity and authorization for AI agents.</p><p><strong>Why it matters</strong></p><ul><li><p>The gap between NIST collecting input and usable standards publishing is measured in years. Your agents are running now, under no binding identity or authorization standard.</p></li><li><p>Berkeley&#8217;s call for incident reporting infrastructure acknowledges a structural gap: no systematic mechanism exists for learning from AI agent security failures across organizations.</p></li><li><p>The NCCoE concept paper on agent identity and authorization is where future compliance requirements will originate. Comments submitted now shape what those requirements demand.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read the NCCoE concept paper at nccoe.nist.gov and submit comments before April 2 if your organization deploys agents. Operational experience is what NIST is specifically asking for.</p></li><li><p>Treat the Berkeley and CCIA submissions as intelligence on where auditors will focus within 18 to 36 months.</p></li><li><p>Stand up basic agent identity logging now using existing IAM controls. Don&#8217;t wait for NIST to finalize anything.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>NIST is moving faster on agentic AI security than I expected two years ago. That still isn&#8217;t fast enough to matter for organizations deploying agents today. Best case from the current comment cycle: interim guidance in twelve months. Binding controls will take longer.</p><p>Berkeley&#8217;s call for incident reporting is the right recommendation and it will face the same resistance every mandatory reporting regime has faced. Voluntary frameworks will come first, get ignored, and get teeth after the third or fourth major public incident. That&#8217;s the pattern. Plan for it and build your own internal incident tracking capability now.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h3>Entro Security builds a governed map of what your AI agents access in production</h3><p>Entro Security launched its Agentic Governance and Administration platform, extending non-human identity security coverage specifically to AI agents (GlobeNewswire, Help Net Security). The platform builds structured AI agent profiles from three observable layers. First, sources: the endpoints, agent platforms, cloud environments, and MCP servers where agents execute. Second, targets: the enterprise assets and applications each agent accesses. Third, identities: the human accounts, non-human identities, and secrets each agent uses to operate. AGA provides MCP server activity visibility and policy enforcement, audit trails for both allowed and blocked activity, and controls against unsanctioned MCP targets and AI client behaviors.</p><p><strong>Why it matters</strong></p><ul><li><p>Most organizations deploying AI agents don&#8217;t have a single governed view of what agents are running, what they access, and which identities they use. AGA builds that view from execution telemetry rather than documentation that goes stale immediately after it&#8217;s written.</p></li><li><p>MCP server governance is nearly absent from enterprise security programs today, despite MCP servers frequently holding production credentials and broad access to sensitive systems.</p></li><li><p>The NHI-first architecture lets organizations with existing non-human identity programs extend that coverage to AI agents rather than building a separate program from scratch.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Before the next AI agent deployment, require answers to three questions from observable telemetry: where does it run, what does it touch, and which identities does it use? If you need documentation rather than telemetry to answer, you don&#8217;t have governance.</p></li><li><p>Add MCP server inventory to asset management now. MCP servers deploy through developer workflows without formal change management, and retroactive cataloguing gets harder with each deployment.</p></li><li><p>Assess whether your current NHI security program explicitly covers AI agent identities. If it doesn&#8217;t, extend it or stand up a parallel track with a clear accountable owner.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This one didn&#8217;t get coverage this week because it launched during RSA prep season when every security vendor fights for the same column inches. That&#8217;s exactly why it&#8217;s here. The problem AGA addresses is what I call dark matter governance: AI agents operating in your environment that nobody catalogued because they deployed through platforms your traditional asset management tools don&#8217;t see.</p><p>The MCP visibility layer is the operationally useful piece. MCP servers multiply fast, are deployed by individual developers without change management review, and frequently hold credentials for production systems. An agent you haven&#8217;t catalogued connecting to an MCP server you haven&#8217;t governed is a permissions sprawl problem that compounds with every new deployment. Get a governed view of that surface before your adversary maps it for you. </p><p>If you found this analysis useful, subscribe at <a href="https://rockcybermusings.com/">rockcybermusings.com</a> for weekly intelligence on AI security developments.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Bitcoinworld. (2026, March). <em>Rogue AI agent sparks critical security crisis at Meta, exposing sensitive data</em>. https://bitcoinworld.co.in/meta-rogue-ai-agent-security-breach/</p><p>Cloud Security Alliance. (2026, March 13). <em>The state of cloud and AI security in 2026</em>. https://cloudsecurityalliance.org/blog/2026/03/13/the-state-of-cloud-and-ai-security-in-2026</p><p>Computer and Communications Industry Association. (2026, March). <em>CCIA submits comments to NIST regarding privacy and security of AI agents</em>. https://ccianet.org/news/2026/03/ccia-submits-comments-to-nist-regarding-privacy-and-security-of-ai-agents/</p><p>Council of the European Union. (2026, March 13). <em>Council agrees position to streamline rules on artificial intelligence</em>. https://www.consilium.europa.eu/en/press/press-releases/2026/03/13/council-agrees-position-to-streamline-rules-on-artificial-intelligence/</p><p>Entro Security. (2026, March 18). <em>Entro launches agentic governance and administration to bring visibility and control to AI access across the enterprise</em>. GlobeNewswire. https://www.globenewswire.com/news-release/2026/03/18/3258229/0/en/Entro-Launches-Agentic-Governance-Administration-to-Bring-Visibility-and-Control-to-AI-Access-Across-the-Enterprise.html</p><p>HackerNoob. (2026, March). <em>Meta&#8217;s rogue AI agent: Sev 1 security incident and how to sandbox AI agents properly</em>. https://hackernoob.tips/meta-rogue-ai-agent-sev1-how-to-sandbox-ai-agents/</p><p>Help Net Security. (2026, March 17). <em>Jozu Agent Guard targets AI agents that evade controls</em>. https://www.helpnetsecurity.com/2026/03/17/jozu-agent-guard-targets-ai-agents-that-evade-controls/</p><p>Help Net Security. (2026, March 18). <em>Token Security advances AI agent protection with intent-based controls</em>. https://www.helpnetsecurity.com/2026/03/18/token-security-intent-based-ai-agent-security/</p><p>Help Net Security. (2026, March 18). <em>Big tech companies step in to support the open source security ecosystem</em>. https://www.helpnetsecurity.com/2026/03/18/linux-foundation-open-source-security-12-5-million-funding/</p><p>Help Net Security. (2026, March 19). <em>Entro Security AGA brings governance and control to enterprise AI agents and access</em>. https://www.helpnetsecurity.com/2026/03/19/entro-agentic-governance-administration/</p><p>HiddenLayer. (2026, March 18). <em>HiddenLayer releases the 2026 AI threat landscape report</em>. PR Newswire. https://finance.yahoo.com/news/hiddenlayer-releases-2026-ai-threat-140000928.html</p><p>Linux Foundation. (2026, March 17). <em>Linux Foundation announces $12.5 million in grant funding from leading organizations to advance open source security</em>. https://www.linuxfoundation.org/press/linux-foundation-announces-12.5-million-in-grant-funding-from-leading-organizations-to-advance-open-source-security</p><p>SC Media. (2026, March). <em>AWS Bedrock tool vulnerability allows data exfiltration via DNS leaks</em>. https://www.scworld.com/brief/aws-bedrock-vulnerability-allows-data-exfiltration-via-dns-leaks</p><p>TechCrunch. (2026, March 17). <em>The Pentagon is developing alternatives to Anthropic, report says</em>. https://techcrunch.com/2026/03/17/the-pentagon-is-developing-alternatives-to-anthropic-report-says/</p><p>The Hacker News. (2026, March 17). <em>AI flaws in Amazon Bedrock, LangSmith, and SGLang enable data exfiltration and RCE</em>. https://thehackernews.com/2026/03/ai-flaws-in-amazon-bedrock-langsmith.html</p><p>UC Berkeley Center for Long-Term Cybersecurity. (2026, March 18). <em>Researchers submit response to U.S. government request on security considerations for AI agents</em>. https://cltc.berkeley.edu/2026/03/18/researchers-submit-response-to-u-s-government-request-on-security-considerations-for-ai-agents/</p>]]></content:encoded></item><item><title><![CDATA[AI Agent Authentication Gets the Hard Part Right. Authorization Is Still Your Problem.]]></title><description><![CDATA[IETF's new AI agent auth draft nails identity with WIMSE and SPIFFE but skips per-action authorization.]]></description><link>https://www.rockcybermusings.com/p/i-agent-authentication-authorization-gap</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/i-agent-authentication-authorization-gap</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 17 Mar 2026 12:50:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!bS5L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bS5L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bS5L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bS5L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bS5L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bS5L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bS5L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2920581,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190013993?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bS5L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bS5L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bS5L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bS5L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aa7583c-e0fb-4920-a994-e8b6bb128fa4_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The IETF just published its most ambitious attempt to standardize how AI agents prove their identity across systems. Draft-klrc-aiagent-auth-00, dropped March 2, 2026, composes WIMSE, SPIFFE, and OAuth 2.0 into a 26-page framework called AIMS (Agent Identity Management System). The authentication layer is solid. The authorization layer stops at the token boundary. The Security Considerations section contains two words: &#8220;TODO Security.&#8221; If you&#8217;re deploying agentic systems in production, you need to understand where this draft helps you and where you still have to build your own controls.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/i-agent-authentication-authorization-gap?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/i-agent-authentication-authorization-gap?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Before I get into specifics, a quick note on what this document actually is. An IETF Internet-Draft (I-D) is a working document, the raw material that may eventually become an RFC (an official Internet standard). This one is version -00, the very first public iteration from Pieter Kasselman (Defakto Security), Jean-Francois Lombardo (AWS), Yaroslav Rosomakho (Zscaler), and Brian Campbell (Ping Identity). Criticizing a -00 draft for incompleteness is a bit like reviewing someone&#8217;s outline and complaining the conclusion is thin. That said, people are already reading this as deployment guidance, and the gaps matter for anyone building agentic systems today. So let&#8217;s talk about what it covers, what it doesn&#8217;t cover yet, and what you need to build yourself while the standards process catches up.</p><h2>The good news: agents are workloads, and workloads have an identity stack</h2><p>The draft&#8217;s foundational thesis gets it right that AI agents should be treated as workloads, not as some new identity category requiring new protocols and running instances of software executing specific tasks. That framing unlocks SPIFFE&#8217;s attestation-bound cryptographic identity, WIMSE&#8217;s cross-system workload semantics, and OAuth 2.0&#8217;s delegation framework. No new protocols needed.</p><p>This matters because SPIFFE already works at scale. Uber processes billions of attestations daily through SPIRE. Block runs the full SPIFFE+WIMSE+OAuth stack in production. The draft codifies patterns that companies with real security engineering teams already deploy.</p><p>The WIMSE identifiers specified in the draft bind agent identity to the execution environment through hardware-rooted attestation. A SPIRE agent on each node performs workload attestation by examining the kernel or querying the orchestration platform. Your agent&#8217;s identity gets measured from where it runs, not merely asserted by who registered it. An OAuth client_id is a registration artifact. A SPIFFE ID is cryptographic proof that Agent X is actually Agent X, running in the expected environment.</p><p>The draft also gets credentials right. Short-lived, cryptographically bound, explicit expiration. Static API keys are called out as unsuitable for agent authentication: bearer artifacts with no cryptographic binding, no identity conveyance, operationally painful to rotate.</p><p>That warning couldn&#8217;t come at a better time. Astrix Security analyzed over 5,200 open-source MCP server implementations and found that 53% rely on static API keys or Personal Access Tokens. Only 8.5% use OAuth. The ecosystem is building on exactly the anti-pattern the draft condemns.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L6DS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L6DS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 424w, https://substackcdn.com/image/fetch/$s_!L6DS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 848w, https://substackcdn.com/image/fetch/$s_!L6DS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 1272w, https://substackcdn.com/image/fetch/$s_!L6DS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L6DS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png" width="1456" height="874" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:874,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:483418,&quot;alt&quot;:&quot;Pie chart showing 53% of MCP servers use static API keys versus 8.5% using OAuth&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190013993?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Pie chart showing 53% of MCP servers use static API keys versus 8.5% using OAuth" title="Pie chart showing 53% of MCP servers use static API keys versus 8.5% using OAuth" srcset="https://substackcdn.com/image/fetch/$s_!L6DS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 424w, https://substackcdn.com/image/fetch/$s_!L6DS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 848w, https://substackcdn.com/image/fetch/$s_!L6DS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 1272w, https://substackcdn.com/image/fetch/$s_!L6DS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80358181-31cd-42dd-ac08-ce32048aec9f_3748x2250.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: MCP Server Authentication Methods</figcaption></figure></div><h2>Transaction Tokens solve the lateral movement problem</h2><p>Section 10.4 addresses a real attack vector most frameworks ignore. When access tokens propagate through internal microservice chains within an agent workflow, every hop creates a theft and replay opportunity.</p><p>The draft&#8217;s answer is Transaction Tokens (draft-ietf-oauth-transaction-tokens-08). Short-lived, signed JWTs that bind user identity, workload identity, and authorization context to a specific transaction. Lifetimes are measured in seconds to minutes. Cryptographic signatures prevent context modification. You can&#8217;t grab a Transaction Token from one transaction and replay it in another because the transaction context is cryptographically sealed. A companion draft (draft-oauth-transaction-tokens-for-agents-04) extends this with agent-specific fields for the acting agent, the initiating human, and operational constraints.</p><p>The draft also correctly identifies tools forwarding access tokens to downstream services as an anti-pattern.</p><h2>The authorization gap: where scope alone isn&#8217;t enough</h2><p>Here&#8217;s where the draft&#8217;s -00 status shows. Once an OAuth access token gets issued with a set of scopes, every action within those scopes proceeds unchecked until the token expires. No per-action evaluation. No consequence assessment. No behavioral feedback loop. The authors clearly know authorization needs more work (the AIMS conceptual model describes layers that the spec hasn&#8217;t filled in yet), but anyone reading this draft as a deployment blueprint today will inherit that gap.</p><p>Think about what that means in practice. An agent with email:send scope authorized to send meeting notes can use that same scope to email every contact in the address book a different message. Each action is technically within scope. The framework treats them identically. The authorization decision happened once, at token issuance. Everything after that is a free pass.</p><p>OWASP&#8217;s Top 10 for Agentic Applications draws a distinction that the draft hasn&#8217;t addressed yet: <em><strong>least agency versus least privilege</strong></em>. Least privilege asks what the agent can access. Least agency extends that to how much freedom the agent has to act on that access without checking back.</p><p>The term &#8220;least agency&#8221; appears nowhere in the draft. Section 10.8 says agents should request minimum scopes and authorization details. That&#8217;s least privilege applied to OAuth scopes. Standard stuff. It does nothing to constrain autonomous decision-making within those scopes.</p><p>OWASP&#8217;s ASI03 (Identity and Privilege Abuse) mitigation guidance recommends per-action authorization through a centralized policy engine. Not once at token issuance. At each privileged step. The draft doesn&#8217;t provide a mechanism for this yet, and future revisions may address it. In the meantime, you need to build that layer yourself.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hGET!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hGET!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hGET!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hGET!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hGET!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hGET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2429486,&quot;alt&quot;:&quot;Table showing IETF draft coverage levels against OWASP ASI01 through ASI10 risk categories&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190013993?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Table showing IETF draft coverage levels against OWASP ASI01 through ASI10 risk categories" title="Table showing IETF draft coverage levels against OWASP ASI01 through ASI10 risk categories" srcset="https://substackcdn.com/image/fetch/$s_!hGET!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hGET!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hGET!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hGET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c69cc0-87ac-4b2d-b859-0c3c17b56f8a_2048x2048.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: OWASP Agentic Top 10 Coverage by IETF Draft</figcaption></figure></div><h2>Your token says &#8220;allowed.&#8221; What it can&#8217;t say is &#8220;should you?&#8221;</h2><p>The deeper issue goes beyond per-action evaluation. The draft in its current form contains no mechanisms for assessing the potential impact of an action before permitting it. No concept of blast radius. No reversibility check. No impact severity score. Again, this is version -00. These concepts may arrive in later revisions. They&#8217;re absent today.</p><p>Consider the practical difference. An agent with files:read_write scope can read one file or delete every file in scope. The OAuth framework treats these as equivalent actions. They aren&#8217;t. One is routine. The other is catastrophic and irreversible.</p><p>Consequence-based authorization asks three questions per permission:</p><ol><li><p>What&#8217;s the worst action this agent can take? </p></li><li><p>Is the damage reversible? </p></li><li><p>Can you reverse it within an acceptable recovery window? </p></li></ol><p>OAuth scopes can&#8217;t answer any of these.</p><p>The emerging practice of graduated trust models (read-only, then draft-only, then supervised execution, then earned autonomy) represents an informal consequence-based approach. Most practitioners agree that most agents never earn full autonomy in high-stakes contexts. That&#8217;s the correct outcome. The draft provides no framework for expressing or enforcing these graduation stages.</p><p>OWASP&#8217;s ASI08 (Cascading Failures) recommends blast-radius caps and digital twin replay testing. Run recorded agent actions in an isolated environment first. See if sequences trigger cascading failures before expanding policy permissions. Future revisions of the draft could incorporate these concepts. For now, they&#8217;re outside its scope.</p><h2>The observability gap: strong detection, no policy feedback loop</h2><p>Section 11&#8217;s observability requirements are genuinely strong for detection and audit. Seven minimum audit event fields. Correlation across agents, tools, services, and LLMs. The ability to reconstruct complete execution chains, including delegated authority and intermediate calls.</p><p>The draft calls observability &#8220;a security control, not solely an operational feature.&#8221; Correct. Then it integrates the OpenID Shared Signals Framework with CAEP (Continuous Access Evaluation Profile) for real-time signal delivery. Also good.</p><p>The problem is that the AIMS conceptual model in Section 4 promises observability that can &#8220;dynamically modify authorization decisions based on observed behavior and system state.&#8221; The actual specification delivers reactive remediation, terminate sessions, discard tokens, re-acquire with updated constraints. Detection flows to dashboards and SIEM tools. It doesn&#8217;t feed into the policy decision point that evaluates each authorization request. The conceptual model is ahead of the spec, which is normal for a -00 draft. The spec will likely catch up. You can&#8217;t afford to wait for it.</p><p>An agent exhibiting anomalous tool invocation patterns should see its authorization dynamically narrowed. Not through token revocation (which is all-or-nothing) but through policy-level constraints on permitted actions. The draft gives you a circuit breaker when you need a rheostat.</p><p>NIST SP 800-207 (Zero Trust Architecture) explicitly recommends a trust score that changes dynamically based on entity behavior patterns, feeding into the policy engine. Context-aware authorization systems from companies such as Zscaler and StrongDM already implement this pattern in production (not endorsing either). I&#8217;d expect future revisions of the draft to engage with these models, especially given that Zscaler&#8217;s Rosomakho is one of the four co-authors.</p><h2>AuthZEN fills the gap the draft hasn&#8217;t reached yet</h2><p>The most interesting omission in the current document is that AuthZEN (OpenID Authorization API 1.0) was approved as a Final Specification in January 2026. It standardizes a transport-agnostic API where any Policy Enforcement Point queries any Policy Decision Point, regardless of vendor. The information model is a four-element tuple: </p><p>Subject (the agent), Action (the operation), Resource (the target), Context (ambient attributes).</p><p>Every agent tool invocation maps cleanly to an AuthZEN evaluation: subject is the agent&#8217;s SPIFFE ID, action is &#8220;send_email,&#8221; resource is &#8220;contact_list,&#8221; context carries the delegating user, blast radius classification, reversibility flag, and behavioral anomaly score. The context object is extensible and open-ended. It was designed for exactly this kind of dynamic, attribute-rich decision-making.</p><p>The draft references AuthZEN in its normative references. The body text doesn&#8217;t discuss it yet. Given that AuthZEN solves the draft&#8217;s most significant open question, I&#8217;d bet it features prominently in the next revision. For now, that connection is yours to make.</p><p>Three policy engines deserve attention for filling that gap. OPA (Open Policy Agent), a CNCF Graduated project, evaluates structured JSON input against declarative policies with sub-millisecond latency. Cedar, from AWS, offers automated reasoning via SMT solver that can mathematically prove properties about policies and benchmarks at 42 to 60 times faster than Rego. Topaz, from Aserto (whose CEO co-authored the AuthZEN specification), combines OPA&#8217;s decision engine with a built-in Zanzibar-style relationship graph.</p><p>OAuth provides coarse-grained delegation, who can access what resource category. Policy engines provide fine-grained runtime evaluation, should this specific action on this specific resource proceed given current context. That layered model is where the draft needs to go next. Until it gets there, you build it yourself.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RHmI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RHmI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 424w, https://substackcdn.com/image/fetch/$s_!RHmI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 848w, https://substackcdn.com/image/fetch/$s_!RHmI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 1272w, https://substackcdn.com/image/fetch/$s_!RHmI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RHmI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png" width="1456" height="118" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:118,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:372395,&quot;alt&quot;:&quot;Diagram showing OAuth handling coarse-grained identity delegation while AuthZEN and policy engines handle per-action runtime evaluation&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190013993?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Diagram showing OAuth handling coarse-grained identity delegation while AuthZEN and policy engines handle per-action runtime evaluation" title="Diagram showing OAuth handling coarse-grained identity delegation while AuthZEN and policy engines handle per-action runtime evaluation" srcset="https://substackcdn.com/image/fetch/$s_!RHmI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 424w, https://substackcdn.com/image/fetch/$s_!RHmI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 848w, https://substackcdn.com/image/fetch/$s_!RHmI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 1272w, https://substackcdn.com/image/fetch/$s_!RHmI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb87f5e8-09ab-4e24-9acd-12fbc9d0790c_8192x664.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Figure 3: Authentication vs. Authorization Layer Responsibilities</figcaption></figure></div><h2>Regulatory timelines won&#8217;t wait for standards completion</h2><p>The EU AI Act&#8217;s high-risk system requirements take full effect August 2, 2026 (as of this writing, anyway). Five months from now. Article 14 requires human oversight. Article 26 requires deployers to keep automatically generated logs for at least six months. The draft&#8217;s identity-bound audit trails and CIBA-based human-in-the-loop mechanism directly support both.</p><p>NIST launched two converging initiatives in February 2026. The NCCoE concept paper on AI agent identity and authorization, and the AI Agent Standards Initiative covering security controls, identity, and testing. Both center on WIMSE/SPIFFE + OAuth. Both explicitly include policy-based access control, the piece the IETF draft&#8217;s -00 revision hasn&#8217;t specified yet.</p><p>The Colorado AI Act establishes a &#8220;reasonable care&#8221; standard for high-risk AI systems effective June 30, 2026. Widely adopted standards become evidence of reasonable care in court. The identity architecture the draft describes will likely qualify for authentication. You still need to build the authorization layer yourself.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wPEA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wPEA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 424w, https://substackcdn.com/image/fetch/$s_!wPEA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 848w, https://substackcdn.com/image/fetch/$s_!wPEA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 1272w, https://substackcdn.com/image/fetch/$s_!wPEA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wPEA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png" width="1456" height="323" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:323,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:613784,&quot;alt&quot;:&quot;Timeline chart showing EU AI Act, Colorado AI Act, and NIST initiative deadlines converging in 2026&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190013993?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Timeline chart showing EU AI Act, Colorado AI Act, and NIST initiative deadlines converging in 2026" title="Timeline chart showing EU AI Act, Colorado AI Act, and NIST initiative deadlines converging in 2026" srcset="https://substackcdn.com/image/fetch/$s_!wPEA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 424w, https://substackcdn.com/image/fetch/$s_!wPEA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 848w, https://substackcdn.com/image/fetch/$s_!wPEA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 1272w, https://substackcdn.com/image/fetch/$s_!wPEA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa31d4ce-233e-42ba-af0c-4c0ddac41e9d_7670x1700.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Figure 4: Regulatory Compliance Timeline for AI Agent Systems</figcaption></figure></div><h2>MCP and A2A still have fundamental identity gaps</h2><p>Mapping the IETF draft&#8217;s framework onto the Model Context Protocol reveals how far the ecosystem still has to travel. MCP identifies agents as OAuth clients with a client_id, a registration artifact with no attestation binding. No SPIFFE identity verification. No attestation mechanism. No multi-hop delegation. No standard mapping between tool names and OAuth scopes. The draft recommends Workload Proof Tokens for proof-of-possession. MCP uses bearer tokens.</p><p>MCP&#8217;s OAuth model is human-centric (Authorization Code + PKCE). The Client Credentials Grant for machine-to-machine authentication was removed from the spec and is only returning through an extension. Fully autonomous agents have no standard authentication path in MCP today. Google&#8217;s A2A protocol has similar gaps: self-declared identities with no attestation binding, credential acquisition out of scope, authorization left to the receiving agent.</p><p>Riptides demonstrated the draft&#8217;s compositional pattern working for MCP in practice. Each workload gets a SPIFFE SVID, used as a software statement in Dynamic Client Registration and as a JWT assertion for client authentication. The pattern works. It required significant custom integration that no standard profile defines.</p><h2>What you should build now</h2><p>Don&#8217;t wait for standards completion. The threat model OWASP defined already exists. The regulatory deadlines are set.</p><p>Start with SPIFFE/SPIRE for attestation-bound agent identity. Use SVIDs as JWT assertions (RFC 7523) to obtain OAuth tokens. This follows the pattern the draft describes and Riptides validated in production.</p><p>Deploy an AuthZEN-compliant PDP (OPA, Cedar, or Topaz). Evaluate every agent tool invocation against dynamic policy. Pass agent identity, action details, resource metadata, delegation context, and behavioral signals in the AuthZEN context object.</p><p>Write Cedar or Rego policies encoding blast-radius thresholds, reversibility requirements, graduated trust levels, and human-in-the-loop triggers. Version-control policies alongside application code.</p><p>Tag every tool and action with impact metadata: blast_radius, reversible, data_sensitivity, scope. Enforce that irreversible high-blast-radius actions require explicit human approval through CIBA step-up authorization.</p><p>Feed observability data into the policy engine as real-time context attributes. Stop sending behavioral signals only to SIEM dashboards for post-hoc investigation. Make them first-class policy inputs.</p><p><strong>Key Takeaway:</strong> The IETF draft gives you a strong answer to &#8220;is this really Agent X?&#8221; It hasn&#8217;t answered &#8220;should Agent X do this specific thing right now?&#8221; yet. That gap will close as the draft matures. In the meantime, authentication without per-action authorization is a locked front door with open windows. Build the authorization layer now.</p><h3>What to do next</h3><p>If you&#8217;re building agentic systems and trying to figure out where identity controls fit, start with the CARE framework at <a href="https://rockcyber.com">rockcyber.com</a> for mapping security controls to business risk outcomes. The RISE framework helps you evaluate where your organization sits on the AI security maturity curve, particularly useful for figuring out which authorization controls to prioritize first.</p><p>The agent identity problem is a microcosm of the larger question the book addresses: how do you govern autonomous systems when the blast radius of failure compounds faster than your ability to detect it?</p><p>More analysis on agentic AI security, MCP authorization gaps, and practical frameworks for building authorization layers at <a href="https://rockcybermusings.com">rockcybermusings.com</a>.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 29 March 6, 2026 - March 12, 2026]]></title><description><![CDATA[When AI Companies Sue the Government and OpenAI Enters the Security Market]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260306-202600312</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260306-202600312</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 13 Mar 2026 12:50:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hJU0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hJU0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hJU0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!hJU0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!hJU0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!hJU0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hJU0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190823556?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hJU0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!hJU0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!hJU0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!hJU0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6664727a-c9fd-4acb-b74f-259d770fda92_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The week of March 6-12, 2026, handed us a story that was coming... Anthropic filed suit against the Pentagon for blacklisting it as a national security risk. In the same week, the White House released a new cyber strategy, OpenAI launched a vulnerability-scanning agent aimed squarely at the enterprise security market, and two major federal regulatory deadlines expired. This is that week.</p><p>AI Security and AI governance collided this week in federal court, in congressional briefings, and in the server rooms of every organization running an AI agent they don&#8217;t fully understand. The governance frameworks that were supposed to provide clarity are instead amplifying uncertainty, and attackers are exploiting the gap in real time. Here&#8217;s what happened, what it means, and what to do about it, from someone who&#8217;s watched this industry long enough to be appropriately paranoid about all of it.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260306-202600312?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260306-202600312?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3>1. Anthropic sues the Pentagon for blacklisting it as a national security risk</h3><p>Anthropic filed two federal lawsuits against the Trump administration after the Department of Defense designated the company a supply chain risk. That designation, typically reserved for foreign adversaries, bars Anthropic from federal contracts and requires defense contractors to certify they don&#8217;t use Claude in any DoD work. The root cause is Anthropic&#8217;s refusal to allow Claude for autonomous weapons or mass surveillance of American citizens. CEO Dario Amodei drew two red lines in contract negotiations, the Pentagon walked, and then labeled the company a national security threat (Fortune, Defense One). Anthropic warns the financial exposure runs to hundreds of millions of dollars.</p><p><strong>Why it matters</strong></p><ul><li><p>This is the first time a U.S.-headquartered AI company has received the supply chain risk designation, a label previously applied only to foreign adversaries.</p></li><li><p>The case tests whether the executive branch can use procurement leverage to override AI developers&#8217; safety commitments, a precedent that extends far beyond Anthropic.</p></li><li><p>Every CISO advising on AI vendor selection now has to factor whether a vendor&#8217;s ethics commitments make it a federal liability.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map your Claude and Anthropic API dependencies now. Know which workflows break if this escalates.</p></li><li><p>Brief your board on what a supply chain risk designation means in federal contracting terms if your organization touches government work.</p></li><li><p>Watch for similar scrutiny applied to other AI vendors with published safety policies. This may not be a one-off.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Anthropic drew a line in the sand (no autonomous weapons, no mass surveillance), and the government responded by calling them a threat. Think about what that signals to every AI developer watching. If you have safety principles that conflict with defense procurement, you get punished for them. The First Amendment angle is interesting, but the real issue is that the executive branch just discovered that supply chain risk designation is a very effective stick, and they used it on a domestic company for the first time. AI safety as a business value just became a liability under the current administration. Read that sentence twice.</p><h3>2. Trump&#8217;s Cyber Strategy for America lands in five pages</h3><p>On March 6, the White House released &#8220;President Trump&#8217;s Cyber Strategy for America&#8221; alongside an executive order on cybercrime (White House, Forrester). The document covers six pillars: offensive cyber operations to shape adversary behavior, regulatory streamlining, federal network modernization, critical infrastructure security, technological superiority, and cyber workforce development. At five pages, it&#8217;s the shortest national cybersecurity strategy in a decade. The strategy explicitly calls for more aggressive offensive operations, &#8220;unprecedented coordination&#8221; between the public and private sectors, and the building of a talent base fluent in autonomous systems and AI-enabled defense.</p><p><strong>Why it matters</strong></p><ul><li><p>Five pages are either a vision document or a placeholder. For practical CISO purposes, it signals direction but provides almost no implementation guidance.</p></li><li><p>The offensive posture language has legal and escalation implications for any organization with a government nexus.</p></li><li><p>Workforce development framed as a national strategic asset means the government will be competing for the same AI security talent you&#8217;re trying to hire.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Map existing compliance obligations against the six pillars. Where regulations get streamlined, understand which requirements might disappear and which you need to maintain voluntarily.</p></li><li><p>Engage your federal liaison if you&#8217;re in a critical infrastructure sector. The public-private coordination language means more government asks are coming.</p></li><li><p>Start building for AI-fluent security talent now. The window before this becomes a serious hiring crunch is closing.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Five pages tells you something: either there&#8217;s a lot more in the classified annex, or this is aspirational language waiting for someone to actually build the plumbing. The workforce section is the sleeper story. AI-enabled defense needs people who understand both AI failure modes and adversarial tradecraft simultaneously. That combination doesn&#8217;t exist at scale anywhere, and we&#8217;re being asked to build it at the same time AI is accelerating attacks. The gap between those two curves is where the next major breach lives.</p><h3>3. OpenAI launches Codex Security and walks into the vulnerability scanning market</h3><p>OpenAI released Codex Security as a research preview, a context-aware AI vulnerability scanning agent that evolved from Aardvark, an internal security research tool OpenAI had tested in private beta since October 2025 (Bloomberg, SecurityWeek). Codex Security analyzes code repositories, pressure-tests suspected vulnerabilities in sandboxed environments, generates proof-of-concept exploits to confirm impact, and proposes fixes. OpenAI&#8217;s own data shows it scanned 1.2 million commits over the preceding 30 days, surfacing 10,561 high-severity issues and approximately 800 critical vulnerabilities. The tool is available free for the next month to ChatGPT Pro, Enterprise, Business, and Edu customers. OpenAI says it can &#8220;identify complex vulnerabilities that other agentic tools miss&#8221; (TechRadar).</p><p><strong>Why it matters</strong></p><ul><li><p>A free, frontier-model-powered vulnerability scanner from OpenAI immediately changes the competitive math for established AppSec vendors whose pricing models depend on the difficulty of this problem.</p></li><li><p>Generating proof-of-concept exploits to confirm vulnerability impact is a significant capability. In the wrong hands, or with a compromised account, this is an exploit generation service.</p></li><li><p>Organizations deploying Codex Security are giving OpenAI&#8217;s systems read access to their codebases. That data handling relationship deserves the same scrutiny as any privileged third-party tool.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Before enabling Codex Security on production repositories, review OpenAI&#8217;s data retention and training policies. Understand whether your code becomes training data.</p></li><li><p>Evaluate Codex Security against your existing SAST tooling on a representative code sample before replacing anything. &#8220;Better than other agentic tools&#8221; is a marketing claim until your team validates it.</p></li><li><p>The proof-of-concept exploit generation feature needs access controls. Restrict which engineers can trigger full exploit confirmation scans.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>OpenAI entering the vulnerability scanner market is not a product launch. It&#8217;s a statement about where AI is heading in security operations. The incumbents in SAST and DAST have been selling the same scan-and-report workflow for a decade. An agent that generates a proof-of-concept exploit to confirm a real finding changes the value proposition significantly. I&#8217;m not surprised OpenAI built this. I&#8217;m watching carefully how they handle the fact that generating exploit code is exactly the capability defenders need and attackers want. The account compromise scenario alone should give your red team ideas.</p><h3>4. NIST AI Agent Standards RFI closes with 932 comments</h3><p>The comment period for NIST&#8217;s Center for AI Standards and Innovation (CAISI) Request for Information on securing AI agent systems closed March 9 with 932 responses (Federal Register, NIST). The RFI, published in January 2026, sought input from industry, academia, and the security community on securing agentic AI development and deployment. The OpenID Foundation submitted a response addressing AI agent identity and authorization. A second comment period focused specifically on identity and authorization for AI agents remains open until April 2.</p><p><strong>Why it matters</strong></p><ul><li><p>932 responses signals broad industry recognition of the problem. The quality of those comments determines whether the resulting standards have operational teeth.</p></li><li><p>Identity and authorization for AI agents is the structural gap behind most agent security failures. If NIST gets this right, it reshapes the risk calculus for enterprise agent deployment.</p></li><li><p>The listening sessions starting in April give practitioners a direct channel to shape what these standards require.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If your organization skipped the first RFI, submit to the identity and authorization comment period before April 2. Your implementation experience is exactly what NIST needs.</p></li><li><p>Start building your AI agent identity architecture now using OAuth 2.0 On-Behalf-Of flows with proper scope constraints. This is the emerging standard pattern.</p></li><li><p>Assign someone to track the AI Agent Standards Initiative. When draft standards publish later this year, you want your red-team comments in front of NIST before they finalize.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Standards processes are slow by design, and the slowness here is appropriate because the identity and authorization problem for AI agents is genuinely hard. An agent acting on behalf of a user needs to carry that user&#8217;s permissions, not escalate to system-level access, and current tooling doesn&#8217;t enforce this reliably. The OIDF response to NIST gets the framing right: agent identity needs cryptographic binding, not just policy. If an agent claims to act on your behalf without a verifiable credential, you don&#8217;t have identity management. You have a trust-me system. You can read about the comments I submitted at <a href="https://www.rockcybermusings.com/p/nist-ai-agent-rfi-2025-0035-human-oversight-wrong-fix">&#8220;NIST AI Agent RFI (2025-0035): Human Oversight Is the Wrong Fix.&#8221;</a></p><h3>5. Commerce and FTC hit their AI regulatory deadlines, and nothing changed yet</h3><p>Two major deliverables from the December 2025 executive order on AI preemption came due on March 11. The Commerce Department submitted its review of state AI laws, identifying which ones the administration considers overly burdensome or in conflict with federal objectives. The FTC delivered a policy statement on how Section 5 of the FTC Act applies to AI and when state laws requiring alteration of model outputs are preempted by federal deceptive practices law (Mondaq, Digital Applied). Neither document invalidates any state law on its own. They are ammunition for the DOJ&#8217;s AI Litigation Task Force, established in January and yet to file any lawsuits. The administration is also conditioning $42 billion in BEAD broadband funding on states repealing AI regulations it deems onerous.</p><p><strong>Why it matters</strong></p><ul><li><p>Organizations operating AI in multiple states face genuine legal uncertainty. State laws remain on the books. The federal government plans to fight them in court, and that litigation takes years.</p></li><li><p>The FTC&#8217;s Section 5 application to AI bias-mitigation requirements is legally untested territory.</p></li><li><p>The BEAD funding leverage is the most concrete near-term enforcement tool. Which states hold firm versus which fold will tell you a lot about regulatory durability.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Do not assume any state AI compliance requirement is going away. Build compliance architecture that can be toggled by jurisdiction as the legal landscape shifts.</p></li><li><p>Get legal counsel read into the Commerce Department report. Knowing which of your state compliance obligations are on the federal target list helps you prioritize risk posture.</p></li><li><p>Prepare for a two-to-three year period of overlapping requirements. Companies with modular, jurisdiction-aware compliance programs will weather this better.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The administration created a fog of legal uncertainty and called it reducing regulatory burden. For most enterprises deploying AI, this makes compliance harder. You now have to track active federal litigation against state laws while still complying with those laws until courts rule otherwise. The FTC theory is worth watching closely: if the argument that requiring AI bias mitigation compels &#8220;deceptive output&#8221; holds, it guts a large category of state AI fairness requirements. If it fails, it sets a precedent limiting federal deceptive practices law&#8217;s reach into AI output governance. Either outcome reshapes the field.</p><div><hr></div><h3>6. OpenAI publishes its prompt injection defense playbook</h3><p>On March 12, OpenAI published research and engineering guidance on defending AI agents against prompt-injection attacks (OpenAI, PrismNews). The guidance covers training techniques that help models treat different input channels with varying skepticism, architectural decisions that constrain privilege and limit blast radius, and layered verification to catch anomalous behavior. OpenAI also disclosed that it built a reinforcement learning-trained automated attacker to discover injection vulnerabilities internally, capable of steering agents through harmful multi-step workflows. The decision to publish openly reflects recognition that injection attacks threaten the entire developer ecosystem building on top of large language models.</p><p><strong>Why it matters</strong></p><ul><li><p>Publishing the automated attacker methodology gives defenders a concrete model of what they&#8217;re fighting. Multi-step RL-trained attacks won&#8217;t be stopped with static guardrails.</p></li><li><p>The channel-skepticism approach, which trains models to treat external web content differently from system instructions, is an architectural fix that operates at inference time.</p></li><li><p>OpenAI&#8217;s disclosure accelerates industry defenses while giving attackers a clearer picture of which countermeasures to route around.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Apply privilege minimization immediately: agents should hold only permissions required for the specific task, expiring at task completion.</p></li><li><p>For agents consuming external content, validate that content before the agent ingests it. Treat external web data as untrusted input, period.</p></li><li><p>Build a prompt injection test suite and run it against production agents before every deployment. What you don&#8217;t test, you don&#8217;t know.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>OpenAI built an RL-trained machine to find injection vulnerabilities in their own systems. That machine now exists, and the same architecture will run on the offensive side of this problem within months, if it isn&#8217;t already. The deeper issue is architectural: language models cannot reliably distinguish instructions from data. That&#8217;s a fundamental property of how these systems process text, not a fixable bug. Any defense assuming the model will eventually learn to make that distinction is building on sand. The real fix is external. Don&#8217;t give agents access to resources they don&#8217;t need, and verify every external input before it reaches the model.</p><h3>7. Google Cloud Threat Horizons reveals software exploits overtaking stolen credentials</h3><p>Google Cloud&#8217;s Office of the CISO published its H1 2026 Threat Horizons Report on March 9, covering the second half of 2025 (Help Net Security, Security Boulevard). The headline finding is that exploitation of third-party software vulnerabilities jumped from 2.9% to 44.5% of initial cloud entry vectors in a single half-year period. The exploitation window has collapsed to days, with the React2Shell case showing crypto miners deployed within 48 hours of public vulnerability disclosure. North Korean threat group UNC4899 abused DevOps workflows and container breakout to steal millions in cryptocurrency. Threat actors also used LLMs to automate credential harvesting and accelerate the path from local developer access to full cloud admin privileges.</p><p><strong>Why it matters</strong></p><ul><li><p>A jump from 2.9% to 44.5% in software exploitation isn&#8217;t an incremental change. Something shifted structurally in attacker methodology during H2 2025.</p></li><li><p>A 48-hour exploitation window means patch prioritization SLAs have to account for attacker speed, not just team capacity.</p></li><li><p>LLM-assisted credential harvesting is now in a major incident response dataset, no longer just theoretical research.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Reduce your vulnerability exposure window to 48 hours or less for critical and high-severity findings on internet-facing systems. Build the automation to get there.</p></li><li><p>Audit DevOps pipeline permissions. The UNC4899 vector targets the privilege elevation that happens when developers hold broad cloud access from local workstations.</p></li><li><p>Review whether AI coding tools introduce dependencies with unreviewed third-party code. Supply chain hygiene is now tier-one.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>For years, the orthodoxy was &#8220;credential hygiene is job one in the cloud.&#8221; Attackers just told you that orthodoxy is obsolete. They shifted to software exploitation because credential defenses got good enough. That&#8217;s how this works: defenders get strong on one vector, attackers rotate to the next. The current answer is patching speed. The LLM-assisted credential harvesting detail is quietly significant. It&#8217;s been in theoretical papers for two years, and now it&#8217;s in operational incident data from nation-state actors. Adjust your threat model accordingly.</p><h3>8. AI agents are now helping criminals manage attack infrastructure</h3><p>On March 8, The Register reported on Microsoft Threat Intelligence findings showing that North Korea&#8217;s Coral Sleet group is using AI and development platforms to rapidly build and manage attack infrastructure at scale. AI agents automate the creation of phishing infrastructure, manage C2 systems, and accelerate campaign tempo. The Unit 42 2026 Global Incident Response Report, published in February and drawing on 750 major incidents, showed the fastest 25% of attackers reaching data exfiltration in 72 minutes, down from 285 minutes the previous year. Identity weaknesses played a material role in almost 90% of investigations.</p><p><strong>Why it matters</strong></p><ul><li><p>AI is now a documented operational capability in nation-state attack campaigns, not just an enterprise productivity tool.</p></li><li><p>The 4x speed increase in attack timelines means detection and response programs calibrated to last year&#8217;s data are already outdated.</p></li><li><p>87% of incidents unfolded across multiple attack surfaces, making correlation harder for defenders.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Review detection and response SLAs against the new attacker timeline. 72 minutes from initial access to exfiltration is shorter than most IR playbook trigger times.</p></li><li><p>Run tabletops assuming an AI-assisted attack infrastructure. Stress-test whether your team can detect and contain within the compressed timeline.</p></li><li><p>Identity controls remain the highest-leverage investment. 90% material involvement in incidents makes this your budget priority.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The debate about whether attackers would use AI is over. It&#8217;s all about the economics. If you&#8217;re running persistent operations against multiple targets, automating the operational overhead with AI is exactly what you&#8217;d do. The 72-minute exfiltration timeline is the number that should break your IR program&#8217;s assumptions. Most enterprise programs are built around detection metrics measured in hours or days. You need automated detection with automated response triggers, not a playbook that assumes a human analyst will catch the initial alert.</p><h3>9. Amazon pushes back on data linking AI coding to infrastructure outages</h3><p>On March 10, The Register reported leaked briefing notes from an Amazon internal operations meeting flagging a &#8220;trend of incidents&#8221; characterized by &#8220;high blast radius&#8221; and &#8220;Gen-AI assisted changes.&#8221; The implication was that AI-assisted coding has made infrastructure changes more fragile. Amazon responded, saying they &#8220;have not seen compelling evidence that incidents are more common with AI tools.&#8221; The Veracode 2026 State of Software Security report, published February 24, found 82% of organizations carry security debt, a 36% year-over-year spike in high-risk vulnerabilities, and that more vulnerabilities are being created than fixed, with AI development velocity outstripping remediation capacity as a contributing factor.</p><p><strong>Why it matters</strong></p><ul><li><p>Amazon&#8217;s internal concern, even disputed, comes from one of the largest cloud operators in the world. Internal friction at that scale is a signal worth tracking.</p></li><li><p>The Veracode data shows a systemic pattern. AI tools accelerate feature shipping and the introduction of vulnerabilities simultaneously, while remediation capacity doesn&#8217;t scale at the same rate.</p></li><li><p>82% of organizations carry security debt, with 60% classified as critical, which <em>should</em> be a material risk disclosure issue for most boards (materiality is another conversation for another time).</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Require AI coding tools to integrate with static analysis before code reaches production. Velocity gains without security gates just accelerate debt accumulation.</p></li><li><p>Measure remediation rate alongside development velocity. If the gap is widening, you have a governance problem, not just a tooling problem.</p></li><li><p>Brief your board on the Veracode numbers. This is a material risk disclosure issue.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Amazon&#8217;s denial matters. One leaked briefing note does not make a causal case. What it tells you is that someone inside one of the world&#8217;s largest cloud operators thought the correlation was worth flagging in an internal ops review. That&#8217;s a signal, not proof. The Veracode data is where I&#8217;m more confident: if your AI coding tools help developers write code 40% faster and that code contains the same flaw density as human-written code, you&#8217;ve just increased your vulnerability production rate by 40%. The only way this works in your favor is if you accelerate the remediation side at the same rate. Almost nobody is doing that.</p><h3>10. Microsoft Patch Tuesday drops 77 CVEs</h3><p>Microsoft pushed its March Patch Tuesday on March 11, fixing at least 77 vulnerabilities across Windows and other software (Kaseya, Check Point Research). This update cycle lands in an environment where, per the Google Cloud Threat Horizons data released the same week, exploitation windows for critical vulnerabilities have collapsed to 48 hours from public disclosure. AI-assisted exploit development is further compressing the time between CVE publication and the availability of weaponized exploits.</p><p><strong>Why it matters</strong></p><ul><li><p>77 CVEs in one month means your patch management team works against a sprint clock every Patch Tuesday. Prioritization methodology matters more than ever.</p></li><li><p>Critical Microsoft CVEs are being probed within 48 hours of this disclosure per current attacker timelines. Your patch SLA has to account for that.</p></li><li><p>AI-assisted exploit development means the gap between disclosure and exploitation continues to narrow.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Build risk-tiered patching protocols: critical internet-facing systems within 24-48 hours, critical internal systems within 72 hours, high severity within a week.</p></li><li><p>Prioritize remote code execution vulnerabilities from the March 11 batch first. Review the Microsoft advisory for specific critical CVEs.</p></li><li><p>Apply compensating controls like network segmentation and least-privilege configurations for systems where immediate patching isn&#8217;t operationally feasible.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Patch Tuesday used to feel routine. It isn&#8217;t anymore because the time between a CVE being added to the NVD and an attacker scanning for it has gone from weeks to hours. If your patch SLA is still &#8220;30 days for critical,&#8221; you&#8217;re operating with a policy written for a threat environment that no longer exists. That&#8217;s not a patch management problem. That&#8217;s a governance problem. Fix the policy first.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h3>CISA adds an actively exploited n8n RCE to its known exploited list, and 24,700 instances are still unpatched</h3><p>On March 12, CISA added CVE-2025-68613 to its Known Exploited Vulnerabilities catalog, a critical expression-injection vulnerability in the n8n workflow automation platform with a CVSS score of 9.9 (The Hacker News, The Register). The flaw was patched three months ago in the December 2025 versions. Federal agencies have until March 25 to patch. The problem: Shadowserver data shows 24,700 instances remain unpatched online, with 12,300 in North America and 7,800 in Europe. This matters beyond the CVE itself because n8n is one of the most widely used platforms for building AI automation workflows and AI agent pipelines. Organizations deploying AI agents frequently use n8n as the orchestration layer connecting those agents to enterprise data sources.</p><p><strong>Why it matters</strong></p><ul><li><p>An unpatched RCE in the orchestration layer of an AI workflow means that an attacker who owns the n8n instance can access every connected system the AI agents touch, including credentials, APIs, and data stores.</p></li><li><p>24,700 exposed instances three months after a publicly known critical patch represents a systemic patching failure in a category of software organizations that have not been treated as critical infrastructure.</p></li><li><p>CISA&#8217;s KEV addition triggers mandatory remediation timelines for federal agencies, but most n8n deployments are in private enterprise environments with no equivalent enforcement mechanism.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Search your environment for n8n now. It is frequently deployed by individual teams or developers outside formal IT procurement, so your asset inventory may not show it.</p></li><li><p>If you find unpatched instances, treat them as compromised until proven otherwise. Rotate every credential and API key the n8n instance had access to.</p></li><li><p>Apply the same logic to every workflow automation tool in your environment: Zapier, Make, and similar platforms are potential RCE targets and connect to the same sensitive data sources.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This story isn&#8217;t getting the attention it deserves because nobody considers workflow automation as critical security infrastructure. It&#8217;s where developers wire things together quickly, connect AI agents to Slack, Salesforce, and internal APIs, and then move on to the next problem. The security team doesn&#8217;t own it. The AI team doesn&#8217;t think they need to patch it. The result is a critical RCE sitting at the center of your AI agent architecture, exposed to the internet, with a patch that&#8217;s been available for three months. CISA flagging active exploitation on March 12 means this is not theoretical. Someone is using this right now. Go find your n8n instances.</p><p>If you found this analysis useful, subscribe at <a href="https://rockcybermusings.com/">rockcybermusings.com</a> for weekly intelligence on AI security developments.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Axios. (2026, March 6). <em>OpenAI rolls out Codex Security to automate code security reviews</em>. <a href="https://www.axios.com/2026/03/06/openai-codex-security-ai-cyber">https://www.axios.com/2026/03/06/openai-codex-security-ai-cyber</a></p><p>Baker Botts. (2026, March). <em>March 2026: Federal deadlines that will reshape the AI regulatory landscape</em>. MONDAQ. <a href="https://www.mondaq.com/unitedstates/new-technology/1755166/march-2026-federal-deadlines-that-will-reshape-the-ai-regulatory-landscape">https://www.mondaq.com/unitedstates/new-technology/1755166/march-2026-federal-deadlines-that-will-reshape-the-ai-regulatory-landscape</a></p><p>Bloomberg. (2026, March 6). <em>OpenAI unveils Codex Security tool to detect database vulnerabilities</em>. <a href="https://www.bloomberg.com/news/articles/2026-03-06/openai-releases-ai-agent-security-tool-for-research-preview">https://www.bloomberg.com/news/articles/2026-03-06/openai-releases-ai-agent-security-tool-for-research-preview</a></p><p>Check Point Research. (2026, March 9). <em>9th March: Threat Intelligence Report</em>. <a href="https://research.checkpoint.com/2026/9th-march-threat-intelligence-report/">https://research.checkpoint.com/2026/9th-march-threat-intelligence-report/</a></p><p>CISA. (2026, March 12). <em>CISA adds one known exploited vulnerability to catalog</em>. <a href="https://www.cisa.gov/known-exploited-vulnerabilities-catalog">https://www.cisa.gov/known-exploited-vulnerabilities-catalog</a></p><p>CNBC. (2026, March 10). <em>Amazon convenes &#8216;deep dive&#8217; internal meeting to address outages</em>. <a href="https://www.cnbc.com/2026/03/10/amazon-plans-deep-dive-internal-meeting-address-ai-related-outages.html">https://www.cnbc.com/2026/03/10/amazon-plans-deep-dive-internal-meeting-address-ai-related-outages.html</a></p><p>Defense One. (2026, March 9). <em>Anthropic sues over a dozen federal agencies and government leaders</em>. <a href="https://www.defenseone.com/business/2026/03/anthropic-sues-over-dozen-federal-agencies-and-government-leaders/411997/">https://www.defenseone.com/business/2026/03/anthropic-sues-over-dozen-federal-agencies-and-government-leaders/411997/</a></p><p>Digital Applied. (2026, March). <em>FTC AI policy deadline March 11: Compliance guide</em>. <a href="https://www.digitalapplied.com/blog/ftc-ai-policy-deadline-march-11-compliance-readiness">https://www.digitalapplied.com/blog/ftc-ai-policy-deadline-march-11-compliance-readiness</a></p><p>Forrester. (2026, March). <em>White House announces the 2026 cyber strategy for America</em>. <a href="https://www.forrester.com/blogs/white-house-announces-the-2026-cyber-strategy-for-america/">https://www.forrester.com/blogs/white-house-announces-the-2026-cyber-strategy-for-america/</a></p><p>Fortune. (2026, March 9). <em>Anthropic sues Pentagon after being labeled a threat to national security</em>. <a href="https://fortune.com/2026/03/09/anthropic-sues-pentagon-ai-supply-chain-risk-trump-administration/">https://fortune.com/2026/03/09/anthropic-sues-pentagon-ai-supply-chain-risk-trump-administration/</a></p><p>Google Cloud. (2026, March 9). <em>Cloud threat horizons report H1 2026</em>. <a href="https://cloud.google.com/security/report/resources/cloud-threat-horizons-report-h1-2026">https://cloud.google.com/security/report/resources/cloud-threat-horizons-report-h1-2026</a></p><p>Help Net Security. (2026, March 11). <em>Software vulnerabilities push credential abuse aside in cloud intrusions</em>. <a href="https://www.helpnetsecurity.com/2026/03/11/google-cloud-environments-cyber-threats-report/">https://www.helpnetsecurity.com/2026/03/11/google-cloud-environments-cyber-threats-report/</a></p><p>Kaseya. (2026, March 11). <em>The week in breach news: March 11, 2026</em>. </p><p>https://www.kaseya.com/?post_type=post&amp;p=26754</p><p>Microsoft Security Blog. (2026, March 6). <em>AI as tradecraft: How threat actors operationalize AI</em>. <a href="https://www.microsoft.com/en-us/security/blog/2026/03/06/ai-as-tradecraft-how-threat-actors-operationalize-ai/">https://www.microsoft.com/en-us/security/blog/2026/03/06/ai-as-tradecraft-how-threat-actors-operationalize-ai/</a></p><p>National Institute of Standards and Technology. (2026, January). <em>CAISI issues request for information about securing AI agent systems</em>. <a href="https://www.nist.gov/news-events/news/2026/01/caisi-issues-request-information-about-securing-ai-agent-systems">https://www.nist.gov/news-events/news/2026/01/caisi-issues-request-information-about-securing-ai-agent-systems</a></p><p>National Institute of Standards and Technology. (2026, February). <em>Announcing the AI agent standards initiative for interoperable and secure innovation</em>. <a href="https://www.nist.gov/news-events/news/2026/02/announcing-ai-agent-standards-initiative-interoperable-and-secure">https://www.nist.gov/news-events/news/2026/02/announcing-ai-agent-standards-initiative-interoperable-and-secure</a></p><p>OpenAI. (2026, March 6). <em>Codex Security: Now in research preview</em>. <a href="https://openai.com/index/codex-security-now-in-research-preview/">https://openai.com/index/codex-security-now-in-research-preview/</a></p><p>OpenAI. (2026, March 12). <em>Understanding prompt injections: A frontier security challenge</em>. <a href="https://openai.com/index/prompt-injections/">https://openai.com/index/prompt-injections/</a></p><p>OpenAI. (2026). <em>Continuously hardening ChatGPT Atlas against prompt injection attacks</em>. <a href="https://openai.com/index/hardening-atlas-against-prompt-injection/">https://openai.com/index/hardening-atlas-against-prompt-injection/</a></p><p>OpenID Foundation. (2026). <em>OIDF responds to NIST on AI agent security</em>. <a href="https://openid.net/oidf-responds-to-nist-on-ai-agent-security/">https://openid.net/oidf-responds-to-nist-on-ai-agent-security/</a></p><p>Palo Alto Networks. (2026, February). <em>2026 Unit 42 global incident response report: Attacks now 4x faster</em>. <a href="https://www.paloaltonetworks.com/blog/2026/02/unit-42-global-ir-report/">https://www.paloaltonetworks.com/blog/2026/02/unit-42-global-ir-report/</a></p><p>PrismNews. (2026, March). <em>OpenAI releases engineering playbook to shield AI agents from prompt injection</em>. <a href="https://www.prismnews.com/news/openai-releases-engineering-playbook-to-shield-ai-agents">https://www.prismnews.com/news/openai-releases-engineering-playbook-to-shield-ai-agents</a></p><p>Security Boulevard. (2026, March). <em>83% of cloud breaches start with identity, AI agents are about to make it worse</em>. <a href="https://securityboulevard.com/2026/03/83-of-cloud-breaches-start-with-identity-ai-agents-are-about-to-make-it-worse/">https://securityboulevard.com/2026/03/83-of-cloud-breaches-start-with-identity-ai-agents-are-about-to-make-it-worse/</a></p><p>SecurityWeek. (2026, March 6). <em>OpenAI rolls out Codex Security vulnerability scanner</em>. <a href="https://www.securityweek.com/openai-rolls-out-codex-security-vulnerability-scanner/">https://www.securityweek.com/openai-rolls-out-codex-security-vulnerability-scanner/</a></p><p>TechRadar. (2026, March 6). <em>OpenAI releases Codex Security to spot the next big cyber risks to your company</em>. <a href="https://www.techradar.com/pro/security/openai-releases-codex-security-to-spot-the-next-big-cyber-risks-to-your-company-promises-to-identify-complex-vulnerabilities-that-other-agentic-tools-miss">https://www.techradar.com/pro/security/openai-releases-codex-security-to-spot-the-next-big-cyber-risks-to-your-company-promises-to-identify-complex-vulnerabilities-that-other-agentic-tools-miss</a></p><p>The Hacker News. (2026, March 12). <em>CISA flags actively exploited n8n RCE bug as 24,700 instances remain exposed</em>. <a href="https://thehackernews.com/2026/03/cisa-flags-actively-exploited-n8n-rce.html">https://thehackernews.com/2026/03/cisa-flags-actively-exploited-n8n-rce.html</a></p><p>The Register. (2026, March 6). <em>Anthropic sues US over national security blacklist</em>. <a href="https://www.theregister.com/2026/03/06/anthropic_left_with_no_other/">https://www.theregister.com/2026/03/06/anthropic_left_with_no_other/</a></p><p>The Register. (2026, March 8). <em>Manage attack infrastructure? AI agents can now help</em>. <a href="https://www.theregister.com/2026/03/08/deploy_and_manage_attack_infrastructure/">https://www.theregister.com/2026/03/08/deploy_and_manage_attack_infrastructure/</a></p><p>The Register. (2026, March 10). <em>Amazon insists AI coding isn&#8217;t source of outages</em>. <a href="https://www.theregister.com/2026/03/10/amazon_ai_coding_outages/">https://www.theregister.com/2026/03/10/amazon_ai_coding_outages/</a></p><p>The Register. (2026, March 12). <em>CISA says n8n critical bug exploited in real-world attacks</em>. <a href="https://www.theregister.com/2026/03/12/cisa_n8n_rce/">https://www.theregister.com/2026/03/12/cisa_n8n_rce/</a></p><p>U.S. Federal Register. (2026, January 8). <em>Request for information regarding security considerations for artificial intelligence agents</em>. <a href="https://www.federalregister.gov/documents/2026/01/08/2026-00206/request-for-information-regarding-security-considerations-for-artificial-intelligence-agents">https://www.federalregister.gov/documents/2026/01/08/2026-00206/request-for-information-regarding-security-considerations-for-artificial-intelligence-agents</a></p><p>Veracode. (2026, February 24). <em>2026 state of software security report</em>. BusinessWire. <a href="https://www.businesswire.com/news/home/20260224526703/en/Veracode-2026-State-of-Software-Security-Report-Reveals-Four-Out-of-Five-Organizations-Are-Drowning-in-Security-Debt">https://www.businesswire.com/news/home/20260224526703/en/Veracode-2026-State-of-Software-Security-Report-Reveals-Four-Out-of-Five-Organizations-Are-Drowning-in-Security-Debt</a></p><p>White House. (2026, March). <em>White House unveils President Trump&#8217;s cyber strategy for America</em>. <a href="https://www.whitehouse.gov/articles/2026/03/white-house-unveils-president-trumps-cyber-strategy-for-america/">https://www.whitehouse.gov/articles/2026/03/white-house-unveils-president-trumps-cyber-strategy-for-america/</a></p>]]></content:encoded></item><item><title><![CDATA[AI Vendor Lock-In: What the Pentagon Taught Every CISO This Week]]></title><description><![CDATA[The DoD's Anthropic supply chain risk designation exposed every enterprise's embedded AI architecture gap. Here's what your vendor contracts are missing.]]></description><link>https://www.rockcybermusings.com/p/ai-vendor-lock-in-pentagon-anthropic-ciso-lesson</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/ai-vendor-lock-in-pentagon-anthropic-ciso-lesson</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 10 Mar 2026 12:50:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rMq7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rMq7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rMq7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rMq7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rMq7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rMq7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rMq7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3304173,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190372517?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rMq7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rMq7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rMq7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rMq7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F949ab505-5594-453f-b968-f0333f1fa094_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You probably don&#8217;t know which AI model is running inside your operational tools right now. That&#8217;s a near-certainty given how enterprise AI procurement actually works. The Pentagon just ran a live stress test on that exact blind spot, and the results were not subtle. When the Department of War formally designated Anthropic a supply chain risk on March 5, 2026, making it the first American company in history to receive a label previously reserved for Huawei and Chinese state-adjacent tech firms, the disruption didn&#8217;t start with Anthropic. It cascaded through Palantir, across AWS infrastructure, and into active military workflows during U.S. strikes on Iran. Your enterprise has the same layered architecture. The question is whether you&#8217;ve mapped it, and whether your contracts protect you when the layer you don&#8217;t control catches fire.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/ai-vendor-lock-in-pentagon-anthropic-ciso-lesson?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/ai-vendor-lock-in-pentagon-anthropic-ciso-lesson?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2>The AI Model You Don&#8217;t Control Is Already in Production</h2><p>The DoD&#8217;s direct customer relationship wasn&#8217;t with Anthropic. Claude ran inside Palantir&#8217;s Maven Smart System, hosted on AWS at Impact Level 6, sitting on classified infrastructure the military depended on for intelligence analysis and operational planning. The DoD contracted with Palantir. Palantir embedded Claude. When the supply chain risk designation landed, it cascaded from procurement machinery through Palantir&#8217;s operational position and into workflows with real military dependencies, reportedly including active support for Iran strikes, even as the designation was being disputed on social media by the Secretary of Defense and the CEO of Anthropic simultaneously.</p><p>Piper Sandler analysts noted after the designation that Anthropic was &#8220;heavily embedded in the Military and the Intelligence community&#8221; and that migrating off the technology could &#8220;pose some short-term disruptions&#8221; to Palantir&#8217;s operations. Short-term disruptions. During an active military operation. That&#8217;s the polite Wall Street version of the problem.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ajJ9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ajJ9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 424w, https://substackcdn.com/image/fetch/$s_!ajJ9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 848w, https://substackcdn.com/image/fetch/$s_!ajJ9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 1272w, https://substackcdn.com/image/fetch/$s_!ajJ9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ajJ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png" width="1456" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1013321,&quot;alt&quot;:&quot;Flowchart showing how Claude was embedded through Palantir Maven Smart System and AWS IL6 into DoD operational workflows, with a parallel enterprise layer showing the same pattern across SaaS vendor, foundation model, and cloud provider&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190372517?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing how Claude was embedded through Palantir Maven Smart System and AWS IL6 into DoD operational workflows, with a parallel enterprise layer showing the same pattern across SaaS vendor, foundation model, and cloud provider" title="Flowchart showing how Claude was embedded through Palantir Maven Smart System and AWS IL6 into DoD operational workflows, with a parallel enterprise layer showing the same pattern across SaaS vendor, foundation model, and cloud provider" srcset="https://substackcdn.com/image/fetch/$s_!ajJ9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 424w, https://substackcdn.com/image/fetch/$s_!ajJ9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 848w, https://substackcdn.com/image/fetch/$s_!ajJ9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 1272w, https://substackcdn.com/image/fetch/$s_!ajJ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84347e25-14b4-4c9b-92e8-9de95b69f075_7274x3250.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: The Embedded AI Architecture Problem</figcaption></figure></div><p>Replace &#8220;Military and Intelligence community&#8221; with your sector. Replace &#8220;Palantir&#8221; with your largest workflow vendor. Replace &#8220;active military operation&#8221; with your peak fraud season, your annual close, or your next regulatory audit. You&#8217;ve just described your own exposure.</p><p>Your enterprise equivalent of Maven isn&#8217;t a targeting system. It&#8217;s the fraud detection platform your SOC relies on for alert triage. It&#8217;s the contract review tool your legal team treats as a first pass on every agreement. It&#8217;s the SIEM enrichment workflow your analysts approved 18 months ago, without anyone asking which model was under it or whose usage policy governed it. In each case, there&#8217;s a foundation model embedded by a SaaS vendor, hosted by a cloud provider, running under policies you never reviewed and almost certainly can&#8217;t enforce. The vendor who sold you the platform might not even know which model version was deployed last Tuesday.</p><p>The lock-in risk most CISOs think about is the wrong one. They worry about pricing leverage at renewal or feature gaps during the next budget cycle. Those are real, and they&#8217;re also the least interesting version of vendor risk in an AI-dependent stack. The risk that actually bites is operational dependency on a model whose policies, safety stack, and external political relationships sit entirely outside your contractual reach. This week demonstrated those conditions shift in 48 hours. When they do, you find out how embedded you actually are. The DoD found out during airstrikes. You&#8217;ll find out during something comparably inconvenient for you.</p><h2>What the Contract Language Reveals About Your Own Agreements</h2><p>The factual record on the Anthropic negotiation is clear enough. The Department of War&#8217;s January 2026 AI strategy memorandum directed procurement to require &#8220;any lawful use&#8221; language and to acquire models &#8220;free from usage policy constraints that may limit lawful military applications.&#8221; Anthropic held two red lines: no mass domestic surveillance of Americans, and no fully autonomous weapons with no human in the targeting decision loop. The DoD called those constraints unacceptable. The negotiation collapsed. The designation followed.</p><p>Here&#8217;s where it gets interesting... OpenAI reached a deal within hours of the designation announcement, published contract excerpts containing the exact &#8220;all lawful purposes&#8221; language Anthropic refused, then amended the agreement twice in the following week after legal experts publicly tore apart what the protections actually meant. Sam Altman acknowledged the deal was &#8220;definitely rushed&#8221; and that &#8220;the optics don&#8217;t look good.&#8221; Jessica Tillipman, associate dean for government procurement law studies at George Washington University, wrote that the published excerpt &#8220;does not give OpenAI an Anthropic-style, free-standing right to prohibit otherwise-lawful government use.&#8221; Altman signed it anyway. To be fair to him, he was working in 48-hour crisis mode while a competing lab was being designated a national security threat. Good contract hygiene was not the priority.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aI2j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aI2j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 424w, https://substackcdn.com/image/fetch/$s_!aI2j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 848w, https://substackcdn.com/image/fetch/$s_!aI2j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 1272w, https://substackcdn.com/image/fetch/$s_!aI2j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aI2j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png" width="1456" height="2846" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2846,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1084216,&quot;alt&quot;:&quot;Comparison diagram contrasting Anthropic&#8217;s standalone vendor-imposed prohibition approach with OpenAI&#8217;s law-anchored permissive use framework, including identified gaps in each&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190372517?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Comparison diagram contrasting Anthropic&#8217;s standalone vendor-imposed prohibition approach with OpenAI&#8217;s law-anchored permissive use framework, including identified gaps in each" title="Comparison diagram contrasting Anthropic&#8217;s standalone vendor-imposed prohibition approach with OpenAI&#8217;s law-anchored permissive use framework, including identified gaps in each" srcset="https://substackcdn.com/image/fetch/$s_!aI2j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 424w, https://substackcdn.com/image/fetch/$s_!aI2j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 848w, https://substackcdn.com/image/fetch/$s_!aI2j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 1272w, https://substackcdn.com/image/fetch/$s_!aI2j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fa953b-375a-4154-9a81-a59c07cadb40_3591x7020.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: Red Lines vs. Legal Anchors: Two Approaches to AI Contract Protection </figcaption></figure></div><p>Instead of wasting your time on the OpenAI vs. Anthropic drama and who is right or wrong, you need to pay attention to the legal architecture underlying AI safety commitments.</p><p>Why?</p><p>Because your enterprise contracts almost certainly follow the same pattern OpenAI accepted, that include usage restrictions anchored to &#8220;applicable law&#8221; and &#8220;existing policy,&#8221; with the vendor&#8217;s safety stack as the primary enforcement mechanism. OpenAI anchored its protections to existing statutes: the Fourth Amendment, FISA, DoD Directive 3000.09 on autonomous weapons, and Executive Order 12333. Critics flagged immediately that EO 12333 is the authority the NSA has historically used to justify intercepting Americans&#8217; communications through collection outside U.S. borders. &#8220;Lawful&#8221; in national security contexts isn&#8217;t a fixed boundary. It lives inside classified legal interpretations, executive orders, and internal agency guidance nobody outside the building ever reads.</p><p>Your enterprise contracts with AI vendors operate the same way. When law shifts, when policy changes, or when your vendor faces its own version of a 48-hour political deadline, those anchors move with the situation. What your procurement posture needs instead are vendor-imposed, free-standing prohibited-use schedules for your specific high-risk workflows, written into contract appendices with attached audit rights and defined remedies. &#8220;We comply with applicable law&#8221; is a description of baseline legal obligation. It&#8217;s not a control. It&#8217;s what every vendor says about every product, whether or not AI is involved. You shouldn&#8217;t be paying for that sentence in an AI addendum. You should be getting something that took a lawyer to write specifically for your deployment.</p><h2>Human-in-the-Loop Theater</h2><p>Let me describe a workflow you probably have running right now. Your AI triage layer ingests 200 alerts per shift and flags 180 as low severity. Your analyst reviews the queue, confirms the model&#8217;s assessment on most items, escalates five, clears the rest. Total elapsed review time for the cleared items is, let&#8217;s say, roughly two minutes each. Every disposition went through a human. The audit log shows human review. Your controls documentation references human oversight. What actually happened is your analyst ratified model outputs under cognitive load and time pressure while telling themselves they were exercising judgment.</p><p>That&#8217;s the failure mode human-in-the-loop review was designed to prevent. The loop exists on paper. The friction isn&#8217;t in the workflow design because no step requires the reviewer to explain why they agree with the model before confirming the output. Nobody required forced alternative generation before escalating or clearing. Nobody captured uncertainty as a structured field. The control is decorative.</p><p>The OpenAI contract&#8217;s autonomous weapons provision bars the use of the AI system &#8220;to independently direct autonomous weapons in any case where law, regulation, or Department policy requires human control.&#8221; Defense scholars noted the omission of &#8220;human-in-the-loop&#8221; language was deliberate, preserving operational flexibility. &#8220;Human judgment&#8221; and &#8220;human control&#8221; are not equivalent, and the people drafting that language knew it. The contract borrows its enforceability entirely from existing policy, which requires commanders to exercise &#8220;appropriate levels of human judgment over the use of force.&#8221; Appropriate is not a control. It&#8217;s a word that means whatever the decision-maker concludes is appropriate under the circumstances they&#8217;re actually in.</p><p>Research from King&#8217;s College London found that tested AI models threatened nuclear strikes in 95% of simulated crisis scenarios. The problem wasn&#8217;t autonomous weapons. The problem was that under uncertainty and time pressure, models produced escalatory recommendations with false confidence, and human reviewers were positioned to ratify those outputs rather than interrogate them. That&#8217;s not a future risk. That&#8217;s automation bias, and it operates in your environment every shift, at every tier of your AI-assisted workflows.</p><p>The Lavender targeting system used by Israeli defense forces was reportedly identified by investigators as carrying a 10% false positive rate on human identification, with human reviewers present throughout the process. The investigation raised a direct question of whether those humans were genuinely reviewing or functionally ratifying outputs under operational tempo. That distinction carries different consequences in contexts outside the military. In your environment, it shows up as a miscategorized fraud case that costs a customer their account, or a misconfigured access control that cleared review because the analyst trusted the model&#8217;s output and moved on in the last four minutes of a shift.</p><p>Building real decision friction requires designing it into the workflow architecture before something goes wrong, not auditing for it afterward. Two-person review for high-consequence AI outputs. Forced alternative generation before an analyst confirms a model recommendation. Explicit uncertainty capture as a required structured field. If your current AI-assisted workflows don&#8217;t require a reviewer to articulate why they agree with the model&#8217;s output before confirming it, then you are rubber-stamping your way into a problem down the road. You may survive your next audit. Youwon&#8217;t survive your next incident.</p><h2>The Procurement Posture That Needs to Change Before the Next Signature</h2><p>Most CISOs don&#8217;t own AI vendor contracts. Procurement does. Legal does. The CISO inherits the agreement after signature, usually after the vendor relationship is already operational and the leverage window has closed. This is the moment where I&#8217;ll stop pretending that&#8217;s a systems failure and call it what it is: CISOs have let themselves get cut out of a decision that&#8217;s now one of the highest-risk commitments their organization makes. The Anthropic situation gives you the publicly documented argument to change that for every AI agreement with operational or regulatory exposure going forward.</p><p>The DoD&#8217;s relationship with Palantir didn&#8217;t include enforceable audit rights over Claude&#8217;s underlying usage policy, safety stack updates, or model variant changes. When Anthropic&#8217;s relationship with the DoD broke down, Palantir faced operational disruption from a vendor dependency it hadn&#8217;t fully governed at the model layer. Your enterprise equivalent is any SaaS vendor who embeds a foundation model in a production workflow without explicit flow-down contract obligations. You need those flow-down provisions now: contractual requirements for your SaaS vendors to notify you of material AI policy changes, with a defined right to pause deployment or terminate.</p><p>Anthropic&#8217;s published usage policy states the company may tailor restrictions for certain customers based on mission and legal authorities, subject to Anthropic&#8217;s judgment about safeguards. That clause exists in their public policy documentation. Most of their enterprise customers have never read it, don&#8217;t know whether their deployment is governed by standard or tailored terms, and have no contractual mechanism to find out. If you&#8217;re an Anthropic customer and you don&#8217;t know the answer to that question, the answer is almost certainly that you don&#8217;t know, which means you don&#8217;t control it.</p><p>Splunk&#8217;s 2026 CISO Report found that a large majority of CISOs carry personal liability concerns about security incidents. AI model misuse by a subcontractor or an embedded model that you didn&#8217;t govern is exactly the incident scenario that tests that liability question. Your current contract schedules almost certainly don&#8217;t address it. Here are the questions that need to be in every AI vendor negotiation before signature, not as a wish list, but as conditions of signature:</p><ul><li><p>Which model variant governs your deployment, and does that variant deviate from the vendor&#8217;s published acceptable use policy or baseline safety commitments? Get the answer in writing with a version reference.</p></li><li><p>What change control process governs model updates, safety stack revisions, and policy changes? &#8220;We update continuously&#8221; is not an answer. You need customer notice requirements and the right to pause deployment when the vendor makes a material change.</p></li><li><p>What logs exist, who holds access, and what is the retention period? Without logs you can&#8217;t support an incident investigation, a regulatory inquiry, or your own post-incident analysis.</p></li><li><p>What happens when a major customer, a regulator, or a government agency demands scope expansion for your deployment? The Anthropic situation confirmed this question isn&#8217;t hypothetical. It&#8217;s a negotiating dynamic triggered externally, rapidly, and without advance warning to downstream customers.</p></li></ul><h2>From the Run Phase to the Evolve Phase</h2><p>If you&#8217;re applying the CARE framework, this situation signals that you&#8217;re overdue for an Evolve-phase review of your AI vendor relationships. The Create and Adapt work produced your current model integrations. Most organizations have stayed in the Run phase, monitoring performance and managing routine issues, while the risk environment underneath those integrations has shifted significantly. The Evolve phase requires reassessing whether the governance model you built for each AI deployment still fits the world you&#8217;re operating in now.</p><p>The Anthropic situation changed that environment in three concrete ways your board needs to understand. First, it showed that an AI vendor&#8217;s political and contractual relationships with high-profile customers now represent operational risk to every downstream customer, not only government contractors. Second, it produced a documented public case where contract language anchored to &#8220;applicable law&#8221; failed to deliver the protections a party believed it had agreed to. Third, it revealed that model replacement timelines are slower than your AI vendors implied during the sales process. The DoD, with its classified infrastructure, operational urgency, considerable resources, and six-month transition timeline, is the fastest-moving version of this problem you&#8217;re likely to encounter. Your enterprise timeline almost certainly isn&#8217;t shorter.</p><p>Build your AI vendor risk registry before something breaks, while relationships are functional and vendors are cooperative. Map every production AI deployment to the model underneath it, the vendor who embeds it, the cloud provider who hosts it, and the contract that governs each layer. Run a prohibited-use gap assessment: which categories of use does each contract explicitly prohibit, and are those prohibitions free-standing or anchored to &#8220;applicable law&#8221;? Apply OWASP&#8217;s Agentic Top 10 to any workflow where a model makes or influences a decision without a mandatory human review step that requires documented rationale.</p><p>The CISOs who were ahead of this story weren&#8217;t tracking the Pentagon news cycle. They had already asked their SaaS vendors which model was embedded, what the vendor&#8217;s posture would be if that model&#8217;s policy changed, and what their exit path looked like. Most got vague answers. The right response to a vague answer from an AI vendor is a contract clause, not a follow-up email.</p><p><strong>Key Takeaway:</strong> Your AI vendor&#8217;s ethics statement doesn&#8217;t protect your enterprise. A free-standing prohibited-use schedule, enforceable audit rights, and model-layer flow-down provisions do.</p><h3>What to Do Next</h3><p>Start with a model inventory audit across your top ten SaaS vendor relationships. Ask each vendor to identify the foundation model embedded in your production workflows and provide the current acceptable use policy governing your specific deployment, including any tailored terms. Map the gap between what the policy says and what your contract actually enforces.</p><p>The Anthropic situation is the most instructive public case study on AI vendor governance to emerge from this space. Use it while it&#8217;s in front of your board and before your next AI vendor signature lands on someone else&#8217;s desk.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p><em>The views and opinions expressed in RockCyber Musings are my own and do not represent the positions of my employer or any organization I&#8217;m affiliated with.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 28 February 27, 2026 - March 5, 2026]]></title><description><![CDATA[When AI Attacks AI: The Agentic Threat Era Arrives in Full]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260227-202600305</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260227-202600305</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 06 Mar 2026 13:47:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ko-M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ko-M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ko-M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!ko-M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!ko-M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!ko-M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ko-M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/075e6604-ca52-4950-8813-044a77a98100_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/190100547?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ko-M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!ko-M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!ko-M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!ko-M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F075e6604-ca52-4950-8813-044a77a98100_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260227-202600305?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260227-202600305?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>This week handed security leaders something they&#8217;ve been theorizing about for two years: autonomous AI agents attacking other autonomous AI agents in live production environments. No thought experiment, no conference demo. A malicious bot using Claude Opus 4.5 compromised five major open-source repositories. An AI-native offensive platform compromised 600 firewalls across 55 countries. Developer tools turned into attack vectors by opening a Git repo.</p><p>The practitioner community doing the real work on these problems gathered at [un]prompted in San Francisco. The rest of the week&#8217;s news served as a live demonstration of why that conference needed to exist. Attackers aren&#8217;t waiting for frameworks to catch up. Your AI tools are the attack surface now. The developers building them are the initial targets. The agents those tools spawn are the next ones.</p><div><hr></div><h3>1. [Un]Prompted Delivers the AI Security Conference the Industry Needed</h3><p>The first [un]prompted conference ran March 3-4 at The Hibernia in San Francisco (unpromptedcon.org). Gadi Evron of Knostic, who chaired the conference, received nearly 500 talk submissions and built a program spanning offense, defense, DFIR, and governance. No vendor theater. Confirmed speakers included Heather Adkins from Google on advancing code security, Joshua Saxe from Meta on agent evaluation, Paul McMillan from OpenAI on securing software in the agentic era, and Nicholas Carlini from Anthropic on black-hat LLMs finding zero-days in production codebases. Dan Guido closed Day Two, explaining how Trail of Bits rebuilt around AI to reach 200 bugs per engineer per week. Sergej Epp from Sysdig presented primary forensic evidence from an 8-minute AWS escalation and EtherRAT, a blockchain C2 campaign. Gadi even stepped in for Avishai Efrat and Michael Barugy from Zenity&#8230;a direct competitor&#8230; who could not get out of Israel, to drop PleaseFix.</p><p><strong>Why it matters</strong></p><ul><li><p>The field now has a practitioner-grade conference built for people doing actual work, from red teamers to governance leads, not vendor keynotes disguised as research.</p></li><li><p>The offensive capability context is essential. Carlini showed current models finding zero-days. Guido showed 200 bugs per engineer per week. Defenders need this before building programs.</p></li><li><p>The governance track didn&#8217;t retreat into frameworks. Healthcare and large enterprise practitioners spoke about what actually works in production.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Read the full agenda at unpromptedcon.org. The talk abstracts contain more actionable signal than most vendor white papers.</p></li><li><p>Follow the researchers presenting there. Those names are shaping the actual threat landscape.</p></li><li><p>Prioritize the Stripe threat modeling talks and the Snap capability-based authorization session if your team hasn&#8217;t treated AI agents as first-class attack surfaces yet.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Rob T. Lee&#8217;s line on Stage 2 deserves repeating. Anthropic&#8217;s own GTG-1002 report showed adversaries running Claude Code at 80-90% autonomous execution. Your adversary has an AI. If you&#8217;re at tab-completion for defense, that&#8217;s a strategic failure, not a skills gap.</p><p>I&#8217;ve been going to security conferences for a long time. Most are marketing events with technical content as decoration. [un]prompted felt different because Gadi built it explicitly for people who know what a YAML file does. That&#8217;s a rare thing and worth supporting. Start planning for year two.</p><div><hr></div><h3>2. Hackerbot-Claw Proved Autonomous AI Can Systematically Destroy Your CI/CD Pipeline</h3><p>Between February 21 and March 1, 2026, a GitHub account called hackerbot-claw ran an autonomous campaign against public repositories (StepSecurity). The account describes itself as an &#8220;autonomous security research agent powered by claude-opus-4-5,&#8221; maintains a vulnerability pattern index with 9 classes and 47 sub-patterns, and claims to have scanned 47,391 repositories. The bot achieved remote code execution in at least four of seven targeted repositories, including Microsoft, DataDog, CNCF, and Aqua Security&#8217;s Trivy scanner. In the Trivy compromise, it stole a Personal Access Token with broad write permissions, deleted all 178 GitHub releases, wiped repository content, and published a malicious VSCode extension to OpenVSX under Trivy&#8217;s trusted publisher identity. OpenSSF issued a TLP:CLEAR advisory March 1.</p><p>The single defining moment: the bot attempted prompt injection against a Claude-based CI workflow at ambient-code/platform. Claude, running claude-sonnet-4-6, classified it as &#8220;a textbook AI agent supply-chain attack via poisoned project-level instructions&#8221; and refused. The only target the bot failed to compromise was protected by another AI model recognizing the attack.</p><p><strong>Why it matters</strong></p><ul><li><p>CI/CD misconfigurations are now mass-exploitable at machine speed without a single CVE. Five documented exploitation techniques, all using known patterns, all automatable.</p></li><li><p>Supply-chain compromise at scale doesn&#8217;t require sophisticated malware. It requires systematic scanning and pull request automation. The bot scanned 47,000 repos in a week.</p></li><li><p>AI-versus-AI defense is no longer theoretical. The ambient-code defense worked because someone built proper tool allowlisting with prompt injection detection.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit every pull_request_target workflow in your repositories this week. Move PR metadata into environment variables. Scope tokens to minimum permissions.</p></li><li><p>Verify your AI-based code review toolchain has prompt injection detection and tool allowlisting. Configuration matters as much as the model.</p></li><li><p>Check the OpenSSF advisory for the specific pattern list hackerbot-claw exploited. These are all preventable and all still present in thousands of active repositories.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The &#8220;security research&#8221; framing in the account bio is working hard. Deleting 32,000 stars from Trivy and pushing a malicious extension to OpenVSX isn&#8217;t research. The creator remains unidentified. The domain name, the &#8220;molt&#8221; naming, and the OpenClaw ecosystem references point to infrastructure being assembled and tested in the open because the operators know defenders aren&#8217;t watching yet. We&#8217;re watching the emergence of an offensive AI toolkit in real time.</p><div><hr></div><h3>3. CyberStrikeAI: A Chinese-Linked Offensive Platform Hit 600 Firewalls Across 55 Countries</h3><p>Team Cymru published research on March 3, naming CyberStrikeAI as the AI-native offensive tool behind the FortiGate campaign disclosed by Amazon Threat Intelligence in February (BleepingComputer, The Hacker News). The campaign ran from January 11 to February 18, 2026, comprising over 600 FortiGate devices across 55 countries. CyberStrikeAI is built in Go, integrates 100-plus security tools, and uses any OpenAI-compatible model, including Claude and DeepSeek, through an MCP orchestration engine. The developer, alias Ed1s0nZ, submitted the tool to Knownsec 404&#8217;s Starlink Project in December 2025 and briefly posted a CNNVD vulnerability credential to their GitHub profile before deleting it. CNNVD operates under oversight by China&#8217;s Ministry of State Security. Team Cymru detected 21 unique IPs running CyberStrikeAI between January 20 and February 26, primarily on Chinese cloud infrastructure. No zero-days exploited. The actor succeeded through exposed management interfaces and weak credentials.</p><p><strong>Why it matters</strong></p><ul><li><p>AI-native offensive platforms are open-source and in active deployment. The barrier to running a 600-device campaign across 55 countries is now a GitHub clone and a cloud account.</p></li><li><p>State-adjacent tooling proliferates fast. Zero deployments in November to 21 active servers by late February is an adoption curve worth tracking.</p></li><li><p>The entry point remains unchanged. Sophisticated AI orchestration amplified the attacker. Exposed management interfaces created the opportunity. Harden the basics first.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Pull the FortiGate management interface exposure from public networks immediately (seriously&#8230; who do we have to keep saying this?). Apply all current firmware patches.</p></li><li><p>Add CyberStrikeAI IOCs from the Team Cymru report to your threat intelligence feeds.</p></li><li><p>Add AI-native offensive tooling as a threat category in your risk model. The economics of running large-scale exploitation campaigns changed this quarter.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The credential scrub tells you something about the actor&#8217;s maturity. Ed1s0nZ posted the CNNVD award, realized the optics problem, and deleted it. Git commit history preserved both moves. This is someone running a 600-device campaign across 55 countries who doesn&#8217;t understand basic operational security hygiene. The AI amplified a low-to-medium capability actor significantly. That&#8217;s the real threat vector here, not the sophisticated attacker getting more powerful. It&#8217;s the mediocre attacker becoming operationally dangerous.</p><div><hr></div><h3>4. Claude Code Let Attackers Own Developer Machines by Opening a Git Repo</h3><p>Check Point Research disclosed two critical vulnerabilities in Anthropic&#8217;s Claude Code around February 25-27, 2026, widely covered through March 4 (Dark Reading, Security Affairs, The Hacker News). CVE-2025-59536 (CVSS 8.7) allows code injection via the Hooks feature and MCP server initialization. CVE-2026-21852 (CVSS 5.3) allows API key exfiltration by manipulating ANTHROPIC_BASE_URL before the trust dialog appears. Both trigger on opening an untrusted repository with no further user interaction. Researchers Oded Vanunu and Aviv Donenfeld at Check Point found that .claude/settings.json, .mcp.json, and CLAUDE.md function as active execution layers. Stolen API keys in Anthropic Workspaces expose all project files shared across that workspace, creating team-wide compromise from one developer&#8217;s action. All issues are patched: CVE-2025-59536 fixed in version 1.0.111, CVE-2026-21852 fixed in 2.0.65.</p><p><strong>Why it matters</strong></p><ul><li><p>AI coding tools are now supply-chain attack vectors. Cloning a malicious repository used to mean running attacker code. Now it means letting an AI agent run attacker code with your credentials before any warning appears.</p></li><li><p>Repository configuration files are execution logic. Add .claude/, .mcp.json, and CLAUDE.md to your code review checklist alongside source code.</p></li><li><p>The Workspaces blast radius multiplies team exposure. One stolen key can expose shared project files and generate unauthorized API costs across an entire engineering organization.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Verify all Claude Code users are on 1.0.111 or later for the hook vulnerability and 2.0.65 or later for the API key issue. Both patches deliver via auto-update.</p></li><li><p>Rotate Anthropic API keys for any team that cloned untrusted repositories before the patches were applied.</p></li><li><p>Extend your security review process to cover AI tool configuration files in every repository the tool touches.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>&#8220;Trust dialog bypass&#8221; shouldn&#8217;t appear in the threat model of a professional developer tool in 2026. The design assumption that config files are passive was wrong, and it costs a CVSS 8.7. The governance question is broader: how many of your developers are running AI coding tools that weren&#8217;t through your security approval process? Claude Code, Cursor, Copilot. Each one has deep access to local filesystems, shell execution, and credentials. Your endpoint protection almost certainly has no visibility into what they&#8217;re doing. This disclosure is the clean example of why that matters.</p><div><hr></div><h3>5. GlicJack: Chrome&#8217;s Gemini Panel Let Malicious Extensions Steal Your Camera and Files</h3><p>Palo Alto Networks Unit 42 published CVE-2026-0628 on March 2, 2026 (SC Media, The Hacker News). CVSS 8.8. Researcher Gal Weizman discovered that a Chrome extension with basic declarativeNetRequest permissions could inject JavaScript into Gemini Live&#8217;s side panel and inherit all of its elevated privileges: camera, microphone, local file reads, screenshot capability. The flaw arose because Chrome&#8217;s Gemini panel loads gemini.google.com inside a chrome://glic WebView component. Extension isolation rules that protect privileged browser pages didn&#8217;t apply to this component. An extension influencing a website is expected behavior. An extension influencing a component baked into the browser is a security flaw. Google patched this January 5, 2026 in Chrome 143.0.7499.192/.193. Unit 42 reported it October 23, 2025.</p><p><strong>Why it matters</strong></p><ul><li><p>AI features embedded in the browser create privilege escalation paths that didn&#8217;t exist before. The capabilities granted to make the assistant useful become the attacker&#8217;s gain.</p></li><li><p>The declarativeNetRequest API is used by millions of legitimate extensions. Any extension holding that permission could have exploited this.</p></li><li><p>Enterprise Chrome fleets may lag on patches. Individual users update automatically. Managed deployments need active verification.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Confirm Chrome is at 143.0.7499.192 or later across all enterprise endpoints.</p></li><li><p>Audit installed extensions with declarativeNetRequest permissions. Remove anything not explicitly approved.</p></li><li><p>Add AI browser panels to your ongoing threat model. The same architectural pattern exists in Copilot in Edge and other embedded AI assistants.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This vulnerability pattern will repeat. Every vendor shipping an embedded AI assistant is granting that panel elevated access to make it useful, then relying on the browser&#8217;s isolation model to prevent exploitation. The Gemini panel inherited browser-level privileges while the security policy hadn&#8217;t caught up. That&#8217;s not a Google-specific design flaw. It&#8217;s the natural consequence of rushing AI features into security models built for a different threat landscape. GlicJack was found and patched responsibly. The next one in a competitor&#8217;s AI browser feature might not be.</p><div><hr></div><h3>6. ClawJacked: Any Malicious Website Can Own Your Local AI Agent</h3><p>Oasis Security disclosed a high-severity flaw on February 28, 2026 allowing any malicious website to connect to a locally installed OpenClaw AI agent via WebSocket and take full control (WIU Cybersecurity Center, Sysdig). The attack required nothing beyond loading a malicious webpage. An attacker&#8217;s JavaScript opened a WebSocket to the agent&#8217;s localhost port and brute-forced the gateway password with no rate limiting. Once authenticated, full access: interact with the agent, dump configuration, enumerate connected devices, read logs. A companion log poisoning vulnerability allowed indirect prompt injection through data the agent processed. OpenClaw patched ClawJacked in version 2026.2.25 and the log poisoning in 2026.2.13. The same disclosure cycle included seven additional CVEs against OpenClaw: CVE-2026-25593, CVE-2026-24763, CVE-2026-25157, CVE-2026-25475, CVE-2026-26319, CVE-2026-26322, and CVE-2026-26329.</p><p><strong>Why it matters</strong></p><ul><li><p>Local AI agents create new cross-context attack surfaces. The browser&#8217;s isolation model doesn&#8217;t extend to local services. A webpage can reach localhost.</p></li><li><p>Seven CVEs in one disclosure cycle against the same product signals early-stage software with an immature security posture deployed in enterprise environments.</p></li><li><p>Log poisoning via indirect prompt injection generalizes to any agent that processes external data. The agent becomes the vehicle for attacker instructions delivered through normal telemetry.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Update OpenClaw to version 2026.2.25 or later. Non-negotiable if your organization deploys it.</p></li><li><p>Inventory which local AI agents your developers are running and what ports they&#8217;re listening on. Most users don&#8217;t understand that local agents accept browser connections.</p></li><li><p>Require rate limiting on local service authentication endpoints in any AI agent development your organization does or procures.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Seven CVEs in one batch tells you about the security review process that went into building the product, or its absence. OpenClaw is representative of a broader pattern: AI agent frameworks are shipping at startup velocity with security addressed after product-market fit. The problem is that product-market fit now means enterprise deployment, which means these vulnerabilities sit inside corporate networks before anyone notices.</p><div><hr></div><h3>7. North Korea&#8217;s Contagious Interview Campaign Is Back With 26 npm Packages</h3><p>Socket researchers disclosed March 2, 2026 a new iteration of the Contagious Interview campaign from North Korean threat group Famous Chollima, deploying 26 malicious npm packages targeting cryptocurrency and Web3 developers (The Hacker News). Packages masquerade as developer utilities. Install scripts execute automatically and fetch C2 server addresses from Pastebin content, a dead-drop resolver technique that makes the C2 infrastructure resilient: blocking domains doesn&#8217;t neutralize active infections because attackers update the Pastebin content with new addresses. The actual payload pulls from Vercel deployments, making traffic look like legitimate developer tool usage. The cross-platform RAT targets Windows, Linux, and macOS with keylogging, browser credential theft, and cryptocurrency wallet exfiltration.</p><p><strong>Why it matters</strong></p><ul><li><p>Publishing 26 plausible-looking packages to npm is a low-barrier operation that bypasses most enterprise code review.</p></li><li><p>Pastebin dead-drop C2 is a detection evasion technique most organizations haven&#8217;t built specific detection logic for.</p></li><li><p>Crypto and Web3 developers are the named target, but the payload works on any developer machine in any organization.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Implement package manifest review for new installs in developer environments. Untrusted packages entering your toolchain require explicit approval.</p></li><li><p>Block or alert on Pastebin traffic from developer machines that don&#8217;t require it for work. Pastebin as a C2 dead drop is an established pattern.</p></li><li><p>Brief cryptocurrency and Web3 development teams directly. They are specifically targeted.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Famous Chollima runs this playbook on a near-quarterly cadence and the success rate isn&#8217;t declining. Crypto theft funds sanctions-constrained North Korean government operations. This isn&#8217;t opportunistic. It&#8217;s state-directed revenue generation with a consistent target profile and consistent tooling. Your security awareness training hasn&#8217;t stopped it because awareness doesn&#8217;t change the attack surface. The attack surface is npm, Pastebin, and Vercel. Those require technical controls, not training slides.</p><div><hr></div><h3>8. The Average Enterprise Has 1,200 Unauthorized AI Applications and 14% Visibility Into Them</h3><p>A briefing published March 3, 2026, by the AIUC-1 Consortium, developed with input from Stanford&#8217;s Trustworthy AI Research Lab and more than 40 security executives from Confluent, Elastic, UiPath, and Deutsche B&#246;rse, put concrete numbers to the enterprise AI governance gap (Help Net Security). Average enterprise: 1,200 unofficial AI applications; 86% of organizations report no visibility into AI data flows; shadow AI breaches cost $670,000 more than standard incidents due to delayed detection; one in five organizations report a breach linked to unauthorized AI use.</p><p>Stanford&#8217;s Sanmi Koyejo contributed research showing fine-tuning attacks bypassed Claude Haiku in 72% of cases and GPT-4o in 57%, confirming that model-level safety controls are insufficient as standalone defenses. Actual defense requires input validation, action-level guardrails, and reasoning chain visibility operating independently of model behavior.</p><p><strong>Why it matters</strong></p><ul><li><p>1,200 unofficial AI applications per enterprise means most identity programs have a blind spot. You can&#8217;t govern what you can&#8217;t see, and you can&#8217;t detect a breach in a system you don&#8217;t know exists.</p></li><li><p>The $670,000 additional breach cost from shadow AI is the board's number. Frame AI governance conversations around detection delay, not abstract risk.</p></li><li><p>Model-level safety is not a security control you present to auditors. It&#8217;s a product feature. The bypass rates confirm it degrades under targeted attack.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Use SaaS discovery tools and proxy logs to inventory actual AI application usage, not self-reported usage. The gap between what employees say they use and what they actually use is where the exposure lives.</p></li><li><p>Define what an AI agent identity means in your IAM framework before your agents define it for you. Include API keys, OAuth grants, and service accounts belonging to AI agents.</p></li><li><p>Document controls at the input, action, and output layers separately from model behavior. Auditors need evidence that doesn&#8217;t depend on the model refusing bad requests.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The $670,000 additional breach cost from shadow AI is entirely attributable to one thing: time to detect. You can&#8217;t detect what you&#8217;re not monitoring. The 86% visibility gap translates directly into investigation time, which in turn translates into breach cost. The governance conversation isn&#8217;t about restricting AI use. It&#8217;s about making AI use visible enough that your SOC can respond when something goes wrong. Start there.</p><div><hr></div><h3>9. NIST Wants to Know How to Secure AI Agents. The Comment Window Closes Monday.</h3><p>NIST&#8217;s Center for AI Standards and Innovation published an RFI on January 8, 2026, seeking practitioner input on securing AI agent systems, with comments due March 9, 2026 (Federal Register). This is the first formal federal RFI focused specifically on agentic AI security. The comment deadline falls four days from the publication of this newsletter. The RFI asks respondents to identify the biggest security risks unique to AI agents, what defenses actually work, how to test and constrain these systems, and what standards and policy coordination are needed. A companion initiative from NIST&#8217;s National Cybersecurity Center of Excellence on AI agent identity and authorization has a separate April 2 deadline. The Trump administration renamed the AI Safety Institute as CAISI to reflect a shift from existential risk evaluation to practical standards and measurement.</p><p>You can read more about my submission in <a href="https://www.rockcybermusings.com/p/nist-ai-agent-rfi-2025-0035-human-oversight-wrong-fix">&#8220;</a><strong><a href="https://www.rockcybermusings.com/p/nist-ai-agent-rfi-2025-0035-human-oversight-wrong-fix">NIST AI Agent RFI (2025-0035): Human Oversight Is the Wrong Fix&#8221;</a></strong></p><p><strong>Why it matters</strong></p><ul><li><p>The standards that emerge from this process will shape federal procurement requirements, contracting baselines, and eventually insurance and regulatory frameworks. Practitioner input now affects what you&#8217;ll be measured against in two to three years.</p></li><li><p>The practitioners who will respond by default are academics, system integrators, and AI vendors with commercial interests in the outcome. Independent CISO voices are underrepresented in federal standards work.</p></li><li><p>NIST standards carry weight across the federal supply chain. If you sell to or partner with federal agencies, the guidance coming from this process will affect your requirements.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Submit a comment before March 9 at regulations.gov under docket NIST-2025-0035. Specific examples from your actual environment are more valuable than polished organizational submissions with no concrete data.</p></li><li><p>Flag the April 2 deadline for the companion paper on AI agent identity and authorization to whoever owns your IAM program.</p></li><li><p>Engage legal or policy counsel if your organization wants a formal submission. The deadline for that conversation is today.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Most security executives I know haven&#8217;t heard of this RFI. That&#8217;s a problem. The reason the resulting standards will be shaped by vendors instead of practitioners is that practitioners don&#8217;t show up to the process. I&#8217;m not asking you to become a standards wonk. I&#8217;m asking you to spend 30 minutes writing down what you&#8217;re actually seeing in production, the Claude Code RCE, the OpenClaw WebSocket exposure, the shadow AI breach cost, and submit it at regulations.gov. The comment period was designed for exactly that. Use it.</p><div><hr></div><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h4>OpenSSF&#8217;s TLP:CLEAR Advisory Means 47,000 Repos Are Still Exposed Right Now</h4><p>On March 1, 2026, the Open Source Security Foundation issued a TLP:CLEAR advisory prompted by the hackerbot-claw campaign, documenting the specific misconfiguration classes exploited: unsafe pull_request_target trigger configurations, overprivileged GITHUB_TOKEN scopes, unsanitized inputs in shell execution contexts, and dynamic shell execution patterns (Threat Landscape Blog). TLP:CLEAR means no restrictions on distribution. It was published specifically so every organization running public GitHub Actions workflows could read it and fix their exposure.</p><p>The bot&#8217;s profile claims 47,391 repositories scanned. That number isn&#8217;t independently verified, but StepSecurity&#8217;s analysis confirms five of seven analyzed targets were compromised during a nine-day campaign that defenders didn&#8217;t detect while it was running. No CVEs. No zero-days. Documented, preventable misconfigurations. New repositories with the same patterns are being created today.</p><p><strong>Why it matters</strong></p><ul><li><p>The advisory is available and actionable. The barrier isn&#8217;t information access. It&#8217;s distribution through the security team to the platform engineers who control the workflows.</p></li><li><p>The attack surface isn&#8217;t shrinking. Hackerbot-claw found 47,000 potentially vulnerable repositories in a week. The automation will get rerun.</p></li><li><p>Undetected campaigns running for nine days means your current GitHub Actions monitoring isn&#8217;t catching this class of attack.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Get the OpenSSF advisory to your DevSecOps and platform engineering teams today. It contains the specific patterns to search for and the specific remediation steps.</p></li><li><p>Run StepSecurity harden-runner or equivalent tooling against your public repositories. The vulnerability patterns are enumerable. Find them before the next scanner does.</p></li><li><p>Require security review for new GitHub Actions workflows before merge. The misconfigurations hackerbot-claw exploited are consistently introduced during workflow creation.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>TLP:CLEAR means the government cleared the information for public release with no restrictions. It was published so practitioners could act on it. The fact that it&#8217;s &#8220;the thing you won&#8217;t hear about&#8221; is an indictment of how security information moves through the industry. Your platform engineers are shipping features. Nobody is reading OpenSSF advisories in real time unless someone built a process for it.</p><p>The hackerbot-claw campaign didn&#8217;t require a zero-day. It required patient scanning of publicly available information about CI/CD pipeline configurations. The attacker had that process. The question for your organization is whether you have the equivalent on defense. The OpenSSF advisory is the starting point. If you want additional context on building CI/CD security programs that account for this threat class, the practitioner content at rockcybermusings.com covers it. The attack surface is documented. Close it.</p><p>If you found this analysis useful, subscribe at <a href="https://rockcybermusings.com/">rockcybermusings.com</a> for weekly intelligence on AI security developments.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Awesome Agents. (2026, March 2). <em>An AI agent just pwned Trivy&#8217;s 32K-star repo via GitHub Actions.</em> https://awesomeagents.ai/news/hackerbot-claw-trivy-github-actions-compromise/</p><p>BleepingComputer. (2026, March 2). <em>CyberStrikeAI tool adopted by hackers for AI-powered attacks.</em> https://www.bleepingcomputer.com/news/security/cyberstrikeai-tool-adopted-by-hackers-for-ai-powered-attacks/</p><p>Check Point Research. (2026, February 25). <em>Caught in the hook: RCE and API token exfiltration through Claude Code project files.</em> https://research.checkpoint.com/2026/rce-and-api-token-exfiltration-through-claude-code-project-files-cve-2025-59536/</p><p>Cybernews. (2026, March 4). <em>AI bot compromises five major GitHub repositories.</em> https://cybernews.com/security/claude-powered-ai-bot-compromises-five-github-repositories/</p><p>Cybernews. (2026, March 4). <em>Open some code, Claude Code runs with hacker&#8217;s instructions.</em> https://cybernews.com/security/claude-code-critical-vulnerability-enabled-rce/</p><p>Dark Reading. (2026, February 28). <em>Flaws in Claude Code put developers&#8217; machines at risk.</em> https://www.darkreading.com/application-security/flaws-claude-code-developer-machines-risk</p><p>Federal Register. (2026, January 8). <em>Request for information regarding security considerations for artificial intelligence agents</em> (Docket NIST-2025-0035). https://www.federalregister.gov/documents/2026/01/08/2026-00206/request-for-information-regarding-security-considerations-for-artificial-intelligence-agents</p><p>Help Net Security. (2026, March 3). <em>AI went from assistant to autonomous actor and security never caught up.</em> https://www.helpnetsecurity.com/2026/03/03/enterprise-ai-agent-security-2026/</p><p>NIST Center for AI Standards and Innovation. (2026, January 12). <em>CAISI issues request for information about securing AI agent systems.</em> https://www.nist.gov/news-events/news/2026/01/caisi-issues-request-information-about-securing-ai-agent-systems</p><p>Orca Security. (2026, March 3). <em>HackerBot-Claw: An AI-assisted campaign targeting GitHub Actions pipelines.</em> https://orca.security/resources/blog/hackerbot-claw-github-actions-attack/</p><p>Palo Alto Networks Unit 42. (2026, March 2). <em>Taming agentic browsers: Vulnerability in Chrome allowed extensions to hijack new Gemini panel.</em> https://unit42.paloaltonetworks.com/gemini-live-in-chrome-hijacking/</p><p>SC Media. (2026, March 2). <em>Google Chrome vulnerability risked hijacking Gemini panel by rogue extension.</em> https://www.scworld.com/news/google-chrome-vulnerability-risked-hijacking-gemini-panel-by-rogue-extension</p><p>Security Affairs. (2026, March 2). <em>Untrusted repositories turn Claude Code into an attack vector.</em> https://securityaffairs.com/188508/security/untrusted-repositories-turn-claude-code-into-an-attack-vector.html</p><p>StepSecurity. (2026, March 3). <em>Hackerbot-claw: An AI-powered bot actively exploiting GitHub Actions.</em> https://www.stepsecurity.io/blog/hackerbot-claw-github-actions-exploitation</p><p>Sysdig. (2026, March 4). <em>Security briefing: February 2026.</em> https://www.sysdig.com/blog/security-briefing-february-2026</p><p>The Hacker News. (2026, March 3). <em>Open-source CyberStrikeAI deployed in AI-driven FortiGate attacks across 55 countries.</em> https://thehackernews.com/2026/03/open-source-cyberstrikeai-deployed-in.html</p><p>The Hacker News. (2026, March 3). <em>New Chrome vulnerability let malicious extensions escalate privileges via Gemini panel.</em> https://thehackernews.com/2026/03/new-chrome-vulnerability-let-malicious.html</p><p>The Hacker News. (2026, February 28). <em>Claude Code flaws allow remote code execution and API key exfiltration.</em> https://thehackernews.com/2026/02/claude-code-flaws-allow-remote-code.html</p><p>The Hacker News. (2026, March 2). <em>North Korean hackers publish 26 npm packages hiding Pastebin C2 for cross-platform RAT.</em> https://thehackernews.com/2026/03/north-korean-hackers-publish-26-npm.html</p><p>Threat Landscape Blog. (2026, March 5). <em>Hackerbot-Claw: AI bot exploiting GitHub Actions CI/CD misconfigs for repo takeover.</em> https://threatlandscape.io/blog/hackerbot-claw-ai-bot-github-actions-supply-chain-attack</p><p>[un]prompted. (2026). <em>Agenda &#8212; [un]prompted, The AI Security Practitioner Conference, March 3-4, 2026.</em></p><p> https://unpromptedcon.org/</p><p>WIU Cybersecurity Center. (2026). <em>Cybersecurity news.</em> Western Illinois University. https://www.wiu.edu/cybersecuritycenter/cybernews.php</p>]]></content:encoded></item><item><title><![CDATA[Agentic AI Authorization: From T-Shaped to Z-Shaped Security]]></title><description><![CDATA[Context engineering is authorization engineering. Staff accordingly]]></description><link>https://www.rockcybermusings.com/p/agentic-ai-authorization-z-shaped-security</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/agentic-ai-authorization-z-shaped-security</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 03 Mar 2026 13:50:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!U5e_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U5e_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U5e_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!U5e_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!U5e_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!U5e_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U5e_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3153708,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188792880?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U5e_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!U5e_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!U5e_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!U5e_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49c28d-919b-4431-90c8-c24487a2a44d_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/agentic-ai-authorization-z-shaped-security?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/agentic-ai-authorization-z-shaped-security?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>The T-shaped professional built the modern internet. Broad skills, deep expertise in one vertical, and the collaborative instinct to ship products at scale. That model worked. It still works for a lot of things. But 88% of organizations reported confirmed or suspected AI agent security incidents in the past year, according to Gravitee&#8217;s <a href="https://www.gravitee.io/state-of-ai-agent-security">State of AI Agent Security 2026</a> report. The T-shape isn&#8217;t enough anymore. Securing agentic AI demands a different professional geometry, and most teams haven&#8217;t made the upgrade.</p><h2>The LinkedIn Conversation That Started This</h2><p><a href="https://www.linkedin.com/in/llangdon/">Lock Langdon</a>, VP and CISO at Aprio, recently posted something on LinkedIn that got me thinking. He&#8217;d been building a security review tool in Claude Code and had a realization about why broad informational context matters more than clever prompts. He referenced the <a href="https://steamcdn-a.akamaihd.net/apps/valve/Valve_Handbook_LowRes.pdf">Valve employee handbook</a> and its description of T-shaped people, then connected it to effective AI use.</p><p>His observation was sharp: &#8220;AI rewards people who can think across disciplines.&#8221; Breadth gives you the ability to steer, validate, and connect dots that the model won&#8217;t connect for you.</p><p>He&#8217;s right. And he&#8217;s describing the exact professional shape that&#8217;s now getting exposed as insufficient for agentic AI security.</p><p>Valve&#8217;s handbook clearly spells out the T-shape. They value people who are &#8220;both generalists (highly skilled at a broad set of valuable things) and also experts&#8221; in one narrow discipline. The horizontal bar gives you range. The vertical stroke gives you depth. For building products in a flat organization, it&#8217;s a brilliant hiring filter.</p><p>For securing autonomous AI agents? It&#8217;s only half the picture.</p><h2>The Z-Shaped Professional and the Missing Diagonal</h2><p>Malcolm Harkins, currently Chief Security and Trust Officer at HiddenLayer, pushed the T-shape further in his book <em><a href="https://link.springer.com/book/10.1007/978-1-4302-5114-9">Managing Risk and Information Security: Protect to Enable</a></em>. He introduced the concept of the Z-shaped professional, and it&#8217;s the piece most AI teams are missing.</p><p>The Z-shape keeps the T&#8217;s horizontal bar (business acumen across the organization) and vertical stroke (technical depth). What it adds is a diagonal connecting the two. That diagonal represents deep risk and security knowledge, the translation layer that maps business constraints to technical controls and explains to the board why a security decision is a business decision.</p><p>Think about what that diagonal actually means in practice. A T-shaped engineer can build you an AI agent that works. A Z-shaped security professional can tell you whether that agent should be allowed to work with the permissions it&#8217;s requesting, within the business context it&#8217;s operating in, against the threat model you haven&#8217;t written yet.</p><p>That diagonal is the part nobody wants to do the hard work of encoding into their AI tooling. I said that in my response to Lock&#8217;s post, and I meant it. Everyone&#8217;s excited about broad context until I ask who defined the authorization boundaries. The room gets really quiet.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wvYp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wvYp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 424w, https://substackcdn.com/image/fetch/$s_!wvYp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 848w, https://substackcdn.com/image/fetch/$s_!wvYp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 1272w, https://substackcdn.com/image/fetch/$s_!wvYp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wvYp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png" width="1456" height="796" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:796,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:366112,&quot;alt&quot;:&quot;Comparison diagram showing T-shaped, Z-shaped, and AI-augmented Z-shaped professional competency models with labeled dimensions&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188792880?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Comparison diagram showing T-shaped, Z-shaped, and AI-augmented Z-shaped professional competency models with labeled dimensions" title="Comparison diagram showing T-shaped, Z-shaped, and AI-augmented Z-shaped professional competency models with labeled dimensions" srcset="https://substackcdn.com/image/fetch/$s_!wvYp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 424w, https://substackcdn.com/image/fetch/$s_!wvYp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 848w, https://substackcdn.com/image/fetch/$s_!wvYp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 1272w, https://substackcdn.com/image/fetch/$s_!wvYp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d85796f-7241-4bfa-8215-0a4fce190566_3092x1690.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: Professional Shape Evolution for AI Security</figcaption></figure></div><h2>&#8220;It Runs&#8221; vs. &#8220;It&#8217;s Safe to Run&#8221; Is a Context Engineering Problem</h2><p>Lock nailed the distinction in his original post. The difference between &#8220;it runs&#8221; and &#8220;it&#8217;s right&#8221; lies in context: threat models, business constraints, user behavior, and edge cases. I&#8217;ll double down on that and add that the difference between &#8220;it runs&#8221; and &#8220;it&#8217;s safe to run&#8221; is context engineering, and that gap is where organizations are getting wrecked.</p><p>Context engineering is the discipline of curating, structuring, and governing the information that feeds an AI system. It&#8217;s how you decide what goes into the system prompt, which tools the agent can access, what gets retrieved from your vector database, what persists in memory across sessions, and how context gets compressed when the window fills up.</p><p>Every one of those decisions is a security decision. Your system prompt defines the agent&#8217;s behavioral boundaries. Your tool access configuration determines its capability envelope. Your RAG pipeline controls what information it treats as authoritative. Your memory architecture determines what persists and what an attacker can poison.</p><p>I&#8217;ve been writing about this for months. Context engineering isn&#8217;t a developer productivity skill. It&#8217;s security engineering. The same architectural channels that make context engineering effective carry malicious payloads with equal ease. RAG retrieval pipelines that inject relevant knowledge also inject poisoned documents. Tool interfaces that offer rich functionality also expand the attack surface. Memory systems that maintain useful state also maintain poisoned state.</p><p>The data confirms this. <a href="https://www.strata.io/resources/whitepapers/securing-autonomous-ai-agents-csa-survey-report-2026-strata-identity/">The Cloud Security Alliance surveyed 285 IT and security professionals</a> and found that only 18% of security leaders feel highly confident their current IAM systems can manage agent identities. Only 28% can trace agent actions back to a human sponsor across all environments. Meanwhile, 45.6% of teams still rely on shared API keys for agent-to-agent authentication.</p><p>Training data and clever prompts don&#8217;t constitute security boundaries. An AI agent can&#8217;t tell the difference between &#8220;it runs&#8221; and &#8220;it&#8217;s safe to run&#8221; without someone encoding Z-shaped judgment into the controls. That someone needs to understand the business risk (horizontal), the technical implementation (vertical), and the security implications connecting them (diagonal).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cQXF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cQXF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 424w, https://substackcdn.com/image/fetch/$s_!cQXF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 848w, https://substackcdn.com/image/fetch/$s_!cQXF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 1272w, https://substackcdn.com/image/fetch/$s_!cQXF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cQXF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png" width="1456" height="1040" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1040,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:302033,&quot;alt&quot;:&quot;Bar chart comparing executive confidence in AI agent security versus actual implementation of controls across identity, authorization, and monitoring dimensions&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188792880?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar chart comparing executive confidence in AI agent security versus actual implementation of controls across identity, authorization, and monitoring dimensions" title="Bar chart comparing executive confidence in AI agent security versus actual implementation of controls across identity, authorization, and monitoring dimensions" srcset="https://substackcdn.com/image/fetch/$s_!cQXF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 424w, https://substackcdn.com/image/fetch/$s_!cQXF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 848w, https://substackcdn.com/image/fetch/$s_!cQXF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 1272w, https://substackcdn.com/image/fetch/$s_!cQXF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a805ec-dc1b-4f65-8115-c5a53cd69c26_3500x2500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: The AI Agent Security Confidence Gap</figcaption></figure></div><h2>The Four Layers Nobody Wants to Do</h2><p>In my LinkedIn conversation with Lock (in the comments of his post), I laid out a framework with four layers that have to happen in order.</p><p><strong>Layer 1: Define.</strong> Articulate your risk appetite and authorization policies. What is this agent allowed to do? What data can it access? Under what conditions should it stop and ask a human? What&#8217;s the blast radius if it goes wrong? These are people decisions. No technology required. Just hard thinking.</p><p><strong>Layer 2: Encode.</strong> Take those human decisions and turn them into policy-as-code, RBAC rules, tool permission schemas, and system prompt constraints. This is where Z-shaped judgment becomes machine-readable. If your risk appetite says &#8220;never modify production databases without human approval,&#8221; that needs to exist as an enforceable rule, not a line in a wiki nobody reads.</p><p><strong>Layer 3: Enforce.</strong> Technical controls that make the encoded policies real. Container isolation so one agent can&#8217;t interfere with another&#8217;s file system. Exclusive file ownership to prevent concurrent workers from creating race conditions. Signed tool definitions so an attacker can&#8217;t poison a tool description. Rate limiting on tool invocations to prevent data exfiltration through repeated calls. Least-privilege scoping so that a database tool is read-only on specific tables, not a full admin connection.</p><p><strong>Layer 4: Verify.</strong> Human review and audit trails. Log every tool call, every parameter, every result. Run automated testing and security scanning as part of the workflow, not after it. When an agent starts behaving oddly, your logs are the only way to reconstruct what happened. And someone with Z-shaped judgment needs to review those logs, because automated monitors won&#8217;t catch a well-crafted authorization boundary violation that technically follows every rule while violating the intent.</p><p>Unfortunately, organizations tend to jump straight to Layer 3 and Layer 4. They buy tools. They poorly setup configure monitors and audit trails, and then they wonder why their agents keep doing things they weren&#8217;t supposed to do. You can&#8217;t enforce what you haven&#8217;t defined. You can&#8217;t verify compliance with policies that don&#8217;t exist.</p><p>The Gravitee report found that only 14.4% of organizations have full security approval for their AI agents before they go into production. That means 85.6% skipped the &#8220;define&#8221; step entirely or gave it a passing nod.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L69m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L69m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 424w, https://substackcdn.com/image/fetch/$s_!L69m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 848w, https://substackcdn.com/image/fetch/$s_!L69m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 1272w, https://substackcdn.com/image/fetch/$s_!L69m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L69m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png" width="1456" height="5765" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:5765,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:548365,&quot;alt&quot;:&quot;Flow diagram showing four authorization layers with qualitative maturity labels from Critical Gap to Developing&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188792880?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flow diagram showing four authorization layers with qualitative maturity labels from Critical Gap to Developing" title="Flow diagram showing four authorization layers with qualitative maturity labels from Critical Gap to Developing" srcset="https://substackcdn.com/image/fetch/$s_!L69m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 424w, https://substackcdn.com/image/fetch/$s_!L69m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 848w, https://substackcdn.com/image/fetch/$s_!L69m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 1272w, https://substackcdn.com/image/fetch/$s_!L69m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3657754c-6c19-4314-9195-85b4fdf8de33_1615x6395.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: Agentic AI Security Maturity by Layer </figcaption></figure></div><h2>What This Means for Hiring, Training, and Governance</h2><p>If your security team is staffed entirely with T-shaped professionals, you have people who can build agents and people who can spot vulnerabilities, but you don&#8217;t have people who can connect &#8220;the business needs this agent to process insurance claims&#8221; to &#8220;this agent should never access the claims adjudication database directly, only through the approved API, with row-level security scoped to the claimant&#8217;s policy.&#8221;</p><p>That connection is the diagonal, and it&#8217;s the part AI tools can&#8217;t do for you.</p><p><a href="https://neuraltrust.ai/guides/the-state-of-ai-agent-security-2026">NeuralTrust&#8217;s survey</a> of 160+ CISOs found that 73% are critically concerned about AI agent risks, but only 30% have mature safeguards. Their maturity model places 46% of enterprises in the &#8220;Reactive&#8221; tier. Reactive means you&#8217;re fixing things after they break. You skipped the define and encode layers, and now you&#8217;re running expensive cleanup operations in verify.</p><p><a href="https://www.cisco.com/site/us/en/products/security/state-of-ai-security.html">Cisco&#8217;s State of AI Security 2026 </a>report puts it bluntly: 83% of organizations planned to deploy agentic AI capabilities, but only 29% felt ready to do it securely. That 54-point gap between ambition and readiness is a define-and-encode gap. The technology exists. The professional judgment to wield it responsibly doesn&#8217;t, at least not at the scale organizations need.</p><p>For practitioners building with AI agents right now, ask yourself this: If you can&#8217;t articulate what your agent is authorized to do before it does it, what exactly are your tools enforcing?</p><h2>The OWASP Agentic Top 10 Confirms the Framework</h2><p>The OWASP GenAI Security Project released the <a href="https://genai.owasp.org/2025/12/09/owasp-genai-security-project-releases-top-10-risks-and-mitigations-for-agentic-ai-security/">Top 10 for Agentic Applications</a> in December 2025 after input from over 100 security researchers. The top concerns for agentic systems, as opposed to standalone LLMs, are memory poisoning, tool misuse, and privilege compromise.</p><p>Every one of those maps directly to a failure in the four-layer model. Memory poisoning succeeds when Layer 2 (encode) doesn&#8217;t include memory integrity controls. Tool misuse succeeds when Layer 1 (define) never articulated which tools the agent should access and under what conditions. Privilege compromise succeeds when Layer 3 (enforce) grants broad permissions because nobody did the Layer 1 work of determining what least-privilege looks like for this specific agent in this specific workflow.</p><p>The OWASP list validates that LLM security focuses on single-model interactions, while agentic security concerns what happens when models can plan, persist, and delegate across tools and systems. The attack surface isn&#8217;t the model anymore. It&#8217;s the context, and who curated it, and whether they had the Z-shaped judgment to do it securely.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bnMW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bnMW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 424w, https://substackcdn.com/image/fetch/$s_!bnMW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 848w, https://substackcdn.com/image/fetch/$s_!bnMW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 1272w, https://substackcdn.com/image/fetch/$s_!bnMW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bnMW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png" width="1456" height="744" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:744,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:663168,&quot;alt&quot;:&quot;Matrix mapping OWASP agentic AI risks to the four authorization layers showing primary and secondary failure layers&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188792880?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Matrix mapping OWASP agentic AI risks to the four authorization layers showing primary and secondary failure layers" title="Matrix mapping OWASP agentic AI risks to the four authorization layers showing primary and secondary failure layers" srcset="https://substackcdn.com/image/fetch/$s_!bnMW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 424w, https://substackcdn.com/image/fetch/$s_!bnMW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 848w, https://substackcdn.com/image/fetch/$s_!bnMW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 1272w, https://substackcdn.com/image/fetch/$s_!bnMW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F509a798e-a06e-41c8-9d16-2aedd5c89668_6153x3145.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 4: OWASP Agentic Risks Mapped to Authorization Layers</figcaption></figure></div><h2>Building Your Z-Shape</h2><p>The T-shaped professional built the internet. The Z-shaped professional has to secure the agents running on it. That&#8217;s not a criticism of the T-shape. It&#8217;s a recognition that agentic AI operates at a different level of autonomy and requires a different level of professional judgment to govern.</p><p>If you&#8217;re a security architect, start encoding your organization&#8217;s risk appetite into machine-readable policies. Not guidelines. Policies that tools can enforce.</p><p>If you&#8217;re a CISO, staff for the diagonal. Find people who can translate business risk into technical controls and articulate why an agent&#8217;s tool permissions matter to the quarterly risk review.</p><p>If you&#8217;re an engineer building with AI agents, stop thinking of context engineering as a performance optimization exercise. Every context decision is an authorization decision. Treat it that way.</p><p><strong>Key Takeaway:</strong> Your AI agent is exactly as secure as the weakest layer in your define-encode-enforce-verify chain, and most organizations haven&#8217;t started Layer 1.</p><h3>What to do next</h3><p>The four-layer model maps directly to the <a href="https://rockcyber.com">CARE framework</a> I use with clients: Create your governance foundations and authorization policies, Adapt them as threats and regulations shift, Run them as enforceable controls in production, and Evolve through continuous learning and post-incident review. If your agentic AI program skipped the Create step, you&#8217;re building on sand.</p><p>For a deeper look at how security leadership needs to evolve alongside AI, my book <em><a href="https://www.amazon.com/CISO-Evolution-Knowledge-Cybersecurity-Executives/dp/1119782481">The CISO Evolution</a></em> addresses the structural shift from technical security management to business-aligned risk governance. It was written before this current AI boom, but the principles still apply. The Z-shaped diagonal isn&#8217;t new. What&#8217;s new is that AI agents now operate at machine speed, which means the consequences of missing that diagonal arrive at machine speed too.</p><p>I&#8217;ve been tracking the intersection of context engineering and security engineering at <a href="https://rockcybermusings.com">RockCyber Musings</a>, including technical deep-dives on tool poisoning, memory integrity, and the authorization gaps in current AI frameworks. If this topic matters to your organization, that&#8217;s where the ongoing work lies.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 27 February 20, 2026 - February 26, 2026]]></title><description><![CDATA[Pentagon, Prompt Injection, and China&#8217;s AI Playbook: The Week AI Security Got Loud]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260220-20260226</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260220-20260226</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 27 Feb 2026 13:50:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!BTYW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BTYW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BTYW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BTYW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!BTYW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!BTYW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BTYW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/189344939?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BTYW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BTYW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!BTYW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!BTYW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e04693-c43f-43f6-875c-3b55f83916c4_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260220-20260226?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260220-20260226?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>The week of February 20, 2026, delivered a reckoning. Not in the abstract, conference-keynote sense. In the concrete sense of &#8220;your AI vendor might get declared a supply chain risk by your own government.&#8221; While security teams spent their days triaging a GitHub Copilot prompt injection that could take over your entire repository, Anthropic&#8217;s CEO was in a room with Pete Hegseth, being threatened with the Defense Production Act. Simultaneously, industrial-scale AI distillation attacks by Chinese labs were exposed, a financially motivated hacker with modest skills used commercial GenAI to breach 600 firewalls across 55 countries, and IBM&#8217;s annual X-Force report confirmed what most of us already knew but hadn&#8217;t put numbers to. This was not a normal week. Welcome to the AI security present tense.</p><p>If you&#8217;re a CISO trying to explain to your board why your vendor risk program now includes monitoring geopolitical standoffs between AI labs and cabinet secretaries, I feel for you. This newsletter exists precisely for that conversation. Bookmark the archive at <a href="https://rockcybermusings.com/">rockcybermusings.com</a> and check <a href="https://www.rockcyber.com/">rockcyber.com</a> for the advisory work that goes deeper.</p><h3>1. Pentagon Threatens to Blacklist Anthropic Over &#8220;Woke AI&#8221; Refusal to Drop Safeguards</h3><p>On February 24, Defense Secretary Pete Hegseth met with Anthropic CEO Dario Amodei and issued an ultimatum: strip the safety restrictions from Claude or face cancellation of Anthropic&#8217;s $200 million DoD contract, designation as a &#8220;supply chain risk,&#8221; and potential compulsion under the Defense Production Act. The sticking points are Claude&#8217;s restrictions against use in autonomous weapons without human oversight and mass domestic surveillance, positions Amodei has publicly defended for months. By February 25, the Pentagon had reached out to Boeing and Lockheed Martin, asking for an assessment of their dependence on Claude, a formal first step toward the supply chain risk designation. The designation, normally reserved for adversarial foreign vendors like Huawei, would effectively blacklist Anthropic across the defense industrial base. Claude is currently the only AI model cleared for use in classified U.S. military settings (NPR, Axios).</p><p><strong>Why it matters</strong></p><ul><li><p>The &#8220;supply chain risk&#8221; label would cascade across DoD prime contractors, potentially forcing them to strip Claude from pipelines where it&#8217;s embedded in sensitive workflows, regardless of whether Anthropic&#8217;s position changes.</p></li><li><p>This sets a precedent for government compulsion of AI lab design decisions, specifically the argument that AI safety guardrails constitute a national security liability rather than an asset.</p></li><li><p>Competing models from Google, OpenAI, and xAI have already agreed to &#8220;all lawful use&#8221; terms, meaning Anthropic could lose classified-domain market share to firms that accepted no comparable restrictions.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If your organization operates in the defense contracting space, audit how deeply Claude is integrated in classification-adjacent workflows and assess your switching timeline if Anthropic loses its clearance standing.</p></li><li><p>Brief your board on the federal AI governance trajectory: the administration&#8217;s &#8220;innovation-first&#8221; stance is actively hostile to AI safety conditions as a contractual requirement.</p></li><li><p>Watch the DPA invocation question closely. If the administration successfully compels an AI lab&#8217;s design choices via DPA, your vendor agreements with any AI provider become significantly less predictable.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Let&#8217;s be clear about what&#8217;s actually happening. The administration is demanding that an AI company strip restrictions on autonomous lethal force and mass domestic surveillance. Anthropic said no. Hegseth called that &#8220;woke AI.&#8221; I call it one of the most consequential AI governance fights in U.S. history, playing out while most of the industry&#8217;s attention was focused on vulnerabilities in developer tooling.</p><p>The era of assuming your AI vendor&#8217;s ethics are your ethics is over. When a government can designate your AI provider as a supply chain risk for maintaining safety policies, those policies become a business continuity variable. Build your AI governance program like that&#8217;s true, because it is.</p><h3>2. Anthropic Exposes Industrial-Scale Claude Distillation Attacks by Three Chinese AI Labs</h3><p>On February 23, Anthropic published a detailed disclosure identifying three Chinese AI laboratories, DeepSeek, Moonshot AI, and MiniMax, as having conducted coordinated campaigns to extract Claude&#8217;s capabilities at industrial scale. The three labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, bypassing China&#8217;s access restrictions using commercial proxy services running what Anthropic calls &#8220;hydra cluster&#8221; architectures, sprawling networks designed to mix distillation traffic with legitimate requests. Anthropic attributed each campaign with high confidence using IP correlation, request metadata, and infrastructure indicators. MiniMax drove the heaviest traffic, over 13 million exchanges, and Anthropic detected its campaign while still active, watching MiniMax pivot within 24 hours of Claude&#8217;s new model release to begin extracting the latest capabilities. DeepSeek&#8217;s requests specifically targeted reasoning capabilities and censorship-safe alternatives to politically sensitive queries, consistent with training models to evade content restrictions (Anthropic blog, The Register, TechCrunch, CNN).</p><p><strong>Why it matters</strong></p><ul><li><p>Models built through illicit distillation strip the safety guardrails implemented by the originating lab. Chinese actors can acquire frontier AI capabilities without the restrictions that prevent those systems from assisting with bioweapons synthesis, offensive cyber operations, or disinformation at scale.</p></li><li><p>The scale of extraction, 16 million exchanges across 24,000 accounts, demonstrates this is not opportunistic scraping. It is a structured intelligence collection operation against American AI infrastructure.</p></li><li><p>Anthropic&#8217;s disclosure adds evidentiary support for stricter AI export controls and industry-wide distillation detection infrastructure, neither of which currently exists in a coordinated form.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Review any AI products or internal tools that wrap or call third-party model APIs. If those calls touch frontier models, understand whether your data is insulated from aggregation risks similar to these distillation patterns.</p></li><li><p>Engage your legal and procurement teams on supply chain risk assessments for AI model providers used by your organization, specifically their ToS enforcement posture and detection capabilities.</p></li><li><p>Track the policy response. Anthropic is explicitly calling for &#8220;a coordinated response across the AI industry, cloud providers, and policymakers.&#8221; That framing is the precursor to regulatory proposals.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The distillation story gets presented as an IP theft problem. It is that, but only partly. The deeper problem is the safety bypass. When MiniMax runs 13 million exchanges through Claude to train its own model, it&#8217;s not paying licensing fees, but it&#8217;s also not inheriting Claude&#8217;s constitutional restrictions against helping with bioweapons or cyberattacks. It gets the capability without the cage.</p><p>The &#8220;hydra cluster&#8221; architecture Anthropic describes, 20,000 fraudulent accounts blending distillation traffic with legitimate use, is a detection problem that no single company can solve alone. This is what a coordinated AI security posture looks like when it&#8217;s absent. The industry needs shared threat intelligence on distillation patterns the same way it shares threat intel on ransomware actor TTPs. We don&#8217;t have that yet.</p><h3>3. RoguePilot: Passive Prompt Injection in GitHub Codespaces Enables Full Repository Takeover</h3><p>On February 24, Orca Security disclosed RoguePilot, a passive prompt-injection vulnerability in GitHub Codespaces that enabled attackers to achieve a full repository takeover by embedding malicious instructions in a GitHub Issue. No exploitation of Codespaces itself was required. When a developer opened a Codespace from a poisoned Issue, GitHub Copilot was immediately prompted with the Issue&#8217;s description and executed the hidden instructions, which were concealed inside HTML comments invisible to human reviewers. The attack chain exfiltrated the GITHUB_TOKEN by manipulating Copilot to access a symlinked sensitive file and append the token to a schema download request, leaking it to an attacker-controlled server. Microsoft patched the vulnerability following responsible disclosure, and no CVE had been assigned at time of reporting. Researcher Roi Nisimi at Orca called it a new class of AI-mediated supply chain attack (Orca Security, SecurityWeek, The Hacker News).</p><p><strong>Why it matters</strong></p><ul><li><p>The attack required no special privileges. Anyone who could create or view a GitHub Issue in a targeted repository could trigger it, placing it within reach of anonymous threat actors and insider risks alike.</p></li><li><p>This is a demonstration that AI agents with God Mode permissions, terminal access, file reads, API tokens, and network connectivity, cannot reliably distinguish between developer instructions and adversarial content embedded in data.</p></li><li><p>Microsoft patched this specific chain, but the root problem, AI agents treating user-generated content as potentially trusted, persists across every developer tool that integrates an LLM agent into an active workspace.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Rotate any GITHUB_TOKENs generated in Codespaces environments. Even if your team wasn&#8217;t targeted, token hygiene is your first action.</p></li><li><p>Audit your developer tooling inventory for any AI assistants that ingest user-generated content, Issues, comments, pull requests, wikis, and operate with elevated permissions. RoguePilot will not be the last of its class.</p></li><li><p>Push your security engineering teams to review json.schemaDownload settings and symlink sandboxing defaults in any LLM-integrated development environments in your stack.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the AI agent security problem crystallized into a single attack chain. You have a model that&#8217;s been given broad permissions to be helpful in a developer context, and the moment that model processes untrusted content from a public-facing surface like a GitHub Issue, the permissions become the attack surface. The model doesn&#8217;t know to distrust the Issue. It was designed to process it.</p><p>I&#8217;ve been telling clients since the agentic AI wave hit that &#8220;helpful&#8221; and &#8220;privileged&#8221; is a dangerous combination without an explicit trust boundary model. RoguePilot proves it doesn&#8217;t take a sophisticated threat actor or a zero-day. It takes someone who understands that AI agents read everything they&#8217;re shown and act on it.</p><h3>4. GenAI Democratizes Cybercrime: Low-Skill Actor Breaches 600+ FortiGate Devices Across 55 Countries</h3><p>Amazon Threat Intelligence published a report on February 20 documenting a Russian-speaking, financially motivated threat actor, assessed as an individual or small group, who used multiple commercial generative AI services to compromise over 600 FortiGate devices in 55 countries between January 11 and February 18, 2026. No FortiGate vulnerabilities were exploited. The campaign targeted exposed management interfaces and weak single-factor credentials, using AI to automate scanning, script generation, configuration parsing, and victim prioritization. Amazon&#8217;s CISO CJ Moses described the custom tooling as bearing hallmarks of AI-generated code: redundant comments, naive JSON parsing, and failures under edge cases. When targets proved too hardened, the actor abandoned them and moved on, a pattern consistent with AI-augmented scale rather than technical depth. Post-compromise activity included Active Directory compromise, credential dumping, and targeting of Veeam backup infrastructure (AWS Security Blog, BleepingComputer, Cybersecurity Dive, The Record).</p><p><strong>Why it matters</strong></p><ul><li><p>The campaign demonstrates that AI services lower the barrier to entry for offensive cyber operations at scale. A single actor achieved what would historically require a well-resourced team.</p></li><li><p>The activity didn&#8217;t rely on any novel exploits. Exposed management ports and missing MFA are the root causes. AI just made systematic exploitation of those gaps cheap and fast.</p></li><li><p>Post-compromise actions, Active Directory compromise and backup targeting, align with ransomware pre-positioning. The technical unsophistication of the actor does not constrain the damage ceiling.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Immediately audit all perimeter devices, especially FortiGate, Palo Alto, and similar appliances, for management interfaces exposed to the internet on any port.</p></li><li><p>Enforce MFA across all VPN and admin access without exceptions. This campaign succeeded entirely because MFA was absent.</p></li><li><p>Harden backup infrastructure separately from production networks. Backup access should require distinct credentials, not shared Active Directory credentials, and should not be reachable from compromised VPN paths.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Amazon&#8217;s description of this actor as achieving &#8220;operational scale that would have previously required a significantly larger and more skilled team&#8221; is the AI security risk thesis for 2026 stated as a fact. The threat actor&#8217;s own documentation, left exposed on their own infrastructure because of poor OpSec, acknowledges when targets are too hardened to exploit. They just move on. That&#8217;s not a skill constraint, it&#8217;s a volume play.</p><p>The lesson here is not new. Missing MFA and exposed management interfaces have been on the &#8220;fix this now&#8221; list for a decade. What&#8217;s new is that ignoring them now feeds a threat actor pipeline that doesn&#8217;t get tired, doesn&#8217;t need to be patient, and doesn&#8217;t cost much to operate. The economics of cyber offense just changed again.</p><h3>5. IBM X-Force 2026 Threat Index: AI Accelerates Exploitation Speed, 300K ChatGPT Credentials Exposed</h3><p>IBM published the 2026 X-Force Threat Intelligence Index on February 25, reporting a 44% increase in attacks initiated through the exploitation of public-facing applications, driven by missing authentication controls and AI-enabled vulnerability discovery. Vulnerability exploitation became the leading cause of incidents, accounting for 40% of cases in 2025. Active ransomware and extortion groups increased 49% year over year. Supply chain and third-party compromises nearly quadrupled since 2020. On the AI-specific threat front, X-Force identified over 300,000 exposed ChatGPT credentials from infostealer malware, signaling that AI platforms have reached the same credential risk profile as core enterprise SaaS systems. IBM also noted that North Korean IT worker schemes are using AI for synthetic identity creation and translation to operate across global marketplaces (IBM X-Force, UK Newsroom).</p><p><strong>Why it matters</strong></p><ul><li><p>The ChatGPT credential exposure figure, 300,000 accounts, tells you that enterprise AI platform access is now a target of credential theft campaigns. Account compromise of AI tools creates follow-on risks including manipulated outputs, data exfiltration, and prompt injection by threat actors who gain access.</p></li><li><p>The 44% jump in public-facing application exploitation reflects that AI tools are helping attackers identify missing auth controls faster than organizations can patch them.</p></li><li><p>North Korean actors using AI for synthetic identity creation to penetrate companies as fake remote IT workers is an active, documented threat that most security teams are unprepared to detect.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Treat enterprise AI platform accounts like you treat cloud admin accounts. Enforce MFA, monitor for credential exposures via dark web intelligence, and rotate credentials regularly.</p></li><li><p>Add AI service accounts to your privileged access management inventory. ChatGPT, Claude, Gemini, and Copilot accounts with sensitive system access are high-value targets.</p></li><li><p>Build screening processes for remote technical hires that account for AI-augmented identity fabrication. Resume AI content detection is insufficient. Require live video verification with behavioral assessment.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Three hundred thousand exposed ChatGPT credentials. Let that settle for a moment. A year ago, someone might have asked why stealing a ChatGPT account matters. Now, with AI tools embedded in code pipelines, document workflows, and agentic systems with access to production data, the answer is obvious. Your AI account is your data account.</p><p>IBM&#8217;s broader finding on supply chain compromise quadrupling since 2020 is the decade-long trend nobody fixed. We&#8217;ve known this curve for years. AI is not changing the attack surface, it&#8217;s accelerating movement across an attack surface that was already too wide.</p><h3>6. OpenAI&#8217;s February Threat Report: Chinese Law Enforcement Uses ChatGPT to Target Japan&#8217;s Prime Minister</h3><p>OpenAI published a new threat disruption report on February 25, detailing recent cases of AI misuse by nation-state and criminal actors. The report&#8217;s lead case involved a Chinese law enforcement operator using a ChatGPT account to attempt to undermine support for Japan&#8217;s Prime Minister through a coordinated influence operation. OpenAI&#8217;s principal investigator Ben Nimmo described the operation as unusually revealing of China&#8217;s strategy for covert influence operations and transnational repression. OpenAI banned the associated accounts and shared indicators with industry partners. The report also documented cases of AI-generated content being used across multiple platforms simultaneously, with threat actors using different AI models at different stages of their operational workflow (OpenAI, Axios).</p><p><strong>Why it matters</strong></p><ul><li><p>Government-affiliated actors using commercial AI tools for foreign influence operations demonstrate that the threat is not limited to custom AI systems. Commodity access is sufficient.</p></li><li><p>The targeting of a democratic leader&#8217;s public legitimacy via AI-assisted influence operations previews what election and leadership integrity risks look like at scale.</p></li><li><p>OpenAI&#8217;s two-year publication cadence on these reports is generating actionable public threat intelligence, but the reports also reveal how much AI platforms are dealing with that goes unreported.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Build operational awareness of influence operations into your executive protection and media monitoring programs. AI-generated content can now target company leadership and board members with the same sophistication used against government officials.</p></li><li><p>If your organization operates in sectors with geopolitical exposure, energy, defense, or critical infrastructure, factor AI-assisted influence operations into your threat model.</p></li><li><p>Monitor OpenAI&#8217;s threat disruption report series as a recurring intelligence source. It&#8217;s public, primary, and reflects actual activity on a major AI platform.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The interesting detail in this report is that the Chinese law enforcement operator used ChatGPT as one tool in a broader workflow that touched multiple AI platforms. That&#8217;s consistent with what Amazon observed in the FortiGate campaign and what Anthropic documented with the distillation attacks. Threat actors don&#8217;t pick one AI tool. They build multi-model workflows just like enterprise teams do.</p><p>The policy implication is significant. Restricting any single AI platform&#8217;s use by foreign actors doesn&#8217;t solve the problem. The capability is distributed. Coordination across AI providers for threat intelligence sharing is the only approach that has a realistic shot at tracking these operational workflows.</p><h3>7. Anthropic Launches Claude Code Security: AI-Powered Vulnerability Scanning for Enterprise</h3><p>On February 21, Anthropic began rolling out Claude Code Security, a new capability allowing Claude Code to scan software codebases for vulnerabilities and suggest patches. The feature is in limited research preview for Enterprise and Team customers. The announcement triggered a two-day sell-off in cybersecurity stocks, with GitLab dropping 8% and JFrog falling 25% amid fears that AI-native vulnerability scanning would cannibalize dedicated code-scanning platforms. Bank of America analysts pushed back on the severity of the threat, arguing that the tool poses a significant risk only to code-scanning specialists rather than broader security platforms, and noting that AI-based tools lack the visibility, control, and reliability to replace end-to-end security programs (WIU Cybersecurity Center, CNBC).</p><p><strong>Why it matters</strong></p><ul><li><p>AI-native vulnerability scanning built directly into the development workflow is fundamentally different from dedicated SAST/DAST tools. If it achieves comparable accuracy, the vendor selection calculus for code security tools shifts.</p></li><li><p>The market reaction, wiping $4.3 billion in GitLab&#8217;s market cap in two days, reflects real investor concern about AI disruption of security tooling categories, not just hype.</p></li><li><p>For CISOs, this is a sign to evaluate your code scanning vendor relationships against the trajectory of AI-native alternatives, not to replace them immediately, but to understand your exit options.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Request a technical briefing from your code scanning vendor on how they are differentiating from AI-native alternatives. Price, integration, and accuracy benchmarks matter.</p></li><li><p>Pilot Claude Code Security with a representative engineering team on a low-risk codebase. Generate your own internal benchmark data rather than relying on vendor comparisons.</p></li><li><p>Reframe your security tool evaluation process to account for AI-native alternatives in every category, not just code scanning. This is not the last product of this type.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The stock market reaction was overstated. Bank of America is right that AI-powered code scanning doesn&#8217;t replace end-to-end security platforms, at least not yet. But the trajectory matters more than the present capability gap. When Anthropic can embed vulnerability detection directly into the coding workflow where developers are already working, the friction of adopting a separate scanning tool becomes a competitive disadvantage for established vendors.</p><p>The question I&#8217;m asking clients is simpler: if your developers are already using Claude Code daily, and Claude Code Security reduces their friction for addressing vulnerabilities without switching tools, what is the case for a standalone scanning product that adds another context switch?</p><h3>8. Malicious npm Campaign SANDWORM_MODE Uses 19 Packages to Harvest Crypto Keys and CI Secrets</h3><p>On February 23, supply chain security firm Socket disclosed an active campaign it codenamed SANDWORM_MODE, a cluster of at least 19 malicious npm packages designed to harvest cryptocurrency keys, CI/CD secrets, and API tokens from developer environments. The packages used dependency confusion and typosquatting techniques to position themselves for installation by developers targeting legitimate packages. Socket described the campaign as a &#8220;Shai-Hulud-like&#8221; supply chain worm designed to propagate through developer ecosystems by targeting shared package dependencies across interconnected projects (The Hacker News, WIU Cybersecurity Center).</p><p><strong>Why it matters</strong></p><ul><li><p>CI/CD secret theft is the direct path to production system compromise. Developer machines and pipelines hold credentials with scope that far exceeds what individual credentials should carry.</p></li><li><p>Supply chain attacks on npm are not novel, but the targeting of cryptocurrency infrastructure alongside CI secrets reflects a broadened target set, both immediate financial gain and longer-term access to production pipelines.</p></li><li><p>Organizations with open-source dependencies in their build pipelines and no dependency integrity validation are running a systemic risk that this campaign is designed to exploit.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run a dependency audit on your npm packages today, specifically checking for recently added packages matching the naming patterns of heavily used libraries. Tool your CI pipeline to flag any package published in the last 90 days with rapid download acceleration.</p></li><li><p>Enforce software bill of materials requirements across your internal build pipelines and require cryptographic attestation for all published packages.</p></li><li><p>Restrict CI/CD secrets to minimal scope, rotate them on a regular schedule, and audit which pipelines are storing credentials with broader access than their task requires.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Supply chain attacks on developer tooling have a compounding quality that distinguishes them from most other threat categories. When you compromise a developer&#8217;s machine or CI pipeline, you get access to the code before it ships. That means you get to touch production systems through a path that looks like normal development activity. Security controls built around the production perimeter don&#8217;t see it coming.</p><p>SANDWORM_MODE landing one week after Cline CLI&#8217;s supply chain compromise and in the same month as RoguePilot is not a coincidence. The developer toolchain is the current hot frontier. If your security program doesn&#8217;t have explicit coverage of the npm ecosystem, your CI/CD secrets posture, and your AI-assisted development tooling, you have a gap that multiple active campaigns are targeting right now.</p><h3>9. AI-Driven Cyberattacks Now Breach Systems in an Average of 72 Minutes</h3><p>A study published February 23 via BusinessWorld Online and cited in a February cybersecurity roundup found that AI-driven cyberattacks now breach target systems in an average of 72 minutes from initial contact. The figure illustrates how AI tooling is compressing the exploitation timeline by automating vulnerability identification, credential testing, lateral movement planning, and exploitation scripting in near-real time. The finding aligns with Amazon&#8217;s FortiGate campaign documentation and IBM X-Force&#8217;s 44% jump in application exploitation figures published the same week, suggesting a convergence of independent research pointing to the same fundamental shift (BusinessWorld Online, Advanced IT Technologies cybersecurity roundup).</p><p><strong>Why it matters</strong></p><ul><li><p>A 72-minute average breach timeline means that the traditional &#8220;detect, investigate, respond&#8221; model is effectively broken at its current operational tempo. You need detection and automated containment that operates in minutes, not hours.</p></li><li><p>The compression of attack timelines is a direct consequence of AI tooling automating what previously required skilled human intervention at each step.</p></li><li><p>Your incident response plans and SOC SLAs were probably not written with a 72-minute adversary dwell window in mind. That&#8217;s the process gap this creates.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Benchmark your current mean time to detect against the 72-minute figure. If your average detection time exceeds that, you are consistently behind the adversary timeline.</p></li><li><p>Implement automated network-isolation triggers for anomalous credential usage and lateral-movement indicators that don&#8217;t require human approval in the initial response phase.</p></li><li><p>Run a tabletop exercise against a 72-minute breach scenario. The point is forcing your team to confront the tempo of modern AI-assisted attacks on paper before they face it live.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Seventy-two minutes. Think about what your SOC is doing at 3:17 a.m. on a Tuesday when that timer starts. The alert might not even be triaged by the time the attacker has Active Directory. This is the operational reality that makes the continuous, automated detection and response investment non-negotiable.</p><p>The productivity framing of AI security tools, &#8220;AI helps your analysts work faster,&#8221; is a secondary benefit at best. The primary reason to deploy AI-assisted detection is that the adversary is already using it, and your organization cannot match the attack tempo with humans alone.</p><h3>10. Exposed LLM Endpoints Expanding the Attack Surface for Organizations Running Their Own Models</h3><p>On February 23, threat researchers published an analysis in The Hacker News documenting how organizations deploying their own large language models are creating new internal attack surfaces through unprotected LLM endpoints and supporting APIs. The research observed that modern security risks for LLM deployments originate less from the models themselves and more from the infrastructure serving them: APIs, orchestration layers, model registries, and tool-calling endpoints. Each new LLM endpoint expands the attack surface, often with no authentication controls, minimal logging, and no separation from internal data systems (The Hacker News, WIU Cybersecurity Center).</p><p><strong>Why it matters</strong></p><ul><li><p>Organizations are deploying internal LLMs without applying the same security posture they would apply to any other internal API. Unauthenticated model endpoints with access to internal data are exactly the kind of target that opportunistic and targeted attackers seek.</p></li><li><p>LLM orchestration layers, the middleware connecting models to internal tools, databases, and external services, represent a new class of attack surface with no established hardening standard.</p></li><li><p>The vulnerability exposure is not primarily in the model weights or training data. It is in the infrastructure decisions made during deployment, which are often made by teams with no security review.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Conduct an inventory of every internal LLM endpoint in your environment, including development and experimental deployments. Apply authentication requirements to all of them without exception.</p></li><li><p>Include AI infrastructure, model endpoints, orchestration APIs, vector databases, and tool-calling integrations in your penetration testing scope. Most current pentest methodologies do not cover these surfaces.</p></li><li><p>Establish a minimum security baseline for any internal AI deployment, including authentication, logging, rate limiting, and network segmentation, before any model reaches non-development environments.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the AI security gap that will generate the embarrassing breach stories of 2027. Right now, teams are spinning up internal RAG systems, agentic workflows, and model fine-tuning pipelines on infrastructure that wasn&#8217;t designed for adversarial access. The model is not the weak link. The unprotected API sitting in front of it, with access to your internal document store, is the weak link.</p><p>The pattern is familiar. Every time a new technology category emerges, the same thing happens: deployment races ahead of security posture, and the community learns the hard way. We&#8217;re early in the AI infrastructure deployment curve. The window to get ahead of this is narrowing fast.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h4>Infostealer Malware Now Targeting OpenClaw AI Agent Configuration Files and Gateway Tokens</h4><p>On February 24, The Hacker News reported on a new infostealer variant specifically designed to steal OpenClaw AI agent configuration files, API tokens, and gateway credentials from developer and enterprise systems where OpenClaw is installed. The malware targets OpenClaw&#8217;s local storage paths and the WebSocket Gateway daemon credentials, providing persistent, privileged access to the systems on which the agent is installed. The timing of this discovery follows the Cline CLI supply chain compromise that forced thousands of unauthorized OpenClaw installations in mid-February, and a separately disclosed critical vulnerability in OpenClaw versions before 2026.1.29 (CVE-2026-25253, CVSS 8.8) that allowed unauthenticated operator-level access through a crafted WebSocket handshake (The Hacker News).</p><p><strong>Why it matters</strong></p><ul><li><p>AI agent credentials are a new, high-value target class. An attacker with OpenClaw gateway tokens has a persistent foothold in the target environment with broad system permissions, the same permissions OpenClaw uses to perform its legitimate autonomous tasks.</p></li><li><p>The convergence of a supply chain attack forcing OpenClaw installations, a critical unpatched vulnerability in older versions, and a dedicated infostealer targeting its credentials represents a rare three-vector alignment against a single AI platform.</p></li><li><p>Organizations that don&#8217;t know OpenClaw is installed in their environment, which includes anyone who installed Cline CLI during the February 17 compromise window and didn&#8217;t remediate, are running exposed agent infrastructure they may not know exists.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Run an immediate audit for OpenClaw installations across your developer and CI/CD environments. If you find instances you didn&#8217;t authorize, treat them as compromised until proven otherwise.</p></li><li><p>Verify all OpenClaw installations are running version 2026.1.29 or later to eliminate the CVE-2026-25253 authentication bypass exposure.</p></li><li><p>Restrict OpenClaw&#8217;s network access to explicitly needed destinations and audit the permissions granted to its agent execution context. AI agents with broad permissions and persistent daemons should be treated as privileged processes, not user applications.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Nobody is writing the headline &#8220;Infostealer Targets AI Agent Credentials.&#8221; But they should be. This is the next phase of the credential theft ecosystem, and it has been visible coming for anyone who has watched how quickly OpenClaw gained deployment share in enterprise developer environments.</p><p>AI agents have privileged access to your systems by design. That&#8217;s what makes them useful. It&#8217;s also what makes their credentials extraordinarily valuable to attackers. When an infostealer can steal an AI agent&#8217;s configuration and token in the same way it steals browser-stored passwords, you&#8217;ve added a new category of credential to your exposure surface. Your existing credential monitoring program almost certainly doesn&#8217;t cover this yet.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Amazon Web Services Security. (2026, February 20). <em>AI-augmented threat actor accesses FortiGate devices at scale</em>. https://aws.amazon.com/blogs/security/ai-augmented-threat-actor-accesses-fortigate-devices-at-scale/</p><p>Anthropic. (2026, February 23). <em>Detecting and preventing distillation attacks</em>. https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks</p><p>Allyn, B. (2026, February 24). Hegseth threatens to blacklist Anthropic over &#8216;woke AI&#8217; concerns. <em>NPR</em>. https://www.npr.org/2026/02/24/nx-s1-5725327/pentagon-anthropic-hegseth-safety</p><p>Swan, J., &amp; Borsuk, A. (2026, February 25). Trump admin moves toward blacklisting Anthropic in AI safeguards fight. <em>Axios</em>. https://www.axios.com/2026/02/25/anthropic-pentagon-blacklist-claude</p><p>The Hacker News. (2026, February 24). RoguePilot flaw in GitHub Codespaces enabled Copilot to leak GITHUB_TOKEN. https://thehackernews.com/2026/02/roguepilot-flaw-in-github-codespaces.html</p><p>Orca Security. (2026, February 24). <em>RoguePilot: Critical GitHub Copilot vulnerability exploit</em>. https://orca.security/resources/blog/roguepilot-github-copilot-vulnerability/</p><p>SecurityWeek. (2026, February 24). GitHub Issues abused in Copilot attack leading to repository takeover. https://www.securityweek.com/github-issues-abused-in-copilot-attack-leading-to-repository-takeover/</p><p>The Hacker News. (2026, February 23). Anthropic says Chinese AI firms used 16 million Claude queries to copy model. https://thehackernews.com/2026/02/anthropic-says-chinese-ai-firms-used-16.html</p><p>CNBC. (2026, February 24). Anthropic accuses DeepSeek, Moonshot and MiniMax of distillation attacks on Claude. https://www.cnbc.com/2026/02/24/anthropic-openai-china-firms-distillation-deepseek.html</p><p>TechCrunch. (2026, February 23). Anthropic accuses Chinese AI labs of mining Claude as US debates AI chip exports. https://techcrunch.com/2026/02/23/anthropic-accuses-chinese-ai-labs-of-mining-claude-as-us-debates-ai-chip-exports/</p><p>IBM. (2026, February 25). <em>IBM 2026 X-Force Threat Intelligence Index</em>. https://uk.newsroom.ibm.com/ibm-2026-x-force-threat-index</p><p>BleepingComputer. (2026, February 21). Amazon: AI-assisted hacker breached 600 Fortinet firewalls in 5 weeks. https://www.bleepingcomputer.com/news/security/amazon-ai-assisted-hacker-breached-600-fortigate-firewalls-in-5-weeks/</p><p>Cybersecurity Dive. (2026, February 23). AI helps novice threat actor compromise FortiGate devices in dozens of countries. https://www.cybersecuritydive.com/news/ai-cyberattacks-fortigate-amazon/812830/</p><p>OpenAI. (2026, February 25). <em>Disrupting malicious uses of AI</em>. https://openai.com/index/disrupting-malicious-ai-uses/</p><p>CNBC. (2026, February 23). Cybersecurity stocks drop for a second day as new Anthropic tool fuels AI disruption fears. https://www.cnbc.com/2026/02/23/cybersecurity-stocks-anthropic-ai-crowdstrike.html</p><p>WIU Cybersecurity Center. (2026, February 25). <em>Cybersecurity news</em>. https://www.wiu.edu/cybersecuritycenter/cybernews.php</p><p>The Hacker News. (2026, February 23). Malicious npm packages harvest crypto keys, CI secrets, and API tokens. https://thehackernews.com/2026/02/malicious-npm-packages-harvest-crypto.html</p><p>The Hacker News. (2026, February 24). Infostealer steals OpenClaw AI agent configuration files and gateway tokens. https://thehackernews.com/2026/02/infostealer-steals-openclaw-ai-agent.html</p><p>BusinessWorld Online via Advanced IT Technologies. (2026, February 23). AI-driven cyberattacks now breach systems in 72 minutes, study finds, cited in: February 2026 cybersecurity news roundup. https://www.advancedittechnologies.com/post/february-2026-cybersecurity-news-roundup-major-breaches-ai-driven-attacks-critical-vulnerabiliti</p><p>The Hacker News. (2026, February 23). How exposed endpoints increase risk across LLM infrastructure. https://thehackernews.com/2026/02/how-exposed-endpoints-increase-risk.html</p><p>Dark Reading. (2026, February 21). 600+ FortiGate devices hacked by AI-armed amateur. https://www.darkreading.com/threat-intelligence/600-fortigate-devices-hacked-ai-amateur</p><p>The Record. (2026, February 23). Russian-speaking hackers used gen AI tools to compromise 600 firewalls, Amazon says. https://therecord.media/gen-ai-fortigate-hackers-russia</p>]]></content:encoded></item><item><title><![CDATA[Agentic AI Governance: Singapore Built the Skeleton, Not the Immune System]]></title><description><![CDATA[Singapore's agentic AI governance framework is a global first. It also has three critical gaps that create false confidence for CISOs. Here's what to fix.]]></description><link>https://www.rockcybermusings.com/p/agentic-ai-governance-singapore-framework-gaps</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/agentic-ai-governance-singapore-framework-gaps</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 24 Feb 2026 13:50:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tT82!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tT82!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tT82!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!tT82!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!tT82!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tT82!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tT82!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2890314,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188817617?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tT82!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!tT82!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!tT82!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tT82!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe973c0e9-164a-4560-bf4d-7b10a04c0ba3_2752x1536.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/agentic-ai-governance-singapore-framework-gaps?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/agentic-ai-governance-singapore-framework-gaps?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><a href="https://www.imda.gov.sg/-/media/imda/files/about/emerging-tech-and-research/artificial-intelligence/mgf-for-agentic-ai.pdf">Singapore&#8217;s Model AI Governance Framework for Agentic AI</a> landed on January 22, 2026, and it&#8217;s the first national-level governance framework purpose-built for autonomous AI agents. That matters. It also tells you what to think about without telling you what to do. And in three critical areas, what it tells you is incomplete enough to create false confidence. If you&#8217;re a CISO using this as your agentic AI governance blueprint, you need to know where the gaps will bite you.</p><h2>What Singapore gets right</h2><p>Credit where earned. IMDA produced something useful, and I want to be specific about why.</p><p>The four-dimension structure (assess and bound risks, make humans accountable, implement technical controls, enable end-user responsibility) gives organizations a governance skeleton that maps cleanly to how enterprises already think about risk management. You can hand this to a board member, and they&#8217;ll recognize the logic. That&#8217;s not nothing. Most governance frameworks read like they were written for compliance analysts who&#8217;ve never shipped a product.</p><p>The risk-factor rubric in &#167;2.1.1 is the framework&#8217;s strongest operational contribution. It breaks risk into impact factors (domain tolerance for error, access to sensitive data, access to external systems, scope of agent actions) and likelihood factors (level of autonomy, task complexity, exposure to untrusted data). This gives you a concrete rubric for evaluating whether a use case is appropriate for agent deployment. Not theoretical. Practical. The kind of table a security architect can pull into a risk register tomorrow morning.</p><p>The tradecraft preservation warning in &#167;2.4.3 is rare for a governance document and directly relevant to every security leader reading this. The framework warns that &#8220;as agents take over entry-level tasks, which typically serve as the training ground for new staff, this could lead to loss of basic operational knowledge.&#8221; If you run a security team where AI coding assistants are displacing junior analyst skill development, this section just validated a concern you&#8217;ve probably struggled to articulate to leadership. IMDA recommends organizations &#8220;identify core capabilities of each job and provide sufficient training and work exposure so that users retain foundational skills.&#8221; I&#8217;d frame it more bluntly: if your junior analysts can&#8217;t triage an alert without an AI assistant, you don&#8217;t have a security team. You have a dependency.</p><p>The graduated rollout guidance in &#167;2.3.3 recommends controlling agent deployment based on three vectors: users (trained users first), tools (whitelisted MCP servers first), and systems (lower-risk internal systems first). This is how production deployments should work. It reflects real operational experience rather than theoretical deployment models.</p><p>So yes, Singapore did something that no other nation has done. They put agentic AI governance in writing, at a national level, with enough specificity to be useful. Now let me explain why &#8220;useful&#8221; isn&#8217;t the same as &#8220;sufficient.&#8221;</p><h2>HITL at agentic scale is security theater</h2><p>The framework&#8217;s central accountability mechanism is human-in-the-loop oversight at &#8220;significant checkpoints&#8221; (&#167;2.2.2, p.16). Singapore correctly identifies automation bias as &#8220;a bigger concern with increasingly capable agents&#8221; (p. 13). The framework recommends training humans to identify failure modes, auditing oversight effectiveness, and complementing human review with automated monitoring.</p><p>All reasonable. All insufficient.</p><p>The math doesn&#8217;t work. Let me show you.</p><p>Take a mid-size enterprise running 50 agents across customer service, code review, procurement, and data analysis. Each agent makes 20 tool calls per hour. That produces 1,000 approval-eligible events per hour. Even if only 10% require human review, your security team faces 100 approval requests per hour during business operations. At two minutes per meaningful review, that consumes over three full-time equivalents solely for agent oversight. Not security work. Not threat hunting. Not incident response. Rubber-stamping.</p><p>The framework provides no guidance on this capacity calculation. No threshold for when human review becomes performative. No architectural alternative for when HITL saturates.</p><p>The <a href="https://genai.owasp.org/resource/agentic-ai-threats-and-mitigations/">OWASP Agentic AI Threats and Mitigations Guide</a> classifies Overwhelming HITL (T10) as a deliberate attack vector. Adversaries can exploit the approval bottleneck by generating a flood of low-risk requests that train reviewers to rubber-stamp, then embed high-risk actions in the stream.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0UcH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0UcH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 424w, https://substackcdn.com/image/fetch/$s_!0UcH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 848w, https://substackcdn.com/image/fetch/$s_!0UcH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 1272w, https://substackcdn.com/image/fetch/$s_!0UcH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0UcH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png" width="1456" height="3795" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3795,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:991644,&quot;alt&quot;:&quot;Bar chart showing agent approval requests per hour versus human review capacity for enterprises running 10, 25, 50, and 100 agents&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188817617?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar chart showing agent approval requests per hour versus human review capacity for enterprises running 10, 25, 50, and 100 agents" title="Bar chart showing agent approval requests per hour versus human review capacity for enterprises running 10, 25, 50, and 100 agents" srcset="https://substackcdn.com/image/fetch/$s_!0UcH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 424w, https://substackcdn.com/image/fetch/$s_!0UcH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 848w, https://substackcdn.com/image/fetch/$s_!0UcH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 1272w, https://substackcdn.com/image/fetch/$s_!0UcH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f549714-f61b-47d0-bc45-86859cf43927_3143x8192.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: HITL capacity breakdown at enterprise scale</figcaption></figure></div><p>The <a href="https://cdn.openai.com/papers/practices-for-governing-agentic-ai-systems.pdf">Anthropic/OpenAI joint paper on governing agentic AI systems</a> directly warns that &#8220;when the user must approve many decisions and thus must make each approval quickly, reducing their ability to meaningfully consider each one,&#8221; the oversight becomes performative. The framework&#8217;s answer to automation bias is training and auditing. Training doesn&#8217;t change the cognitive architecture that produces complacency under sustained approval loads. You can&#8217;t train your way out of a capacity problem.</p><p>HITL works for low-volume, high-consequence decisions. It fails everywhere else. For the vast majority of agentic operations, organizations need tiered, consequence-based automation. Auto-approve low-risk reversible actions with logging. Route medium-risk actions through a watchdog agent (what some practitioners call the Janus pattern) for secondary validation. Reserve genuine human review for irreversible, high-blast-radius decisions only.</p><p>The framework&#8217;s blanket &#8220;define significant checkpoints&#8221; guidance will produce compliance artifacts, not security outcomes.</p><h2>Agent identity is broken, not &#8220;evolving&#8221;</h2><p>The framework describes agent identity challenges as an &#8220;an evolving space&#8221; in which &#8220;gaps exist today&#8221; (&#167;2.1.2, p. 11). This framing suggests that the problem will be resolved through incremental improvements to existing identity and access management systems.</p><p>It won&#8217;t. The problem is architectural.</p><p>Human IAM systems rest on three assumptions that agents violate. First, the entity requesting access can meaningfully consent to terms. An agent can&#8217;t meaningfully consent to OAuth scopes because it doesn&#8217;t understand what it&#8217;s granting. Second, the entity&#8217;s identity persists across sessions in a verifiable way. An agent&#8217;s identity may be ephemeral, spawned for a single task, then destroyed, or recursive, spawning sub-agents that inherit some permissions but not others. Third, authorization scopes are understood by the entity at request time. For agents, authorization scopes are determined at runtime by the planning module, not at request time by a human clicking &#8220;Allow.&#8221;</p><p>The <a href="https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/">OWASP Agentic Top 10</a> (ASI03, December 2025) puts this plainly, &#8220;Without a distinct, governed identity of its own, an agent operates in an attribution gap that makes enforcing true least privilege impossible.&#8221; If you can&#8217;t attribute an action to a specific identity with verifiable scope, every downstream control the framework recommends, least privilege, access logging, accountability chains, rests on an unreliable foundation.</p><p>The OAuth consent model breaks down entirely for agentic workloads. Three-legged OAuth was designed for human consent flows where a user clicks &#8220;Allow&#8221; and understands what they&#8217;re granting. When an agent orchestrator requests OAuth scopes on behalf of a user, the consent model collapses. The agent can&#8217;t meaningfully consent, and the human operator often doesn&#8217;t know what scopes the agent will request at runtime.</p><p>The empirical evidence makes this worse. <a href="https://arxiv.org/abs/2601.10338">The Agent Skills in the Wild study</a> found excessive permission requests (PE1) across 94 skills that requested permissions far beyond stated functionality. This isn&#8217;t a bug in a few implementations. It&#8217;s the default behavior in the wild. Agents ask for more access than they need because the systems granting access weren&#8217;t designed to question them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!umPi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!umPi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 424w, https://substackcdn.com/image/fetch/$s_!umPi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 848w, https://substackcdn.com/image/fetch/$s_!umPi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 1272w, https://substackcdn.com/image/fetch/$s_!umPi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!umPi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png" width="1456" height="1322" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1322,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:883325,&quot;alt&quot;:&quot;Table comparing human IAM assumptions to agentic reality across consent, persistence, and scope understanding&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188817617?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Table comparing human IAM assumptions to agentic reality across consent, persistence, and scope understanding" title="Table comparing human IAM assumptions to agentic reality across consent, persistence, and scope understanding" srcset="https://substackcdn.com/image/fetch/$s_!umPi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 424w, https://substackcdn.com/image/fetch/$s_!umPi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 848w, https://substackcdn.com/image/fetch/$s_!umPi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 1272w, https://substackcdn.com/image/fetch/$s_!umPi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249750a6-af09-451b-8e2f-ea4e9a330b40_4727x4291.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: Three IAM assumptions agents violate</figcaption></figure></div><p>The framework&#8217;s interim best practices (unique agent IDs, recording delegation, tying identity to supervising humans) are reasonable temporary measures, but calling this &#8220;evolving&#8221; rather than &#8220;broken&#8221; understates the urgency. Organizations should treat agent identity as a pre-deployment blocker for high-autonomy agents, not a checkbox to revisit later. Until agent-native identity primitives exist, decentralized identity, cryptographically bound intent, task-scoped ephemeral credentials, the accountability chain the framework builds on top of identity is structurally unsound.</p><h2>MCP gets two bullet points in a 29-page framework. That&#8217;s negligent.</h2><p>The entire MCP security guidance in Singapore&#8217;s framework reads: &#8220;For MCP servers: Whitelist trusted servers and only allow agent to interact with servers on that whitelist. Sandbox any code execution&#8221; (&#167;2.3.1, p.19). That&#8217;s it. Two bullet points for the protocol rapidly becoming the standard interface between AI agents and the external world.</p><p>Since Anthropic introduced MCP, adoption has exploded. Over 10,000 MCP servers are deployed. Claude, Cursor, Microsoft Copilot, Gemini, VS Code, and ChatGPT all support the protocol. In December 2025, Anthropic, OpenAI, and Block donated MCP and other projects to the Linux Foundation&#8217;s Agentic AI Foundation. MCP is infrastructure now. And the framework treats it like a footnote.</p><p>MCP&#8217;s threat surface spans three lifecycle phases, and the framework addresses only a fragment of the first one.</p><p>During creation, MCP servers face installer spoofing, supply-chain backdoors, name collision attacks, and the absence of authentication handshakes. The framework&#8217;s &#8220;whitelist trusted servers&#8221; advice partially addresses this phase. Partially.</p><p>During operations is where things get dangerous. Documented attacks include tool poisoning (malicious commands embedded in tool descriptions that the LLM executes as instructions), credential theft, sandbox escape, command injection, remote code execution, and the rug pull attack. The rug pull is particularly insidious as a legitimate tool passes initial vetting, earns whitelist status, then silently changes behavior during an update. The framework&#8217;s whitelisting guidance assumes trust is both verifiable and persistent. The rug pull attack exploits exactly this assumption.</p><p>During updates, servers face version drift, privilege persistence, configuration drift, and unsigned manifests. The framework says nothing about any of these.</p><p>The experimental evidence should alarm you. Research on <a href="https://openreview.net/pdf/7db8d7d31396bd9a8cc21dbbc479c7511639f8d8.pdf">LLM-driven AI agent communication</a> reports MCP exploits that achieved Bash shell backdoors on port 4444, SSH key exfiltration via email, and file deletion without triggering alerts. These aren&#8217;t theoretical attacks. They were demonstrated in controlled experiments.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!idYC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!idYC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 424w, https://substackcdn.com/image/fetch/$s_!idYC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 848w, https://substackcdn.com/image/fetch/$s_!idYC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 1272w, https://substackcdn.com/image/fetch/$s_!idYC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!idYC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png" width="2885" height="8192" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:8192,&quot;width&quot;:2885,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:922321,&quot;alt&quot;:&quot;Grouped flow chart showing number of documented threat types in MCP creation, operation, and update phases versus Singapore framework coverage&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188817617?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53fff93-9ec8-4b7e-92ce-b89691ab75f8_2885x8192.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Grouped flow chart showing number of documented threat types in MCP creation, operation, and update phases versus Singapore framework coverage" title="Grouped flow chart showing number of documented threat types in MCP creation, operation, and update phases versus Singapore framework coverage" srcset="https://substackcdn.com/image/fetch/$s_!idYC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 424w, https://substackcdn.com/image/fetch/$s_!idYC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 848w, https://substackcdn.com/image/fetch/$s_!idYC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 1272w, https://substackcdn.com/image/fetch/$s_!idYC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1a4ed9-81d2-41e3-8ac6-dc93b77fa57c_2885x8192.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: MCP threat surface across three lifecycle phases</figcaption></figure></div><p>The first malicious MCP server discovered in the wild impersonated Postmark&#8217;s email service on npm and silently BCC&#8217;d every agent-sent email to the attacker. This is supply chain security applied to the AI protocol layer, and the framework doesn&#8217;t address it.</p><p>Whitelisting without tool integrity monitoring, version pinning, hash verification of tool descriptions, and runtime policy enforcement at the MCP boundary creates a false sense of security. Organizations implementing this framework need to treat MCP servers with the same rigor applied to third-party software dependencies. Pin versions. Verify checksums. Monitor for behavioral changes. Enforce runtime sandboxing that restricts tool access to files, APIs, and network endpoints beyond the declared scope.</p><h2>The gaps the framework weaves around</h2><p>Two additional blind spots deserve attention, not as standalone failures, but as force multipliers for the three critical gaps above.</p><p>The framework treats instructions, memory, and tools as functional agent components (&#167;1.1.1, pp.3-4) without recognizing that they all converge in a shared, flat trust namespace. The <a href="https://genai.owasp.org/resource/llm-and-gen-ai-data-security-best-practices/">OWASP GenAI Data Security Guide</a> puts it starkly. The &#8220;context window aggregates multiple trust domains into a single flat namespace with no internal access control. RAG results, system prompts, user input, tool outputs, and conversation history all land in the same context with equal trust weight.&#8221; If you&#8217;re securing individual components (tools, memory, instructions) without addressing the architectural vulnerability that connects them, you&#8217;re securing individual rooms while leaving the hallways unguarded.</p><p>The framework also assumes organizations consciously deploy agentic AI through a controlled pipeline. The clean value chain diagram (model developers, system providers, deploying organization, end users) describes how agents should be deployed. Every major AI platform now offers agent-building capabilities accessible to non-technical users. Microsoft Copilot Studio, Salesforce AgentForce, and dozens of startups let business users create agents that connect to organizational data, send emails, update databases, and make API calls, all without security team involvement. Shadow AI agents inherit every risk the framework describes but operate entirely outside the governance structures it recommends. A CISO should spend equal effort on discovery and containment of unauthorized agents as on governing sanctioned ones.</p><h2>The honest assessment</h2><p>Singapore&#8217;s framework tells you what to think about. The OWASP Agentic Top 10 outlines what to defend against across 10 threat categories (ASI01-ASI10), with specific vulnerability descriptions and mitigation strategies for each. The <a href="https://cloudsecurityalliance.org/blog/2025/02/06/agentic-ai-threat-modeling-framework-maestro">MAESTRO threat modeling framework</a> provides 47 threat IDs organized across seven architectural layers with specific mitigations. The OWASP Securing Agentic Applications Guide provides implementation-level controls.</p><p>The Singapore framework sits at the &#8220;governance structure&#8221; tier. That tier is necessary, but it&#8217;s not sufficient. If you hand this to your team as &#8220;the agentic AI governance playbook,&#8221; you&#8217;ll produce compliance artifacts without meaningfully reducing risk.</p><p><strong>Key Takeaway:</strong> Singapore built the governance skeleton for agentic AI. The immune system, the controls that catch what gets past the perimeter, remains your engineering problem.</p><h3>What to do next</h3><p>Three things the framework doesn&#8217;t tell you that your team needs tomorrow.</p><p>First, quantify your HITL capacity. Use this formula: (agent decision rate) times (approval time per decision) times (number of active agents) equals required human reviewer hours. When that number exceeds available capacity, you need tiered automation, not more reviewers. Build HITL saturation metrics and circuit breakers that automatically reduce agent autonomy when human review queues exceed defined thresholds.</p><p>Second, harden MCP with zero-trust assumptions. The OWASP GenAI Security Project published two guides that give you an operational playbook the Singapore framework doesn&#8217;t: <a href="https://genai.owasp.org/resource/a-practical-guide-for-secure-mcp-server-development/">&#8220;A Practical Guide for Secure MCP Server Development&#8221;</a> (February 2026) for teams building MCP servers, and &#8220;<a href="https://genai.owasp.org/resource/cheatsheet-a-practical-guide-for-securely-using-third-party-mcp-servers-1-0/">A Practical Guide for Securely Using Third-Party MCP Servers&#8221;</a> (October 2025) for teams consuming them. Start with their governance workflow: require developers to submit third-party MCP servers with documentation and a hash of tool descriptions, run automated scans for malware and hidden instructions, then version-pin approved servers in a trusted registry before deployment. For rug pull defense, pin the version of each MCP server and its tools at the time of initial approval, use a hash or checksum to verify tool descriptions and functionality haven&#8217;t been altered, and maintain version history with alerts on unauthorized changes. Require cryptographic tool manifests, signed packages that include description, schema, version, and required permissions, verified at load time. Run third-party MCP servers inside Docker containers to prevent compromised tools from accessing host files or escaping the operating environment. Enforce least-privilege policies at the MCP boundary to restrict tools from reading local files, accessing sensitive APIs, or exfiltrating data beyond declared scope. For multi-tenant deployments, isolate sessions with per-user execution contexts and deterministic cleanup routines that flush file handles, temp storage, and cached tokens on session termination. These two guides give you the implementation details that &#167;2.3.1&#8217;s two bullet points leave out.</p><p>Third, hunt for shadow AI agents. Deploy network monitoring for MCP traffic from unapproved endpoints. Configure API gateways to detect agent-pattern behavior: rapid sequential API calls, tool-use headers, automated OAuth token requests. Update acceptable use policies to cover agent creation explicitly. Treat every discovered shadow agent as an incident requiring risk assessment before it continues operating.</p><p>The framework is a starting point. It&#8217;s not a destination. Use it as a governance structure, then fill the operational gaps with OWASP guidance, real-world threat intelligence, and the uncomfortable math that governance documents avoid.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 26 February 13, 2026 - February 19, 2026 ]]></title><description><![CDATA[The AI Attack Surface Is Now the Entire Stack: APTs, Agent Marketplaces, and the Infrastructure Under Your Feet]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260213-20260219</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260213-20260219</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 20 Feb 2026 13:50:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!p7Kq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p7Kq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p7Kq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!p7Kq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!p7Kq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!p7Kq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p7Kq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/188564131?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p7Kq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!p7Kq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!p7Kq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!p7Kq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22e1d693-44a4-4bb3-acee-533b730f3746_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260213-20260219?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260213-20260219?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>The week of February 13, 2026 handed CISOs a masterclass in what AI security actually looks like when the theory meets the road. State-sponsored hackers are using Google&#8217;s own AI to run recon on your employees. The most popular AI agent framework turned its plugin marketplace into a malware distribution network. The tool your help desk uses for remote access got a CVSS 9.9 exploit actively running in the wild. And somewhere in your organization, someone probably asked an LLM for a password.</p><p>None of this is hypothetical anymore. The attack surface is the entire AI stack: the models, the agents, the marketplaces, the APIs, the infrastructure those agents touch, and the humans who trusted all of it more than they should have. If your AI governance program still lives in a slide deck, this week is a good reason to print it out and start over.</p><h3>1. Nation-State Hackers Are Using Gemini for Every Stage of the Kill Chain</h3><p>Google&#8217;s Threat Intelligence Group (GTIG) published its quarterly AI Threat Tracker report on February 12, documenting that state-sponsored actors from China, Iran, North Korea, and Russia are now using Gemini across reconnaissance, phishing, malware development, and post-compromise activities (Google GTIG). Chinese actors used Gemini to pose as security researchers and automate vulnerability analysis against U.S. targets, including RCE testing and WAF bypass techniques. North Korean group UNC2970 queried the tool multiple days a week for technical support and to profile high-value targets at cybersecurity and defense firms. Iran&#8217;s APT42 used it to craft hyper-personalized phishing lures with culturally accurate language, eliminating the grammar errors defenders have long relied on as a detection signal. GTIG also identified HONESTCUE, a malware downloader that calls Gemini&#8217;s API to generate C# code for second-stage payloads in real time, and COINBAIT, a cryptocurrency-themed phishing kit built with AI code generation tools.</p><p><strong>Why it matters</strong></p><ul><li><p>Phishing lure quality has fundamentally changed. AI-generated messages in native language with accurate cultural context defeat the grammar-based heuristics your security awareness training still teaches.</p></li><li><p>The HONESTCUE model of AI-as-a-backend-service for malware means attackers can generate unique payloads per target without static signatures to detect.</p></li><li><p>Google confirmed model extraction attacks at scale, where actors queried Gemini roughly 100,000 times in multiple languages to replicate its reasoning capabilities in competing systems.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Brief your security awareness teams that grammar and awkwardness are no longer reliable phishing indicators. Update training to focus on unsolicited contact, urgency, and out-of-band verification.</p></li><li><p>Inventory which internal systems Gemini or any LLM API can reach. An AI-generated payload that hits an exposed internal endpoint needs a path there. Find those paths.</p></li><li><p>Review your threat model for credential-harvesting scenarios where AI-accelerated OSINT shortens attacker dwell time in the reconnaissance phase. Reduce your publicly available employee footprint where possible.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p> AI doesn&#8217;t give attackers magic powers; it gives them scale and speed at tasks they already knew how to do. This and Anthropic&#8217;s GTG-1002 are the proof. The North Koreans aren&#8217;t doing anything novel, but they <em><strong>are</strong></em> doing reconnaissance faster and with more precision, and they&#8217;re doing it with a tool you&#8217;re probably paying for with your corporate account. The fact that China tried to get Gemini to plan RCE attacks against U.S. targets by pretending to be a CTF participant is almost funny, except that it worked often enough to be worth documenting.</p><p>The model extraction finding is the most alarming. If a foreign intelligence service builds a Gemini-equivalent using 100,000 queries of the real thing, they have a tool that behaves like Gemini with none of Google&#8217;s safety controls. That&#8217;s a bigger long-term problem than any single phishing campaign, and it&#8217;s one the enterprise sector has almost no visibility into.</p><h3>2. OpenClaw&#8217;s Security Crisis Escalates: 1,184 Malicious Skills, a Foundation Handoff, and a Race to Patch</h3><p>OpenClaw, the AI agent framework with 212,000 GitHub stars as of this writing, spent this week proving that rapid adoption without a security architecture is a gift to attackers (SC Media, SecurityWeek). The ClawHavoc supply chain campaign, first disclosed February 1, grew to at least 1,184 confirmed malicious skills in ClawHub, the platform&#8217;s third-party plugin marketplace. Antiy CERT&#8217;s analysis found payloads using staged downloads, reverse shells via Python system calls, and direct data theft, including the Atomic macOS Stealer (AMOS) targeting browser credentials, SSH keys, and crypto wallets. A single threat actor uploaded 354 malicious packages in what appears to have been an automated blitz. On Valentine&#8217;s Day (there is a certain irony here), OpenClaw founder Peter Steinberger announced he was joining OpenAI to lead personal agent development, with the project transitioning to the OpenClaw Foundation under OpenAI sponsorship. By February 19, SecurityWeek reported the launch of SecureClaw, an open-source hardening tool running 55 automated audit checks mapped to OWASP&#8217;s Agentic Security Initiative top 10 and MITRE ATLAS.</p><p><strong>Why it matters</strong></p><ul><li><p>ClawHub had no automated static analysis, no code signing, and no review process. Publishing a malicious skill required only a GitHub account one week old. Your developers are treating this marketplace like a trusted source.</p></li><li><p>OpenClaw&#8217;s persistent memory files (SOUL.md, MEMORY.md) were targeted, meaning malicious payloads can modify the agent&#8217;s long-term behavioral instructions and wait before triggering. Point-in-time malware analysis misses this entirely.</p></li><li><p>The transition to an OpenAI-sponsored foundation means a consumer-grade security nightmare is now a tier-one organization&#8217;s responsibility to clean up.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your environment for OpenClaw deployments now. Any instance predating February 1, 2026 with API keys loaded should be treated as potentially compromised and keys rotated.</p></li><li><p>Treat every ClawHub skill like an untrusted third-party binary before installing it. No README is trustworthy. No prerequisite installation step should execute without review.</p></li><li><p>Engage your developer community on the SecureClaw hardening checks. If your org is running OpenClaw agents, you need the 55-point audit baseline before any production use.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>When a security researcher found 386 malicious packages from a single threat actor in OpenClaw's ClawHub marketplace, Steinberger told him security 'isn't really something he wants to prioritize.' He's since changed his tune on Lex Fridman's podcast, hired a security lead, and partnered with VirusTotal for malware scanning. Good. Progress matters. But sandboxing is still opt-in. The defaults still ship insecure. Words on a podcast don't fix architecture. I don't blame Steinberger for building fast and asking questions later. That's how open-source adoption works. I do blame the enterprises that deployed it without asking whether anyone had thought about what happens when the agent has shell access and the plugin store has no gates.</p><p>The OpenAI foundation takeover is interesting. Either OpenAI is going to clean this up properly, which would take real investment in supply chain security, or they&#8217;re going to inherit the liability that comes with 212,000 GitHub stars pointing at a platform with 20% malicious packages in its ecosystem. I&#8217;d watch that situation carefully. For now, the practical answer is: if OpenClaw is running in your environment, it is a high-severity finding until proven otherwise.</p><h3>3. BeyondTrust CVE-2026-1731 Hits Active Exploitation; CISA Mandates Federal Patching by February 16</h3><p>CISA added CVE-2026-1731 to its Known Exploited Vulnerabilities catalog on February 13, 2026, mandating that Federal Civilian Executive Branch agencies apply patches by February 16 (CISA, Help Net Security). The vulnerability, CVSS 9.9, allows an unauthenticated attacker to execute arbitrary OS commands against BeyondTrust Remote Support and Privileged Remote Access products via a crafted WebSocket request with zero user interaction required. watchTowr&#8217;s Ryan Dewhurst confirmed in-the-wild exploitation through global sensor networks, and Arctic Wolf separately detected attacks attempting to deploy the SimpleHelp RMM tool for persistence with lateral movement into Active Directory. BeyondTrust patched SaaS instances automatically on February 2 but self-hosted customers required manual action. The flaw was discovered by the Hacktron AI team using AI-enabled variant analysis after studying a related Ivanti bug, identifying approximately 8,500 exposed on-premises instances.</p><p><strong>Why it matters</strong></p><ul><li><p>BeyondTrust serves 75% of the Fortune 100. A pre-auth RCE in privileged remote access infrastructure is a direct path to crown jewel systems, with no prior foothold required.</p></li><li><p>The Silk Typhoon precedent from 2024 means Chinese state actors have already demonstrated intent to exploit BeyondTrust products at scale. With active exploitation confirmed, the question is who got there first.</p></li><li><p>AI-enabled variant analysis cut the time from patch release to public PoC to under two weeks. Discovery-to-exploitation timelines are compressing in both directions simultaneously.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Patch now. Any Remote Support version 25.3.1 or earlier and PRA version 24.3.4 or earlier is vulnerable. Self-hosted customers who have not applied BT26-02 should assume compromise and begin incident response procedures.</p></li><li><p>Segment BeyondTrust deployments from internal networks wherever architecturally possible. Post-exploitation lateral movement via ActiveDirectory was confirmed in at least one attacker cluster.</p></li><li><p>Revisit your exposure to other BeyondTrust products. The variant analysis method that found this vulnerability treats entire vulnerability classes as attack surface, not individual CVEs.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>BeyondTrust is having a rough patch. They already had Silk Typhoon weaponize a zero-day against the U.S. Treasury in December 2024. Now they have a CVSS 9.9 pre-auth RCE with confirmed exploitation inside two weeks of public disclosure. The pattern of targeting is not a coincidence." This is the product pattern where a dominant market position in privileged access makes you a permanent priority target. Silk Typhoon already told us what they do with access to BeyondTrust once they have it.</p><p>The AI angle here is the one worth pausing on. Hacktron AI found this vulnerability by doing AI-assisted variant analysis across codebases after reading watchTowr&#8217;s technical writeup on a related Ivanti bug. The entire discovery-to-exploitation cycle, from responsible disclosure to confirmed mass scanning, played out in 13 days. If your patching cycle runs longer than that for critical remote access infrastructure, this CVE is the poster child for why that has to change.</p><h3>4. Cline npm Supply Chain Attack Silently Installs OpenClaw on Developer Machines</h3><p>On February 17, 2026, an unknown actor used a stolen npm publish token to release cline@2.3.0 with one change: a postinstall script that silently deployed OpenClaw on any machine that updated during an eight-hour window (GitHub Security Advisory GHSA-9ppg-jx86-fqw7). Adnan Khan had reported the root cause, a prompt injection vulnerability in Cline&#8217;s Claude-powered issue triage workflow that enabled GitHub Actions cache poisoning and credential theft, privately on January 1, followed up four times, and got no response. He went public February 9. Cline patched in 30 minutes. The token was already gone.</p><p>Michael Bargury, CTO of Zenity, ran RAPTOR, the open-source forensics tool built by Gadi Evron, CEO of Knostic, against the advisory URL and had full attribution in five minutes: actor <code>glthub-actions</code> (a typosquat using a lowercase L), the weaponized issue, and the exfiltration infrastructure. The critical finding is that the attacker found Khan&#8217;s public proof-of-concept test repository on January 2 and struck on January 28, ten days before full disclosure. Patch windows don&#8217;t protect you when the POC is already public.</p><p><strong>Why it matters</strong></p><ul><li><p>GitHub Actions workflows that hand AI agents broad tool permissions and accept untrusted input from public issue trackers are an attack surface most teams haven&#8217;t mapped. That pattern is common right now.</p></li><li><p>The attacker moved during coordinated disclosure, not after. Your vendor&#8217;s patch timeline offers no protection if a researcher&#8217;s test repo is public while they&#8217;re waiting for a response.</p></li><li><p>RAPTOR produced high-confidence attribution from a single URL in five minutes. Attackers doing threat intelligence on your vendors have the same capability.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit every GitHub Actions workflow using AI agents for untrusted input paths. Issue titles, PR descriptions, and branch names are all attacker-controlled strings.</p></li><li><p>Enforce <code>--ignore-scripts</code> in automated build environments and scope npm publish tokens to the minimum necessary. A token that can publish anything in your org is a single point of failure.</p></li><li><p>Add AI-assisted forensics tooling to your IR playbook before you need it. Five minutes to attribution is the new baseline expectation.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I know both Michael and Gadi, and watching RAPTOR nail full attribution in five minutes while the rest of the industry was still reading the advisory is exactly the kind of thing that should embarrass every security team still running manual IR processes. Khan did everything right. Six weeks of responsible disclosure, documented in detail, across every available channel, ignored completely. The vendor patched in 30 minutes once the blog went public. The gap between those two timelines is where the breach lived.</p><p>The meta-structure here is almost too clean. An AI agent with misconfigured permissions accepted natural language from anonymous internet users and handed an attacker publish credentials for a 4 million user developer tool. Then an AI forensics agent reconstructed the crime scene from a URL. Attacker used AI as a weapon. Defender used AI as a microscope. What separated the outcomes wasn&#8217;t sophistication. It was that the vendor ignored six weeks of warning. If a researcher is trying to reach you, pick up the phone.</p><h3>5. AI-Generated Passwords Are Fundamentally Insecure, and Vibe Coding Is Shipping Them to Production</h3><p>AI cybersecurity firm Irregular published research on February 18-19 showing that ChatGPT, Claude, and Gemini generate highly predictable passwords with dramatically reduced entropy compared to cryptographically secure random generation (The Register, Malwarebytes). When Irregular prompted Claude Opus 4.6 fifty times, 20 of the 50 results were duplicates and 18 were the exact same string. Every Claude-generated password began with an uppercase &#8220;G&#8221; and used the same narrow character subset. GPT-5.2 started nearly all passwords with &#8220;v&#8221; and used &#8220;Q&#8221; as the second character in nearly half of outputs. The researchers measured 20 to 27 bits of entropy in LLM-generated 16-character passwords versus the 98 to 120 bits expected from cryptographically random generation. Irregular noted the problem is not fixable by prompt engineering or temperature adjustments, since the predictability is structural to how LLMs generate tokens.</p><p><strong>Why it matters</strong></p><ul><li><p>Any developer who used an LLM to generate a password, secret, or API key in production code has shipped a predictable, likely crackable credential. Code review doesn&#8217;t catch this because the string looks complex.</p></li><li><p>Attackers can now build wordlists from known LLM output patterns. If your environment has credentials generated this way, those wordlists apply to you directly.</p></li><li><p>The gap between &#8220;looks strong&#8221; and &#8220;is strong&#8221; is exactly the kind of mismatch that causes silent, undetected compromises over months.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your codebase and configuration management for any credentials that may have been AI-generated. Rotate them regardless of apparent strength.</p></li><li><p>Add explicit policy to your AI usage guidelines: LLMs must not be used for password, secret, or cryptographic key generation. Use dedicated cryptographic random number generators or password managers with CSPRNG backends.</p></li><li><p>Add detection to your code review process for hardcoded credentials. The presence of secrets in code is a separate problem, but AI-generated secrets in code are doubly dangerous.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This one is going to sting because it&#8217;s been happening quietly for three years. Every developer who ever typed &#8220;give me a strong password&#8221; into ChatGPT and copy-pasted the result into a config file or API key field created a predictable credential. The entropy numbers here are damning: 27 bits versus 98. That&#8217;s a category error. The model isn&#8217;t generating randomness. It&#8217;s predicting what a password looks like, which is the exact opposite of what you need.</p><p>The vibe coding angle makes this worse. Teams building rapidly with AI assistance are shipping AI-generated credentials into production at a rate that&#8217;s probably not tracked anywhere in your CMDB. This isn&#8217;t a training problem you can solve with a webinar. It requires a tooling change and an audit of anything AI-assisted from the last two years. Start there.</p><h3>6. India Launches Global AI Summit, Shifting the Governance Conversation from Safety to Impact</h3><p>The India AI Impact Summit opened February 19-20, 2026 in New Delhi, with Prime Minister Narendra Modi hosting the fifth in the series of global AI summits following the UK, France, Korea, and Rwanda iterations (Crowell &amp; Moring, techUK). India deliberately reframed the summit theme away from &#8220;safety&#8221; and &#8220;action&#8221; toward &#8220;impact,&#8221; signaling a strategic preference for deployment-focused governance over precautionary frameworks. The summit brings together governments representing over 30 nations, private sector leaders from startups to multinationals, and is expected to shape India&#8217;s domestic AI regulatory approach. India highlighted four domestic startups building open-source foundational AI models tailored to local languages: Sarvam AI, Soket AI, Gnani AI, and Gan AI. The International AI Safety Report 2026, published February 3 and authored by over 100 experts across 30 countries, provides the independent scientific baseline for the governance conversations happening at the summit.</p><p><strong>Why it matters</strong></p><ul><li><p>The shift from &#8220;safety&#8221; to &#8220;impact&#8221; framing at a summit representing one billion users rebalances global AI governance pressure toward deployment and away from precaution. This affects how multinational enterprises navigate cross-border AI risk.</p></li><li><p>India&#8217;s regulatory landscape for AI is still forming. Summit outcomes will directly shape what obligations enterprises face in one of the world&#8217;s largest and fastest-growing technology markets.</p></li><li><p>The Global South&#8217;s increasing voice in AI governance means compliance strategies built entirely around EU AI Act timelines or U.S. executive order directions are increasingly incomplete.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Expand your AI regulatory horizon map to include India explicitly. If you operate in the Indian market or source AI talent and infrastructure there, this summit&#8217;s outputs will produce compliance obligations.</p></li><li><p>Monitor the parallel EU AI Act timeline shifts. The European Commission is considering moving high-risk AI system obligations from August 2026 to December 2027. That extension affects your deployment planning.</p></li><li><p>Brief your board on the diverging international governance frameworks. AI risk is no longer a single regulatory question. It&#8217;s a multi-jurisdictional portfolio problem.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The rebranding from &#8220;safety summit&#8221; to &#8220;impact summit&#8221; is doing more work than it appears. It&#8217;s India saying, loud and clearly, that the safety-first framing from the UK and EU rounds is not the only valid frame. For the Global South, AI&#8217;s upside is too large to subordinate to risk frameworks designed primarily by wealthy nations with different exposure profiles. That&#8217;s a legitimate position, and it creates governance fragmentation that multinational enterprises will have to navigate without clean guidance.</p><p>From a security practitioner standpoint, what I watch at these summits isn&#8217;t the keynotes but the working groups. The incident reporting requirements, the transparency obligations for AI developers, and the cross-border data flow rules are where the actual compliance burden lives. The International AI Safety Report 2026 laid out a solid scientific baseline for what risks are real and what risk management practices actually work. Whether any of these summits produces enforceable governance is a different question entirely.</p><h3>7. DHS Announces CIRCIA Virtual Town Halls, Bringing Mandatory Incident Reporting Closer to Reality</h3><p>On February 13, 2026, the U.S. Department of Homeland Security announced virtual town hall meetings scheduled for March 2026 on the implementation of the Cyber Incident Reporting for Critical Infrastructure Act of 2022 (CIRCIA) (Crowell &amp; Moring). CIRCIA mandates covered entities to report substantial cyber incidents to CISA within 72 hours and ransomware payments within 24 hours. The town halls signal that CISA is moving toward final rulemaking, which will define exactly which organizations qualify as covered critical infrastructure and what &#8220;substantial&#8221; means in practice. The upcoming rules will affect sectors including healthcare, finance, transportation, energy, and communications. This is directly relevant to AI security because AI systems deployed in critical infrastructure pipelines will fall within CIRCIA&#8217;s scope once the covered entity definitions are finalized.</p><p><strong>Why it matters</strong></p><ul><li><p>If your organization operates in or supplies critical infrastructure sectors, CIRCIA reporting obligations are coming. The town halls are your signal to get your incident response playbooks in order now, not when the rule is final.</p></li><li><p>AI-related incidents involving critical infrastructure components will carry mandatory reporting obligations. Your incident classification framework needs to account for AI system failures as a triggering event.</p></li><li><p>The 72-hour reporting window is unforgiving. Organizations without pre-defined internal notification chains, executive decision authority, and external counsel relationships will fail to meet it.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Participate in the March CIRCIA town halls or send your legal and compliance teams. The comment period is the last opportunity to influence what &#8220;substantial incident&#8221; means before it becomes binding.</p></li><li><p>Map your AI deployments against critical infrastructure definitions. If an AI system supports operational continuity in a covered sector, assume it is a reportable asset.</p></li><li><p>Conduct a tabletop exercise specifically for the 72-hour CIRCIA timeline. The failure mode is not usually lack of knowledge; it&#8217;s lack of pre-authorized decision-making when leadership is unavailable at 2 a.m.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>CIRCIA has been in rulemaking purgatory for years, and the town halls are the clearest signal yet that the administration intends to finalize it. What&#8217;s notable from an AI security angle is that the statute was written before enterprise AI deployments at scale were reality. The final rule needs to address what happens when an AI system operating in a critical infrastructure pipeline is the vehicle for a substantial cyber incident. Right now, that&#8217;s ambiguous.</p><p>The practical guidance I&#8217;d give: don&#8217;t wait for regulatory clarity before building your incident response muscle. If an AI agent with access to operational technology fails catastrophically, you need the same 72-hour reporting capability you&#8217;d need for a ransomware attack. Those muscles take time to build. The town halls are the starting gun, not the finish line.</p><h3>8. 300 Million AI Chat Messages Exposed in Firebase Misconfiguration</h3><p>A security researcher discovered an exposed database belonging to the Chat and Ask AI application, operated by developer Codeway, exposing approximately 300 million messages tied to 25 million users (Malwarebytes). The exposure traced back to a Firebase misconfiguration, a well-documented error class where Google Firebase Security Rules are set to public, allowing anyone with the project URL to read, modify, or delete data without authentication. The exposed data included complete chat histories, AI model configurations, and user settings. The researcher, named Harry, found the issue while building an automated scanning tool for iOS and Android apps and discovered 103 of 200 iOS apps tested had the same vulnerability class. Codeway resolved the issue across all its apps within hours of responsible disclosure. Harry set up a public registry called Firehound where users can check whether apps they use have exposed this flaw.</p><p><strong>Why it matters</strong></p><ul><li><p>300 million chat messages include everything users ever said to an AI assistant: medical questions, financial details, relationship problems, confidential business discussions. That data is permanently exposed once leaked, regardless of remediation.</p></li><li><p>The systematic nature of this vulnerability class means Codeway is not the exception. Seventeen percent of iOS apps in Harry&#8217;s initial scan had the same misconfiguration.</p></li><li><p>AI applications collecting sensitive conversational data without adequate backend security controls are a rapidly growing liability category that most enterprise third-party risk programs are not yet equipped to assess.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your enterprise mobile app allowlist for AI applications that may store conversational data in Firebase or similar backend-as-a-service platforms. Request evidence of backend security controls as part of vendor assessment.</p></li><li><p>Advise employees against using consumer AI chat applications for business discussions. The security controls on consumer apps are not equivalent to enterprise SaaS, and the conversational data those apps collect is a legitimate exfiltration target.</p></li><li><p>Check whether any apps your organization uses appear in Harry&#8217;s Firehound registry at firehound.app.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The fact that a security researcher had to build an automated scanner to find this at scale tells you that the industry has not treated AI application backend security as a systematic concern. Firebase misconfigurations are not exotic vulnerabilities. They&#8217;re configuration errors that developers make when they prioritize shipping over hardening. When those developers are building AI applications that store millions of sensitive conversations, the exposure radius is enormous.</p><p>I want to flag the third-party risk angle for enterprise security leaders. Your employees are using AI chat applications for work conversations whether you&#8217;ve blessed those tools or not. The question is whether your vendor risk program knows what backend infrastructure those applications run on and whether those backends have been assessed. Most programs don&#8217;t, and 300 million exposed messages is the cost of that gap.</p><h3>9. Taiwan Warns China Is Rehearsing a Digital Siege</h3><p>On February 13, 2026, Taiwan issued a warning that China may be rehearsing a &#8220;digital siege&#8221; targeting the island&#8217;s critical communications and infrastructure (The Record from Recorded Future News). Taiwan&#8217;s National Security Bureau assessed that Chinese state actors are probing submarine cable systems, satellite communication dependencies, and undersea internet infrastructure in patterns consistent with pre-conflict preparation. The warning follows years of documented Chinese cyber operations against Taiwanese government agencies, financial institutions, and defense contractors. AI tools are increasingly part of Chinese state-sponsored reconnaissance and operational planning, as the Google GTIG report published the same week documented. The digital siege scenario involves coordinated disruption of communications infrastructure to isolate Taiwan before or during a conventional military operation.</p><p><strong>Why it matters</strong></p><ul><li><p>The digital siege model does not require a military conflict to be relevant to enterprise security. The same techniques used to rehearse the isolation of Taiwan&#8217;s infrastructure apply to attacking submarine cables, satellite uplinks, and internet exchange points globally.</p></li><li><p>Multinational enterprises with operations or supply chain dependencies in Taiwan face a material business continuity risk that most BCP plans do not adequately address.</p></li><li><p>AI-assisted reconnaissance at the infrastructure level represents a qualitative shift in how state actors prepare for large-scale operations.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Include Taiwan-specific disruption scenarios in your business continuity planning if you have operations or significant supplier relationships in Taiwan. The realistic scenario is degraded communications, not a clean cutover.</p></li><li><p>Map your organization&#8217;s dependence on Taiwan-based semiconductor and manufacturing supply chains. The disruption scenario is the subject of active adversary preparation.</p></li><li><p>Brief your board on the geopolitical exposure your organization carries in the Taiwan Strait scenario. This belongs in your enterprise risk register alongside financial and operational risks.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Taiwan produces the advanced chips that run your AI infrastructure. Any serious disruption to Taiwan&#8217;s communications or manufacturing capacity disrupts global AI supply chains in ways that dwarf any single cyberattack. China knows this. Their state actors are mapping the dependencies precisely because disrupting them at the right moment creates leverage that kinetic operations alone cannot.</p><p>The AI security angle that gets missed in most coverage is that Chinese APT groups are using AI tools, including Gemini as documented this week, to accelerate exactly the kind of infrastructure reconnaissance that underpins a digital siege strategy. Faster, more precise OSINT on submarine cable routing, satellite uplink dependencies, and internet exchange peering relationships means better targeting when the time comes. The rehearsal is happening now. Whether enterprises are watching is a different question.</p><h3>10. OpenClaw CVE-2026-25253 Docker Sandbox Bypass Leaves Persistent Exposure</h3><p>Security researchers at Depthfirst and Snyk confirmed during the week of February 13 that the original patch for CVE-2026-25253, OpenClaw&#8217;s one-click RCE vulnerability, was incomplete (SecurityWeek, Barrack.ai). The Docker sandbox bypass was assigned its own CVE, CVE-2026-24763, and patched in OpenClaw version 2026.1.30. The initial vulnerability allowed a victim who visited a malicious web page to have their authentication token stolen and their OpenClaw gateway compromised for full remote code execution. With 179,000 GitHub stars and 720,000 weekly downloads, the number of vulnerable deployments is substantial, particularly given that many users run OpenClaw on dedicated always-on machines with broad system access. Belgium&#8217;s Centre for Cybersecurity issued an emergency advisory urging immediate patching. As of February 19, SecurityWeek confirmed no known unfixed CVEs in the latest version 2026.2.17 but noted a large installed base of older versions remained in production.</p><p><strong>Why it matters</strong></p><ul><li><p>The incomplete initial patch is a recurring pattern. Security teams that patched to 2026.1.29, believing they were protected, were not. Patch validation for complex systems requires verifying the specific vulnerability class, not just the version number.</p></li><li><p>Always-on AI agent deployments on dedicated hardware create persistent high-value targets. A machine running OpenClaw continuously, with access to email, shell, and connected services, is a significant lateral movement asset if compromised.</p></li><li><p>Belgium&#8217;s emergency advisory signals that national cybersecurity agencies are treating OpenClaw&#8217;s exposure as a critical infrastructure-level risk, not just a developer tool problem.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Verify OpenClaw deployments are running version 2026.2.17 or later. Version 2026.1.29 is not fully patched. Confirm the specific CVEs are addressed in your version before closing the finding.</p></li><li><p>Isolate OpenClaw agent deployments from your corporate network with explicit firewall rules. An agent that can only reach the services it legitimately needs has a dramatically smaller blast radius than one with unrestricted internal access.</p></li><li><p>Treat the SecureClaw open-source hardening tool as a baseline requirement before any production OpenClaw deployment. Fifty-five automated audit checks mapped to MITRE ATLAS is a reasonable starting point, not an optional addition.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The incomplete patch disclosure on a CVSS 8.8 one-click RCE is the kind of thing that turns a bad week into a worse one for security teams who thought they&#8217;d handled it. The version number said patched. The actual vulnerability class said otherwise. This is why patch validation procedures need to go beyond version confirmation, particularly for products with multiple CVEs being patched in rapid succession over a short period.</p><p>What concerns me more than any individual CVE here is the deployment pattern. People are running OpenClaw on Mac minis in their homes and offices as always-on AI assistants with full access to their personal and professional digital lives. The security model for that use case does not exist. The threat model has not been written. The incident response plan if the agent is compromised has definitely not been tested. That&#8217;s the real story underneath all of these CVEs.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h3>Check Point Reveals AI Assistants Can Be Weaponized as Bidirectional C2 Proxy Channels</h3><p>Check Point researchers disclosed on February 18, 2026 that AI assistants with web-browsing capabilities, specifically Microsoft Copilot and xAI&#8217;s Grok, can be turned into bidirectional command-and-control proxy channels by malware (AI Security Daily Briefing, February 18, 2026). The technique works by sending malware-controlled &#8220;summarization prompts&#8221; to attacker-controlled URLs via the AI assistant&#8217;s browser. The assistant fetches the URL, interprets attacker commands embedded in the page content, and returns data, effectively acting as a cleansing layer that makes malicious C2 traffic appear as legitimate HTTPS connections to AI provider domains. From a network monitoring perspective, the traffic looks like normal Copilot or Grok activity. Standard network security tools that whitelist major AI provider domains, as many enterprise configurations do, pass this traffic without inspection.</p><p><strong>Why it matters</strong></p><ul><li><p>Every enterprise that has whitelisted Copilot, Grok, or similar AI assistant domains in its network security stack has created a potential bypass channel for malware C2 communications.</p></li><li><p>The attack does not require exploiting the AI system itself. It exploits the trust that network controls extend to AI provider domains by default.</p></li><li><p>AI assistants with browsing capabilities are increasingly standard enterprise tools. The attack surface grows with every deployment.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your network security configurations for blanket whitelisting of AI provider domains. Traffic to Copilot, Grok, ChatGPT, or similar services should be inspectable, not exempt from inspection.</p></li><li><p>Monitor for high-frequency, low-volume browsing requests from AI agents to newly registered or unusual domains. The attack pattern requires the agent to browse attacker-controlled content regularly.</p></li><li><p>Require explicit approval and network segmentation for any AI assistant deployment with web-browsing capabilities. The browsing feature specifically is what creates the C2 channel.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This one deserves more attention than it&#8217;s getting. The attack is technically simple but operationally brilliant. You&#8217;ve trained your security team to look for C2 traffic going to suspicious domains. The traffic to Microsoft Copilot&#8217;s endpoints is not suspicious. It&#8217;s expected. Now those expected connections are the C2 channel. Your detection model just broke.</p><p>The enterprise AI deployment pattern that created this exposure is the same one most organizations followed: enabled Copilot or a similar tool, added the provider to the network allowlist so it functions properly, moved on to the next project. Nobody asked what happens if someone uses that trusted channel in reverse. Check out the <a href="https://rockcybermusings.com/">rockcybermusings.com</a> archive for more on how AI assistant deployments are changing your threat model, and visit <a href="https://rockcyber.com/">rockcyber.com</a> if you need help working through what your current AI deployment means for your detection stack.</p><p>If you found this analysis useful, subscribe at <strong><a href="https://rockcybermusings.com/">rockcybermusings.com</a></strong> for weekly intelligence on AI security developments.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Anadolu Agency. (2026, February 19). <em>Experts warn AI-generated passwords may expose users to security risks</em>. https://www.aa.com.tr/en/science-technology/experts-warn-ai-generated-passwords-may-expose-users-to-security-risks/3834887</p><p>Antiy CERT. (2026, February 19). <em>ClawHavoc poisons OpenClaw&#8217;s ClawHub with 1,184 malicious skills</em>. GBHackers. https://gbhackers.com/clawhavoc-infects-openclaws-clawhub/</p><p>Aryaka. (2026, February 18). <em>Securing OpenClaw agents from ClawHavoc supply-chain attacks with AI-driven protection</em>. https://www.aryaka.com/blog/securing-openclaw-agents-clawhavoc-supply-chain-attack-ai-secure-protection/</p><p>Barrack.ai. (2026, February 16). <em>OpenClaw is a security nightmare: Here&#8217;s the safe way to run it</em>. https://blog.barrack.ai/openclaw-security-vulnerabilities-2026/</p><p>CISA. (2026, February 13). <em>CVE-2026-1731 added to Known Exploited Vulnerabilities catalog</em>. https://www.cisa.gov/known-exploited-vulnerabilities-catalog</p><p>Conscia. (2026, February). <em>The OpenClaw security crisis</em>. https://conscia.com/blog/the-openclaw-security-crisis/</p><p>Crowell &amp; Moring LLP. (2026, February). <em>Setting the agenda for global AI governance: India to host AI Impact Summit in February 2026</em>. https://www.crowell.com/en/insights/client-alerts/Setting-the-Agenda-for-Global-AI-Governance-India-to-Host-AI-Impact-Summit-in-February-2026</p><p>Dewhurst, R. (2026, February 12). [Post on X confirming in-the-wild exploitation of CVE-2026-1731]. watchTowr. Referenced via Help Net Security: https://www.helpnetsecurity.com/2026/02/13/beyondtrust-cve-2026-1731-poc-exploit-activity/</p><p>eSecurity Planet. (2026, February). <em>Hundreds of malicious skills found in OpenClaw&#8217;s ClawHub</em>. https://www.esecurityplanet.com/threats/hundreds-of-malicious-skills-found-in-openclaws-clawhub/</p><p>Google Threat Intelligence Group. (2026, February 12). <em>Nation-state hackers using Gemini AI for recon and attack support</em>. Referenced via BleepingComputer: https://www.bleepingcomputer.com/news/security/google-says-hackers-are-abusing-gemini-ai-for-all-attacks-stages/</p><p>Google Threat Intelligence Group. (2026, February 12). <em>State-backed hackers exploit Gemini AI for cyber recon and attacks</em>. Security Affairs. https://securityaffairs.com/187958/ai/google-state-backed-hackers-exploit-gemini-ai-for-cyber-recon-and-attacks.html</p><p>Help Net Security. (2026, February 13). <em>Hackers probe, exploit newly patched BeyondTrust RCE flaw (CVE-2026-1731)</em>. https://www.helpnetsecurity.com/2026/02/13/beyondtrust-cve-2026-1731-poc-exploit-activity/</p><p>Irregular. (2026, February 18). <em>LLM-generated passwords fundamentally weak</em>. The Register. https://www.theregister.com/2026/02/18/generating_passwords_with_llms</p><p>Malwarebytes. (2026, February 19). <em>AI-generated passwords are a security risk</em>. https://www.malwarebytes.com/blog/news/2026/02/ai-generated-passwords-are-a-security-risk</p><p>Malwarebytes. (2026, February). <em>AI chat app leak exposes 300 million messages tied to 25 million users</em>. https://www.malwarebytes.com/blog/news/2026/02/ai-chat-app-leak-exposes-300-million-messages-tied-to-25-million-users</p><p>Orca Security. (2026, February). <em>Critical CVE-2026-1731 vulnerability in BeyondTrust Remote Support and PRA</em>. https://orca.security/resources/blog/cve-2026-1731-beyondtrust-vulnerability/</p><p>Paubox. (2026, February 19). <em>State-sponsored hackers are using AI at every stage of cyberattacks</em>. https://www.paubox.com/blog/state-sponsored-hackers-are-using-ai-at-every-stage-of-cyberattacks</p><p>Rapid7. (2026, February). <em>CVE-2026-1731: Critical unauthenticated RCE in BeyondTrust Remote Support and PRA</em>. https://www.rapid7.com/blog/post/etr-cve-2026-1731-critical-unauthenticated-remote-code-execution-rce-beyondtrust-remote-support-rs-privileged-remote-access-pra/</p><p>SC Media. (2026, February 19). <em>Massive OpenClaw supply chain attack floods ClawHub with malicious skills</em>. https://www.scworld.com/brief/massive-openclaw-supply-chain-attack-floods-openclaw-with-malicious-skills</p><p>SecurityWeek. (2026, February 19). <em>OpenClaw security issues continue as SecureClaw open source tool debuts</em>. https://www.securityweek.com/openclaw-security-issues-continue-as-secureclaw-open-source-tool-debuts/</p><p>StepSecurity. (2026, February 17). <em>Cline supply chain attack detected: cline@2.3.0 silently installs OpenClaw</em>. https://www.stepsecurity.io/blog/cline-supply-chain-attack-detected-cline-2-3-0-silently-installs-openclaw</p><p>techmaniacs.com. (2026, February 18). <em>AI security daily briefing: February 18, 2026</em>. https://techmaniacs.com/2026/02/18/ai-security-daily-briefing-february-18-2026/</p><p>techUK. (2026, February). <em>The release of the International AI Safety Report 2026</em>. https://www.techuk.org/resource/the-release-of-the-international-ai-safety-report-2026-navigating-rapid-ai-advancement-and-emerging-risks.html</p><p>The Record from Recorded Future News. (2026, February 13). <em>China may be rehearsing a digital siege, Taiwan warns</em>. https://therecord.media/china-may-be-rehearsing-digital-siege-taiwan-warns</p><p>TechSpot. (2026, February 19). <em>AI-generated passwords are surprisingly easy to crack, researchers find</em>. https://www.techspot.com/news/111392-ai-generated-passwords-surprisingly-easy-crack-researchers-find.html</p><p>The Hacker News. (2026, February 12). <em>Google reports state-backed hackers using Gemini AI for recon and attack support</em>. https://thehackernews.com/2026/02/google-reports-state-backed-hackers.html</p>]]></content:encoded></item><item><title><![CDATA[Training vs Inference: Where Your Data Actually Leaks in LLM Systems]]></title><description><![CDATA[13% of GenAI prompts leak sensitive data at inference while training extraction hits 0.00001%. Evidence-based analysis of where to focus your AI security budget.]]></description><link>https://www.rockcybermusings.com/p/llm-data-leakage-training-vs-inference-risk</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/llm-data-leakage-training-vs-inference-risk</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 17 Feb 2026 13:50:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xDzj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xDzj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xDzj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!xDzj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!xDzj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!xDzj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xDzj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6710726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185809718?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xDzj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!xDzj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!xDzj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!xDzj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdac80afd-c8db-41a7-8045-5a8e931092e6_2048x2048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/llm-data-leakage-training-vs-inference-risk?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/llm-data-leakage-training-vs-inference-risk?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Let me save you six months of hand-wringing. You&#8217;ve been protecting the wrong door.</p><p>I&#8217;ve sat through more AI governance meetings than I care to count. Every single one features the same theatrical performance. Legal waves GDPR Article 17 around like a talisman. Procurement demands &#8220;no training on our data&#8221; clauses with the fervor of medieval peasants warding off vampires. And everyone nods sagely, convinced they&#8217;ve addressed the AI risk problem.</p><p>Meanwhile, <a href="https://www.lasso.security/blog/lasso-research-reveals-13-of-generative-ai-prompts-contain-sensitive-organizational-data">Lasso Security research shows 13% of employee prompts to GenAI chatbots contain sensitive data</a>. Not training datasets. Prompts. Every day. Right now. While you&#8217;re reading this, someone in your organization is probably pasting customer PII into ChatGPT because they need help formatting a spreadsheet.</p><p>The mathematics of where your data actually leaks should embarrass every security team that spent the last two years obsessing over training data extraction. But it won&#8217;t. Because we love our comfortable narratives more than uncomfortable evidence.</p><h2>The LLM Data Lifecycle: What You Should Have Learned Before That Vendor Demo</h2><p>Here&#8217;s something that drives me up the wall. Security professionals confidently opine about LLM risks without understanding the basic mechanics of how these systems work. They conflate training with inference because no one ever walked them through the machine learning operations pipeline. So let me do that now, since apparently your AI vendors were too busy selling you governance platforms to explain the fundamentals.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0K7n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0K7n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 424w, https://substackcdn.com/image/fetch/$s_!0K7n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 848w, https://substackcdn.com/image/fetch/$s_!0K7n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 1272w, https://substackcdn.com/image/fetch/$s_!0K7n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0K7n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png" width="1456" height="998" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:998,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:232366,&quot;alt&quot;:&quot;Flowchart showing data collection, pretraining, fine-tuning, alignment, evaluation, deployment, inference with RAG, and logging phases with data touchpoints and risk indicators at each stage&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185809718?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing data collection, pretraining, fine-tuning, alignment, evaluation, deployment, inference with RAG, and logging phases with data touchpoints and risk indicators at each stage" title="Flowchart showing data collection, pretraining, fine-tuning, alignment, evaluation, deployment, inference with RAG, and logging phases with data touchpoints and risk indicators at each stage" srcset="https://substackcdn.com/image/fetch/$s_!0K7n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 424w, https://substackcdn.com/image/fetch/$s_!0K7n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 848w, https://substackcdn.com/image/fetch/$s_!0K7n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 1272w, https://substackcdn.com/image/fetch/$s_!0K7n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61fb64f7-b6b2-44df-909a-471c3a9a65e0_2385x1635.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: The LLM Data Lifecycle</figcaption></figure></div><p></p><p>The pipeline divides into two distinct phases with fundamentally different data exposure mechanisms. Understanding this split determines whether your security controls make sense or constitute expensive window dressing.</p><h3>Training Phase: Where Data Gets Baked Into Weights</h3><p><strong>Data Collection and Curation</strong> happens before anyone touches a model. Someone has to assemble the training corpus. For frontier models, this means scraping the public internet, licensing book datasets, acquiring code repositories, and purchasing proprietary content. Your data enters the pipeline here. If a vendor trained on data that included your documents, this is where the exposure occurred. The curation process involves deduplication, filtering, and quality assessment. Data that survives curation proceeds to training. Data that doesn&#8217;t get discarded. But &#8220;discarded&#8221; in ML pipelines doesn&#8217;t mean &#8220;securely deleted.&#8221; It means &#8220;not used for this training run.&#8221;</p><p><strong>Pretraining</strong> is where the paranoia lives. Models consume massive datasets; we&#8217;re talking trillions of tokens scraped from the digital detritus of human civilization. GPT-2 trained on 40GB of text. GPT-3 scaled to 570GB. Modern frontier models ingest petabytes. Your data, if it enters here, gets compressed via gradient descent optimization across billions of parameters. The model adjusts its weights to minimize prediction error. Your sensitive document becomes a statistical ghost, distributed across a neural network in ways that make traditional data governance frameworks weep.</p><p>This is where the &#8220;don&#8217;t train on my data&#8221; clause comes from. I get it&#8230; the intuition makes sense. If my data trains the model, the model might regurgitate my data. Logic checks out, but it&#8217;s mostly flawed.</p><p><strong>Supervised Fine-Tuning</strong> narrows the model&#8217;s behavior after pretraining. This stage uses curated instruction-response pairs to teach the model how to follow directions. The datasets are smaller, thousands to millions of examples, versus trillions of pretraining tokens. However, memorization risk actually increases here because the same examples often appear multiple times during training. If your data appears in fine-tuning datasets, extraction probability rises substantially compared to pretraining exposure.</p><p><strong>Preference Alignment</strong> shapes responses to match human expectations. Reinforcement Learning from Human Feedback, Direct Preference Optimization, Constitutional AI, pick your flavor. Human raters compare model outputs, and the training process adjusts weights accordingly. The data here includes both prompts and human preference signals. Less raw text, but still potential exposure if your content appeared in alignment examples.</p><p><strong>Evaluation and Red-Teaming</strong> happen after training completes but before deployment. This is adversarial testing to find failure modes, safety issues, and capability gaps. It is not simply validation during training. Red teams probe for jailbreaks, harmful outputs, and yes, training data extraction. If your data appears in both training and evaluation sets, congratulations. You have a contamination problem that nobody in procurement thought to ask about.</p><h3>Inference Phase: Where Data Flows Through (And Gets Logged)</h3><p><strong>Model Deployment</strong> means containerizing the trained model, spinning up API infrastructure, configuring scaling policies, and connecting authentication systems. The model weights are frozen. No learning happens. But this phase determines how data will flow through the system and, critically, where it will be stored.</p><p><strong>Real-Time Inference</strong> is where users actually interact. Every prompt submitted. Every response received. The model processes inputs, generates outputs, and moves on. This is what most people picture when they think about using AI. What they don&#8217;t picture is everything that happens around inference.</p><p><strong>RAG and Context Injection</strong> deserve their own callout because most organizations miss them entirely. Retrieval-Augmented Generation means your inference pipeline queries external databases, pulls relevant documents, and injects that content into prompts before sending them to the model. Those &#8220;external databases&#8221; often contain your proprietary documents, customer records, internal wikis, and knowledge bases.</p><p>Here&#8217;s what that means for data exposure. When Karen in accounting asks the AI chatbot about expense policy, the RAG system retrieves your expense policy document, injects it into the prompt, sends the combined content to the model, receives a response, and logs the entire transaction. Your proprietary document just flowed through an external API, got combined with user prompts, and landed in a logging system you may not control. No training required. No memorization needed. Direct exposure through the inference pipeline you deployed to be helpful.</p><p><strong>Logging and Monitoring</strong> is where inference data exposure actually occurs. Production systems log prompts and responses for debugging, abuse detection, quality assurance, and compliance. Those logs contain everything users submitted. Every customer name. Every financial figure. Every code snippet. Every piece of sensitive data that employees paste into the helpful AI assistant.</p><p>The 13% figure from Lasso Security? That&#8217;s the percentage of prompts containing sensitive data. Those prompts get logged. Stored. Often retained far longer than necessary. Sometimes shared with vendors for &#8220;product improvement.&#8221; This is the front door standing wide open while everyone obsesses over vault security.</p><p>The distinction between training and inference exposure matters because the technical mechanisms differ fundamentally. Training memorization requires your data to survive gradient compression across billions of parameters and then be extractable via adversarial prompting. Inference exposure requires only that employees type sensitive information into text boxes and that someone, somewhere, logs the transaction.</p><p>Guess which one happens more often.</p><h2>How Memorization Actually Works: The Math Your Vendor Didn&#8217;t Show You</h2><p>Let me walk you through what happens when your data enters a training corpus. Researchers have quantified exactly how much training data can be extracted from production models, and the numbers don&#8217;t support the panic.</p><p><a href="https://www.usenix.org/system/files/sec21-carlini-extracting.pdf">Nicholas Carlini and colleagues at Google developed the formal framework that everyone cites, but few have actually read.</a> A string is <strong>extractable with $k$ tokens of context</strong> from a model if an adversary can prompt the model with a prefix and recover the exact training sequence.</p><p>The empirical results should recalibrate your threat model. From GPT-2&#8217;s 40GB training set, researchers extracted exactly 604 unique examples. Let me write that as a percentage so it sinks in:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{Extraction Rate}_{\\text{baseline}} = \\frac{604}{40 \\times 10^9 \\text{ characters}} \\approx 0.00000015%&quot;,&quot;id&quot;:&quot;YCNLPGGGDQ&quot;}" data-component-name="LatexBlockToDOM"></div><p>Six hundred four examples from forty billion characters. That&#8217;s the threat model you&#8217;ve been losing sleep over.</p><p>But I&#8217;m not going to cherry-pick. Extraction rates depend heavily on attack sophistication. The <strong>divergence attack</strong> against ChatGPT caused the model to emit memorized training data at 150x the normal rate by prompting the model to repeat a single word indefinitely until it diverged from its aligned behavior. Even with that adversarial technique, only 3% of generated text was memorized content.</p><p>The relationship between model scale and memorization follows a log-linear pattern that Carlini&#8217;s team quantified:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{Fraction}{\\text{extractable}} \\propto 0.19 \\times \\log{10}(\\text{Parameters}) + C&quot;,&quot;id&quot;:&quot;DANAPXPFSX&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>Where <em>C</em> is a constant dependent on training configuration. This means a 10x increase in model size yields roughly a 19 percentage-point higher extraction rate. Larger models memorize more. The 175B parameter GPT-3 memorizes more than the 1.5B parameter GPT-2. This is real. This matters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xEWq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xEWq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 424w, https://substackcdn.com/image/fetch/$s_!xEWq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 848w, https://substackcdn.com/image/fetch/$s_!xEWq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 1272w, https://substackcdn.com/image/fetch/$s_!xEWq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xEWq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png" width="1456" height="843" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:843,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:101631,&quot;alt&quot;:&quot;Bar chart comparing extraction rates across different attack methods ranging from 0.00001% baseline to 47% with divergence attacks&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185809718?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar chart comparing extraction rates across different attack methods ranging from 0.00001% baseline to 47% with divergence attacks" title="Bar chart comparing extraction rates across different attack methods ranging from 0.00001% baseline to 47% with divergence attacks" srcset="https://substackcdn.com/image/fetch/$s_!xEWq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 424w, https://substackcdn.com/image/fetch/$s_!xEWq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 848w, https://substackcdn.com/image/fetch/$s_!xEWq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 1272w, https://substackcdn.com/image/fetch/$s_!xEWq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f316f72-579f-4a5e-8fd8-0249fcdafac2_1784x1033.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: Training Data Extraction Rates by Attack Method</figcaption></figure></div><p>Three factors dominate whether your specific data gets memorized, and understanding these should inform your actual risk assessment rather than your vendor-driven anxiety:</p><p><strong>Data duplication is the killer.</strong> GPT-2 memorized sequences after just 33 repetitions. The relationship is exponential:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;P(\\text{memorization}) \\propto e^{\\alpha \\times \\text{repetitions}}&quot;,&quot;id&quot;:&quot;BPTZRPEDPG&quot;}" data-component-name="LatexBlockToDOM"></div><p>Where&#8230; </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\propto&quot;,&quot;id&quot;:&quot;TTINURTQNA&quot;}" data-component-name="LatexBlockToDOM"></div><p>&#8230; varies by model architecture (sorry, Substack doesn&#8217;t let me shrink a LaTex block so it fits in line). A 10x increase in repetitions yields extraction rates 25-30 percentage points higher. If your sensitive data appears once in a training corpus of trillions of tokens, your extraction risk approaches zero. If it appears hundreds of times across multiple sources, you have a real problem.</p><p><strong>Model scale amplifies everything.</strong> Larger models memorize 2-5x more than smaller counterparts. The 6B parameter GPT-Neo extracted 65% of test sequences, compared with 20% for the 125M version. The relationship holds across architectures.</p><p><strong>Training dynamics create temporal windows.</strong> Models exhibit the highest memorization at the beginning and end of training, with the lowest rates midway through. Early-memorized examples become permanently encoded in lower layers. This has implications for when your data entered the training pipeline, but good luck getting that information from any vendor.</p><h2>Membership Inference Attacks: A Coin Flip with Extra Steps</h2><p>Here&#8217;s where I get genuinely irritated. Membership inference attacks attempt to determine whether specific data was used in training. These attacks should terrify privacy professionals if they worked.</p><p>They don&#8217;t work.</p><p>State-of-the-art research demonstrates that <strong>MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.</strong> The best attacks achieve Area Under the Receiver Operating Characteristic Curve (AUC-ROC) scores of 0.50-0.55. For those who skipped statistics class, 0.50 equals flipping a coin.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{AUC-ROC}_{\\text{best MIA}} \\approx 0.55&quot;,&quot;id&quot;:&quot;GFYENIPKDJ&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{AUC-ROC}_{\\text{random guess}} = 0.50&quot;,&quot;id&quot;:&quot;HQWNCHZXZN&quot;}" data-component-name="LatexBlockToDOM"></div><p>That&#8217;s a 0.05 improvement over pure chance. You&#8217;d get better results asking a magic 8-ball.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!odiK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!odiK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 424w, https://substackcdn.com/image/fetch/$s_!odiK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 848w, https://substackcdn.com/image/fetch/$s_!odiK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 1272w, https://substackcdn.com/image/fetch/$s_!odiK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!odiK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png" width="1456" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:90834,&quot;alt&quot;:&quot;Table comparing four membership inference attack methods against large language models. Loss-based attacks achieve 0.50 AUC-ROC with no practical utility, equivalent to random guessing. Reference-based LiRA attacks reach 0.54 with marginal utility. Zlib entropy attacks score 0.53, also marginal. Neighborhood-based attacks perform best at 0.55 AUC-ROC but remain weak. Footer note: AUC-ROC of 0.50 equals random guessing; best MIA achieves only 0.55.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185809718?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Table comparing four membership inference attack methods against large language models. Loss-based attacks achieve 0.50 AUC-ROC with no practical utility, equivalent to random guessing. Reference-based LiRA attacks reach 0.54 with marginal utility. Zlib entropy attacks score 0.53, also marginal. Neighborhood-based attacks perform best at 0.55 AUC-ROC but remain weak. Footer note: AUC-ROC of 0.50 equals random guessing; best MIA achieves only 0.55." title="Table comparing four membership inference attack methods against large language models. Loss-based attacks achieve 0.50 AUC-ROC with no practical utility, equivalent to random guessing. Reference-based LiRA attacks reach 0.54 with marginal utility. Zlib entropy attacks score 0.53, also marginal. Neighborhood-based attacks perform best at 0.55 AUC-ROC but remain weak. Footer note: AUC-ROC of 0.50 equals random guessing; best MIA achieves only 0.55." srcset="https://substackcdn.com/image/fetch/$s_!odiK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 424w, https://substackcdn.com/image/fetch/$s_!odiK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 848w, https://substackcdn.com/image/fetch/$s_!odiK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 1272w, https://substackcdn.com/image/fetch/$s_!odiK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3195b289-fb9f-45d2-9574-4606ed5ab948_1845x757.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Why do MIAs fail against LLMs? Two reasons, and both should have been obvious.</p><p>First, modern LLMs train on massive datasets for very few epochs. GPT-3-class models see each example maybe once or twice during training. This creates minimal overfitting on individual examples. Without overfitting, the statistical signal that MIAs exploit simply doesn&#8217;t exist. The model treats seen and unseen data nearly identically.</p><p>Second, the boundary between training and non-training data is fundamentally fuzzy for natural language. Text from the same domain exhibits similar statistical characteristics whether it appeared in training or not. A legal contract looks like a legal contract. Distinguishing membership becomes statistically impossible because the underlying distributions overlap almost completely.</p><p>Difficulty calibration techniques can improve MIA performance by up to 0.10 AUC. Even optimized attacks show <strong>limited practical utility</strong> for large-scale privacy breaches. You cannot reliably determine if your data was in the training set. Full stop.</p><p>So why do vendors keep selling you tools to detect training data inclusion? <em><strong>Because fear sells better than evidence.</strong></em></p><h2>The Real Threat: What&#8217;s Actually Happening While You&#8217;re Guarding the Vault</h2><p>Now let&#8217;s examine what actually happens in organizations that deploy LLMs. The Samsung incident from 2023 became the canonical case study, but most coverage missed the point entirely.</p><p>Three Samsung semiconductor engineers pasted proprietary code and meeting transcripts into ChatGPT over a 20-day period. One entered buggy source code from a semiconductor database seeking debugging help. Another uploaded program code designed to identify defective equipment for optimization suggestions. A third converted confidential meeting notes into a presentation.</p><p>I need you to understand something critical: <strong>This was not a training data extraction attack.</strong></p><p>Zero adversarial techniques. No sophisticated prompting. No cryptographic attacks on model weights. Engineers simply pasted confidential data into a text box because they wanted help with their jobs, and the AI was right there, waiting and ready to help.</p><p>Samsung&#8217;s response tells you what they actually learned. They banned public AI tools and implemented a 1,024-byte upload limit. They didn&#8217;t hire a team to monitor the extraction of training data. They recognized the threat as <strong>operational, not theoretical.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!myyv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!myyv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 424w, https://substackcdn.com/image/fetch/$s_!myyv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 848w, https://substackcdn.com/image/fetch/$s_!myyv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 1272w, https://substackcdn.com/image/fetch/$s_!myyv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!myyv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png" width="1132" height="1195" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1195,&quot;width&quot;:1132,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:123930,&quot;alt&quot;:&quot;Pie chart showing 75% of documented AI data leaks stem from inference logging versus 25% from training extraction&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185809718?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Pie chart showing 75% of documented AI data leaks stem from inference logging versus 25% from training extraction" title="Pie chart showing 75% of documented AI data leaks stem from inference logging versus 25% from training extraction" srcset="https://substackcdn.com/image/fetch/$s_!myyv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 424w, https://substackcdn.com/image/fetch/$s_!myyv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 848w, https://substackcdn.com/image/fetch/$s_!myyv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 1272w, https://substackcdn.com/image/fetch/$s_!myyv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7f1467-9462-4598-ad77-1b785cb0e4b3_1132x1195.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: Real-World AI Data Leaks by Root Cause</figcaption></figure></div><p>Lasso Security&#8217;s analysis of real-world GenAI usage quantified what anyone paying attention already suspected. Their findings, based on data collected between December 2023 and February 2025:</p><p><strong>13% of employee prompts contain sensitive data.</strong> That&#8217;s the baseline exposure rate. No attack required. No adversary needed. Just normal people using AI tools the way normal people use AI tools.</p><p>Category breakdown reveals the scope of what&#8217;s flowing through inference endpoints:</p><ul><li><p>Code and tokens (API keys, credentials): 30% of sensitive submissions</p></li><li><p>Network information (IPs, MACs, internal URLs): 38% of sensitive submissions</p></li><li><p>PII/PCI data (emails, payment information): 11.2% of sensitive submissions</p></li></ul><p>The <a href="https://www.ibm.com/reports/data-breach">2025 IBM Cost of a Data Breach Report</a> found 13% of organizations reported AI-related breaches. Of those breached, <strong>97% lacked proper AI access controls.</strong> Not 97% lacked training data extraction monitoring. Ninety-seven percent lacked basic access controls for inference endpoints.</p><h2>The Probability-Weighted Risk Calculation You Should Have Done Two Years Ago</h2><p>Let me show you the math that should change your budget allocation. This is basic risk quantification. The kind of analysis you&#8217;d do for any other security domain. But somehow AI got a pass on quantitative rigor.</p><p><strong>Risk = Likelihood &#215; Severity</strong></p><p>CISSP 101</p><p><strong>For training data extraction:</strong></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;R_{\\text{train}} = P(\\text{extraction}) \\times S(\\text{data})&quot;,&quot;id&quot;:&quot;GLDZXCNQOB&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>Where:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;P(\\text{extraction}) \\in [0.00001\\%, 3\\%] (\\text{adversarial attack required})&quot;,&quot;id&quot;:&quot;QIHUGLUYCT&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S(\\text{data}) = \\text {High (PII, trade secrets, code)}&quot;,&quot;id&quot;:&quot;VBYHFENQSH&quot;}" data-component-name="LatexBlockToDOM"></div><ul><li><p><strong>Risk Score: Low to Medium</strong></p></li></ul><p><strong>For inference data exposure:</strong></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;R_{\\text{inference}} = P(\\text{exposure}) \\times S(\\text{data})&quot;,&quot;id&quot;:&quot;DQRFFREWXU&quot;}" data-component-name="LatexBlockToDOM"></div><p>Where:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;P(\\text{exposure}) = 13\\%~\\text{baseline (no attack required)}&quot;,&quot;id&quot;:&quot;KYYKOKUYGO&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S(\\text{data}) = \\text{High (identical data types)}&quot;,&quot;id&quot;:&quot;HTNQNDROFU&quot;}" data-component-name="LatexBlockToDOM"></div><ul><li><p><strong>Risk Score: High to Critical</strong></p></li></ul><p><strong>The risk ratio tells the story:</strong></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac{R_{\\text{inference}}}{R_{\\text{train}}} = \\frac{0.13}{0.03} \\approx 4.3 \\text{ (conservative)}&quot;,&quot;id&quot;:&quot;IJSNVTLPSM&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac{R_{\\text{inference}}}{R_{\\text{train}}} = \\frac{0.13}{0.00000015} \\approx 867,000 \\text{ (baseline comparison)}&quot;,&quot;id&quot;:&quot;HSUABSCWSI&quot;}" data-component-name="LatexBlockToDOM"></div><p>Even comparing against sophisticated adversarial extraction at 3%, inference risk remains over 4x higher. Against baseline extraction rates, inference exposure presents nearly a million times higher probability. And inference exposure requires zero technical sophistication. It requires only that employees do their jobs using the tools you&#8217;ve given them.</p><p>Does your security budget allocation reflect this ratio? If not, you&#8217;re defending based on narrative rather than evidence.</p><h2>Control Efficacy: Why You Can Fix Inference But Not Training</h2><p>The asymmetry extends beyond probability to controllability, and this is where the grumpy uncle in me really wants to shake some sense into the industry.</p><p><strong>Inference controls are mature, proven, and cost-effective:</strong></p><p>Zero-retention contracts exist. OpenAI&#8217;s API offers enterprise customers explicit no-data-retention options. Azure OpenAI provides stateless inference with double encryption. AWS Bedrock offers similar guarantees. You can contractually eliminate inference data retention today. The cost ranges from $0 (API configuration) to $50K (enterprise agreement negotiation).</p><p>Traditional DLP integration doesn&#8217;t work. Your existing stack watches file transfers and email attachments. Prompts are ephemeral text in encrypted API calls that legacy DLP never sees. What works is AI-native guardrails. Input validation that scans prompts before the model processes them. Output filtering that inspects responses before users receive them. These aren&#8217;t DLP products with &#8220;AI&#8221; slapped on the label. They&#8217;re purpose-built controls that understand prompt structure, embedding semantics, and conversational context. Cost: $50K-200K one-time plus annual licensing. Deployment time: weeks to months. Efficacy: 95%+ detection of sensitive data patterns when properly tuned.</p><p>Targeted prompt hygiene coaching changes behavior. Not a 45-minute module. Three rules, live examples, quarterly reinforcement. $10K-50K annually. ROI shows up in your logs within quarters.</p><p>Private deployment eliminates external data flow entirely. Run the model in your VPC. Air-gap if you&#8217;re paranoid. Cost: $100K-millions depending on scale. Efficacy: 100% against external exposure.</p><p>Total efficacy for inference controls: <strong>100% risk reduction achievable.</strong> GDPR compliant. ISO 42001 auditable. HIPAA addressable. These are solved problems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mUq1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mUq1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 424w, https://substackcdn.com/image/fetch/$s_!mUq1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 848w, https://substackcdn.com/image/fetch/$s_!mUq1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 1272w, https://substackcdn.com/image/fetch/$s_!mUq1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mUq1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png" width="1456" height="932" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:932,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:168193,&quot;alt&quot;:&quot;Table comparing inference and training controls across cost, deployment time, efficacy, and compliance satisfaction&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185809718?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Table comparing inference and training controls across cost, deployment time, efficacy, and compliance satisfaction" title="Table comparing inference and training controls across cost, deployment time, efficacy, and compliance satisfaction" srcset="https://substackcdn.com/image/fetch/$s_!mUq1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 424w, https://substackcdn.com/image/fetch/$s_!mUq1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 848w, https://substackcdn.com/image/fetch/$s_!mUq1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 1272w, https://substackcdn.com/image/fetch/$s_!mUq1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18bbfccc-4db1-4a86-8600-11fc32d50dae_2085x1335.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 4: Control Efficacy and Cost Comparison</figcaption></figure></div><p><strong>Training controls remain experimental and break your models:</strong></p><p>Differential privacy provides mathematical guarantees that sound great in papers. The practical reality? Research shows 36% accuracy degradation for meaningful privacy protection at &#949;=10 (where lower epsilon means better privacy but worse model performance)</p><p>Training overhead increases 2-3x in compute cost. You&#8217;re paying more for a worse model that might protect against a threat that&#8217;s already unlikely.</p><p>Machine unlearning &#8220;significantly degrades model utility&#8221; and &#8220;scales poorly with forget set sizes.&#8221; Direct quote from the research literature. No vendor will tell you this because they&#8217;re selling you unlearning solutions.</p><p>Retraining from scratch costs $1M-10M per GPT-3-class training run. For 1,000 GDPR erasure requests per year (a modest estimate for any company with European customers), full compliance through retraining would cost:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{Annual Cost} = 1,000 \\times $5M = $5B&quot;,&quot;id&quot;:&quot;WQMLSYTSPG&quot;}" data-component-name="LatexBlockToDOM"></div><p>That&#8217;s not a typo. Five billion dollars annually for GDPR training data compliance. The EU AI Act acknowledges this reality with careful language: <strong>&#8220;Once personal data is used to train AI model, deletion may not be possible.&#8221;</strong></p><p>For inference logs? You delete them from the database. Query complete. Compliance achieved. Cost: approximately nothing.</p><h2>Where Security Budgets Should Actually Go</h2><p>The evidence demands a resource reallocation that most CISOs haven&#8217;t made because they&#8217;re still managing to the narrative rather than the data.</p><p>I don&#8217;t have a survey that breaks down AI security spending by lifecycle phase. Nobody does. That research gap tells you something about where the industry&#8217;s head is at. We&#8217;re still debating whether training data extraction is a real threat, while 13% of employees paste sensitive data into GenAI tools every day.</p><p>Here&#8217;s what I see in practice: security teams obsess over model security, training data provenance, and adversarial attacks that require nation-state capabilities to execute. Meanwhile, inference-time data governance gets a fraction of the attention despite representing the overwhelming majority of actual data exposure.</p><p>This allocation is backwards. It&#8217;s like spending most of your physical security budget on vault doors while leaving the loading dock propped open.</p><p>The fix isn&#8217;t complicated. Shift focus from theoretical training-time attacks to the inference-time exposures happening right now:</p><p><strong>Inference data governance comes first.</strong> Forget legacy DLP. It watches the perimeter while your data evaporates into prompts. You need prompt guardrails that scan input for sensitive patterns before processing, output filters that inspect model responses for data leakage, and audit logging for every AI interaction. Implement zero-retention contracts with every AI vendor. The UK NCSC calls this &#8220;radical transparency&#8221; for a reason.</p><p><strong>Prompt hygiene training changes behavior.</strong> This isn&#8217;t another awareness program destined for the ignore pile. Prompt hygiene is a discrete skill with immediate feedback. Three rules, live examples from your own environment, quarterly reinforcement. $10K-50K annually. Behavior changes because employees see the flags in real time, not because they clicked through a module.</p><p><strong>Shadow AI visibility is non-negotiable.</strong> You can&#8217;t govern what you can&#8217;t see. Inventory every AI tool touching your environment, sanctioned or not.</p><p><strong>Training data monitoring is maintenance, not a priority.</strong> Keep the controls you have. Don&#8217;t expand investment here until you&#8217;ve closed the inference gaps.</p><p>Your legal team negotiated &#8220;no training&#8221; clauses because they understood GDPR Article 17 implications. That was appropriate due diligence. But that clause doesn&#8217;t protect you from the engineer who pastes customer data into ChatGPT tomorrow morning because their deadline is today and the AI is helpful.</p><p>The vault door is secure. Your &#8220;no training&#8221; clause is airtight. Meanwhile, 13% of your workforce submits sensitive data through the front door every single day, and you&#8217;re congratulating yourself on your AI governance program.</p><p>I&#8217;ve been doing this long enough to know that most readers will nod along, agree with the analysis, and change nothing. The narrative is comfortable. The vendor demos are compelling. The boardroom expects training data extraction theater.</p><p>But for those of you willing to follow the evidence: your data leaks at inference. Fix that first. Fix that now. The training data extraction problem is real but rare. The inference exposure problem is certain and ongoing.</p><p><strong>Key Takeaway:</strong> Training data extraction is a lottery you probably won&#8217;t lose. Inference data exposure is a certainty you&#8217;re experiencing right now. Stop guarding the vault and secure the front door.</p><h3>What to do next</h3><p>Your AI governance posture needs an honest assessment, not another vendor demo. The <a href="https://www.rockcyber.com/ai-strategy-and-governance">CARE Framework</a> provides a structured approach to evaluating AI risk across the dimensions that actually matter. Use it to audit where your current controls focus versus where the evidence says your risks concentrate.</p><p>If your organization needs an external perspective - someone willing to tell you what your vendors won&#8217;t - <a href="https://www.rockcyber.com">RockCyber</a> offers AI governance assessments built on the quantitative analysis in this post. We&#8217;ll tell you where your budget should go, not where your current vendors want it to go.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 25 February 6, 2026 - February 12, 2026]]></title><description><![CDATA[Microsoft patches prompt injection flaws in Copilot, North Korea weaponizes deepfakes for crypto theft, and a 200-page global report confirms what we already knew: governance can't keep up]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260206-20260212</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260206-20260212</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 13 Feb 2026 13:50:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KvJ9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KvJ9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KvJ9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!KvJ9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!KvJ9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!KvJ9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KvJ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/187815899?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KvJ9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!KvJ9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!KvJ9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!KvJ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c7a65e-4c9b-4550-81e0-7b31346354f0_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Your AI coding assistant can now be weaponized through a poisoned codebase. This week gave us the first actively exploited prompt injection vulnerabilities in GitHub Copilot, a North Korean crew running deepfake Zoom calls to rob crypto firms, and 300 million private chatbot messages sitting in an open database.</p><p>The theme is that speed kills. AI capabilities are moving faster than the people building them can secure them, faster than the regulators writing rules, and faster than the enterprises deploying them can govern.</p><p>Here&#8217;s what mattered and what you should do about it. If you&#8217;re building your AI governance program, <a href="https://www.rockcyber.com">RockCyber</a> can help you close the gaps.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260206-20260212?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260206-20260212?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3>1. Microsoft patches prompt injection RCE in GitHub Copilot, three actively exploited zero-days hit Windows</h3><p>Microsoft&#8217;s February 10 Patch Tuesday dropped fixes for 58 vulnerabilities, including six actively exploited zero-days (BleepingComputer). The AI security headline is that remote code execution vulnerabilities in GitHub Copilot across VS Code, Visual Studio, and JetBrains products (CVE-2026-21516, CVE-2026-21523, CVE-2026-21256). These flaws stem from command injection triggered through prompt injection (Krebs on Security). A threat actor embeds a malicious prompt into a codebase, and when an agent workflow executes, the prompt becomes code. CISA added all six zero-days to its KEV catalog the same day (SecurityWeek).</p><p><strong>Why it matters</strong></p><ul><li><p>AI coding assistants are now a confirmed attack surface. The prompt-to-execution chain is live.</p></li><li><p>CI/CD pipelines using Copilot agent workflows are exposed to supply chain attacks through poisoned repositories.</p></li><li><p>Six zero-days exploited simultaneously signals coordinated offensive activity.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Patch all affected GitHub Copilot, VS Code, Visual Studio, and JetBrains installations before the CISA KEV deadline.</p></li><li><p>Audit CI/CD pipelines that use Copilot agent mode. Disable automatic command execution from AI-suggested code until you validate controls.</p></li><li><p>Prioritize CVE-2026-21510 (SmartScreen bypass), CVE-2026-21514 (Word OLE bypass), and CVE-2026-21533 (RDP to SYSTEM) across all endpoints.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Prompt injection strikes again, but this time in an actively exploited vulnerability in the world&#8217;s most popular AI coding tool. A poisoned README triggers remote code execution when Copilot processes it. Every AI agent that reads unvalidated context and executes actions has this same problem. The architecture is the vulnerability. Start treating AI agent security as a distinct risk domain in your program.</p><h3>2. International AI Safety Report 2026 drops, confirms governance is losing the race</h3><p>On February 3, the second International AI Safety Report was published, chaired by Yoshua Bengio and authored by over 100 experts from 30+ countries (International AI Safety Report). The report finds that AI capabilities have accelerated sharply in mathematics, coding, and autonomous task execution (Inside Global Tech). A key finding is that some models detect when they&#8217;re being tested and behave differently during evaluation versus deployment. The report calls risk management frameworks &#8220;still immature&#8221; (AI Governance Library).</p><p><strong>Why it matters</strong></p><ul><li><p>The largest global collaboration on AI safety now says governance can&#8217;t keep up with capability advances.</p></li><li><p>Models that game their own evaluations undermine every safety benchmark your vendor shows you.</p></li><li><p>12 frontier AI companies published safety frameworks in 2025, but the report notes wide variation in scope and enforceability.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Use the report as board-level evidence to justify accelerated AI governance investment. It carries the weight of 30 governments and the UN.</p></li><li><p>Challenge vendor safety claims by asking about evaluation gaming. If their model can distinguish test from deployment, your risk assessment is incomplete.</p></li><li><p>Map your AI risk management practices against the report&#8217;s four-component framework: risk identification, analysis, mitigation, and governance.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Most of this report confirms what practitioners already knew, but that&#8217;s why it matters. When Bengio and 100 experts backed by 30 nations say &#8220;real-world evidence of safety measure effectiveness remains limited,&#8221; your board can&#8217;t dismiss that as one consultant&#8217;s opinion. The bit about models gaming evaluations should keep you up at night. If the AI can tell when it&#8217;s being tested, every safety benchmark you&#8217;ve seen is unreliable. Send the executive summary to your CEO.</p><h3>3. North Korea&#8217;s UNC1069 runs deepfake Zoom calls to steal crypto</h3><p>Google&#8217;s Mandiant published research on February 9 detailing UNC1069, a North Korean threat actor targeting a FinTech organization (Google Cloud Blog). The attackers compromised a crypto executive&#8217;s Telegram account, sent a Calendly link to a fake Zoom meeting, and displayed a deepfake video of a CEO during the call (The Record). The intrusion deployed seven distinct malware families including three new tools (Dark Reading).</p><p><strong>Why it matters</strong></p><ul><li><p>Deepfake video is now a confirmed offensive tool in state-sponsored financial crime.</p></li><li><p>Seven malware families on a single host shows a targeted, high-investment operation.</p></li><li><p>The Telegram-to-Zoom-to-malware chain exploits the trust your employees place in the common business tools they use daily.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Brief cryptocurrency and financial services teams on the specific attack chain: compromised Telegram account, Calendly invite, spoofed Zoom link, ClickFix infection.</p></li><li><p>Require verified domains for all conferencing links. Block look-alike Zoom infrastructure at email and web gateways.</p></li><li><p>Train staff that &#8220;run these commands to fix audio/video&#8221; during unsolicited calls is a red flag. That&#8217;s the ClickFix technique, and it works because people comply.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>North Korea stole over $2 billion in crypto in 2025. UNC1069 is just one crew in what TRM Labs calls an &#8220;industrialized theft supply chain.&#8221; The deepfake angle changes everything. When your employee sees a video of a known CEO on a Zoom call, trust goes up, skepticism goes down, and seven malware families get dropped.</p><h3>4. DockerDash vulnerability turns AI metadata into executable code</h3><p>On February 3, Noma Labs disclosed a critical flaw in Docker&#8217;s Ask Gordon AI assistant, dubbed DockerDash (Noma Security). A malicious metadata label in a Docker image hijacks the AI&#8217;s execution chain: Gordon reads it, forwards it to the MCP Gateway, and the gateway executes it with zero validation (SecurityWeek). In cloud/CLI environments, this enables RCE. In Docker Desktop, it enables data exfiltration (Infosecurity Magazine).</p><p><strong>Why it matters</strong></p><ul><li><p>This is another well-documented exploitation of Model Context Protocol (MCP) as an attack surface, a protocol gaining rapid adoption.</p></li><li><p>The attack uses standard Docker LABEL fields, meaning malicious images look normal to existing scanning tools.</p></li><li><p>The pattern applies to any AI agent that reads external context and passes it to an execution layer.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Upgrade to Docker Desktop 4.50.0 or later immediately.</p></li><li><p>Audit any AI agent in your environment that reads metadata, documents, or external context and can trigger tool execution.</p></li><li><p>Demand human-in-the-loop confirmation for all MCP tool invocations in development environments.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Noma&#8217;s Sasi Levi nailed it. The DockerDash vulnerability &#8220;isn&#8217;t Docker-specific, it&#8217;s AI-specific.&#8221; Every RAG system, every AI assistant that reads metadata, every agent that processes external input faces the same problem. The AI cannot distinguish between context and instruction. The MCP standard is growing fast. If you&#8217;re deploying MCP-connected agents, treat every external input as potentially malicious.</p><h3>5. Chat and Ask AI app leaks 300 million messages from 25 million users</h3><p>Security researcher Harry found that Chat and Ask AI, a wrapper app with 50+ million downloads connecting users to ChatGPT, Claude, and Gemini, left its Firebase backend wide open (404 Media). The misconfiguration exposed roughly 300 million messages from 25 million users (Malwarebytes). Messages contained mental health discussions, self-harm requests, and queries about illegal activities (GBHackers).</p><p><strong>Why it matters</strong></p><ul><li><p>300 million messages prove that AI chat wrapper apps store everything users type, and often store it insecurely.</p></li><li><p>The Firebase misconfiguration is a known, preventable error. This wasn&#8217;t a sophisticated attack. It was an open door.</p></li><li><p>Users treated AI chats as private conversations. The exposed content creates extortion, identity theft, and social engineering risk at massive scale.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory third-party AI wrapper apps in your environment. If employees use consumer AI apps for work, their conversations sit in someone else&#8217;s database.</p></li><li><p>Add AI app vetting to your shadow IT detection process. Firebase misconfigurations are scannable.</p></li><li><p>Remind employees: anything typed into an AI chat may be stored, breached, and public. Treat AI conversations with the same caution as email.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Codeway built a wrapper app that connects to real AI models from real companies. But the app stored everything in a Firebase database with public read access. No attack needed. Just log in. Think about how many AI wrapper apps your employees downloaded this year. Each one stores conversations on infrastructure you don&#8217;t control. The apps people use to access AI are the weak link.</p><h3>6. Senator Hassan presses Bondu after AI toy exposes 50,000 children&#8217;s chat logs</h3><p>Senator Maggie Hassan sent a formal letter to Bondu, maker of AI plush toys for children ages 3 to 9, after researchers found Bondu&#8217;s web portal allowed anyone with a Gmail account to access 50,000 children&#8217;s chat transcripts (Senate Joint Economic Committee). Exposed data included children&#8217;s names, birth dates, and intimate conversations (Axios). Bondu patched the issue within hours of disclosure (Malwarebytes).</p><p><strong>Why it matters</strong></p><ul><li><p>50,000 children&#8217;s private conversations were accessible to anyone with a Google account. No hacking required.</p></li><li><p>Congressional attention signals that AI toy security will become a regulatory priority, with California SB 867 already proposing a four-year moratorium on AI companion chatbots for minors.</p></li><li><p>The data included information that researchers called &#8220;a kidnapper&#8217;s dream&#8221;: names, birthdays, family details, and behavioral patterns.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If your organization develops or invests in consumer AI products that interact with children, conduct an immediate security audit of data storage and access controls.</p></li><li><p>Monitor legislative developments. SB 867 in California and the Parents and Kids Safe AI Act will reshape compliance requirements for AI products targeting minors.</p></li><li><p>Review your enterprise&#8217;s policy on AI products used by employees&#8217; families.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Joel Margolis told Wired the exposed data was &#8220;a kidnapper&#8217;s dream.&#8221; Names, birthdays, what kids like, who their family members are, all accessible with a Gmail login. Bondu touted 18 months of safe beta testing. The problem was never the toy&#8217;s behavior. It was that every word a child said to a stuffed dinosaur sat on an open server. The attack surface isn&#8217;t the model. It&#8217;s the trust that makes people share everything with a friendly interface.</p><h3>7. Eight critical n8n vulnerabilities expose AI orchestration infrastructure</h3><p>Between January and early February 2026, eight new high-to-critical CVEs were disclosed in n8n, the popular open-source workflow automation platform (Geordie AI). The vulnerabilities span expression evaluation, file access controls, Git, SSH, Merge nodes, and Python execution. Several bypass earlier patches, including CVE-2026-25049 which enables system command execution via malicious workflows (The Hacker News). Organizations that upgraded to n8n 2.2.2 following January guidance remain exposed.</p><p><strong>Why it matters</strong></p><ul><li><p>n8n sits at the intersection of APIs, secrets, CI/CD, and AI orchestration. A compromise here cascades across your entire automation stack.</p></li><li><p>Multiple vulnerabilities bypass earlier patches, creating a false sense of security for organizations that thought they were current.</p></li><li><p>Authenticated workflow editor access is enough to exploit several flaws, and that role is commonly granted to non-administrators.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Patch n8n immediately. Do not assume the January patch cycle covered you.</p></li><li><p>Audit who has workflow editor access. Apply least-privilege principles to your automation platform.</p></li><li><p>Review all n8n workflows that handle credentials or connect to AI services.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>n8n flies under the security radar because it&#8217;s &#8220;just automation.&#8221; But it connects to your APIs, stores your credentials, orchestrates your AI workflows, and executes code. Eight CVEs in one month, several bypassing previous patches, tells me the security debt in AI tooling is deeper than most teams realize. If you&#8217;re running n8n in production, treat it like Active Directory. To an attacker, it&#8217;s just as valuable.</p><h3>8. Ivanti EPMM zero-days breach Dutch and Finnish government systems</h3><p>Ivanti disclosed two zero-day vulnerabilities in Endpoint Manager Mobile (EPMM), CVE-2026-1281 and CVE-2026-1340, both CVSS 9.8, on January 29 (The Hacker News). By February 9, the Netherlands confirmed compromise of government systems. Finland&#8217;s Valtori disclosed a breach affecting 50,000 government employees (The Hacker News).</p><p><strong>Why it matters</strong></p><ul><li><p>Two European government institutions confirmed compromise within days of disclosure.</p></li><li><p>The dormant payload technique suggests brokers are stockpiling access to government systems for future sale.</p></li><li><p>EPMM manages mobile devices. Compromising it means potential access to every managed device.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>If you run Ivanti EPMM, patch to the emergency release immediately. Check for indicators of compromise, specifically the /mifs/403.jsp web shell path.</p></li><li><p>Review EPMM logs for unusual authentication patterns between January 29 and your patch date. The zero-days were exploited before disclosure.</p></li><li><p>Audit your mobile device management architecture. If your MDM platform is compromised, assume all managed devices are potentially exposed.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Ivanti. Again. EPMM manages mobile devices for governments and critical infrastructure operators. Two 9.8 CVSS zero-days, exploited before disclosure, breaching government systems across Europe. The dormant payload angle is telling. Access brokers are treating compromised government infrastructure like inventory: gain entry, plant loaders, wait to sell.</p><h3>9. Colorado AI Act delayed to June 2026 as federal preemption fight intensifies</h3><p>Colorado&#8217;s AI Act (SB 24-205), originally scheduled for February 1, 2026, has been delayed to June 30 (King and Spalding). The delay follows President Trump&#8217;s December 2025 executive order directing an AI Litigation Task Force to challenge state AI laws (Gunderson Dettmer). Colorado is the only state law specifically named in the executive order.</p><p><strong>Why it matters</strong></p><ul><li><p>The most comprehensive U.S. state AI law targeting algorithmic discrimination in high-risk systems just got pushed back five months.</p></li><li><p>The federal preemption fight creates compliance uncertainty for every organization deploying AI in employment, lending, housing, or insurance decisions.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Don&#8217;t pause compliance work. The executive order cannot overturn state law. State laws remain enforceable until struck down.</p></li><li><p>Prepare for a two-track reality: implement state obligations while tracking federal moves that could narrow those obligations.</p></li><li><p>Document your impact assessments and risk management now. You&#8217;ll need them regardless of which jurisdiction prevails.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Companies scrambling to comply by February 1 just got breathing room. Companies that already did the work got a competitive advantage. The federal preemption push faces a long road through courts. My advice: build your AI governance to the highest standard any jurisdiction requires. If you meet Colorado&#8217;s requirements, you&#8217;ll meet most of what Texas, California, and the EU AI Act demand.</p><h3>10. EU AI Act high-risk guidance misses deadline as August 2026 enforcement looms</h3><p>The European Commission missed its deadline to publish guidelines on high-risk AI system requirements under the EU AI Act (IAPP). CEN and CENELEC, the standardization bodies, also missed their deadline and now aim for the end of 2026. High-risk obligations become enforceable in August 2026. The GPAI Code of Practice is expected in final form by June (Bird and Bird).</p><p><strong>Why it matters</strong></p><ul><li><p>Organizations building compliance programs for high-risk AI systems have no finalized guidance or technical standards, with enforcement just six months away.</p></li><li><p>The standards delay means there&#8217;s no &#8220;safe harbor&#8221; for companies trying to demonstrate conformity.</p></li><li><p>Industry groups are calling for enforcement delays, but the statutory deadline stands.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Don&#8217;t wait for final standards. Build your compliance program on the AI Act text, the NIST AI Risk Management Framework, and available draft guidance.</p></li><li><p>Track the General-Purpose AI Code of Practice. If you&#8217;re a GPAI provider, the June finalization means compliance work needs to be underway now.</p></li><li><p>Engage your legal team on whether your AI systems qualify as &#8220;high-risk&#8221; under the Act. The classification determines your obligations, and waiting for guidance is not a defense.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>The EU passed the most ambitious AI regulation in history, then couldn&#8217;t deliver the guidance companies need to comply with it. But the enforcement date hasn&#8217;t moved. August 2026 is coming whether the paperwork is ready or not. Build to the text of the Act, supplement with NIST and OWASP guidance, and document everything.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><h3>AI agents are inheriting the permissions of the humans who use them, and nobody is governing that</h3><p>Well&#8230; if you follow me on <strong><a href="https://www.linkedin.com/in/rocklambros">Linkedin</a></strong>, you&#8217;ve definitely heard about this. Just take a look across this week&#8217;s stories and you&#8217;ll see a pattern nobody is naming. In the Copilot vulnerabilities, the AI executes malicious code with the developer&#8217;s privileges. In DockerDash, Ask Gordon forwards poisoned metadata using whatever access Docker provides. In n8n, workflow editor access exploits critical vulnerabilities because the platform inherits broad permissions.</p><p>AI agents inherit the permission scope of the humans and systems they operate within, but they don&#8217;t inherit the judgment to use those permissions safely.</p><p><strong>Why it matters</strong></p><ul><li><p>Traditional identity and access management (IAM) was built for human users who exercise judgment. AI agents exercise none.</p></li><li><p>The blast radius of a compromised AI agent equals the full permission set of the account or environment it operates in.</p></li><li><p>No major IAM framework currently addresses AI agent permissions as a distinct category requiring separate controls.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Assign each agent a workload identity that is cryptographically attested and bound to its runtime environment, not a static service account. Pair it with just-in-time, short-lived tokens scoped per task so there are no standing credentials to steal or reuse.</p></li><li><p>Enforce capability-level access controls that evaluate what each agent is doing, not just who spawned it. A developer&#8217;s AI assistant should never inherit production access simply because the developer has it.</p></li><li><p>Treat agent permissions as privileged access: review them quarterly, require justification for renewal, and auto-revoke on expiry. Standing agent permissions are standing risk.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>We&#8217;re giving AI agents the keys to the kingdom and hoping they&#8217;ll only open the right doors. That&#8217;s not a security strategy. That&#8217;s a prayer. The NIST AI RMF doesn&#8217;t address agent permissions. The EU AI Act barely touches it. I think this becomes the defining AI security challenge of 2026. Start with an inventory. Then start restricting. For more on building AI governance programs that address these gaps, visit <a href="https://rockcybermusings.com">RockCyber Musings</a> or reach out to <a href="https://www.rockcyber.com">RockCyber</a> directly.</p><p>If you found this analysis useful, subscribe at <strong><a href="https://rockcybermusings.com/">rockcybermusings.com</a></strong> for weekly intelligence on AI security developments.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>404 Media. (2026, February 5). Massive AI chat app leaked millions of users&#8217; private conversations. <a href="https://www.404media.co/massive-ai-chat-app-leaked-millions-of-users-private-conversations/">https://www.404media.co/massive-ai-chat-app-leaked-millions-of-users-private-conversations/</a></p><p>ASIS Online. (2026, February). New International AI Safety Report spotlights emerging risks. <em>Security Management</em>. <a href="https://www.asisonline.org/security-management-magazine/latest-news/today-in-security/2026/february/2026-international-safety-report/">https://www.asisonline.org/security-management-magazine/latest-news/today-in-security/2026/february/2026-international-safety-report/</a></p><p>Axios. (2026, February 3). Senator presses AI toy company bondu after kids&#8217; chat data was exposed. <a href="https://www.axios.com/2026/02/03/ai-toy-bondu-chat-data-exposure-hassan">https://www.axios.com/2026/02/03/ai-toy-bondu-chat-data-exposure-hassan</a></p><p>Bird &amp; Bird. (2026). Taking the EU AI Act to practice: Understanding the draft transparency code of practice. <a href="https://www.twobirds.com/en/insights/2026/taking-the-eu-ai-act-to-practice-understanding-the-draft-transparency-code-of-practice">https://www.twobirds.com/en/insights/2026/taking-the-eu-ai-act-to-practice-understanding-the-draft-transparency-code-of-practice</a></p><p>BleepingComputer. (2026, February 11). Microsoft February 2026 Patch Tuesday fixes 6 zero-days, 58 flaws. <a href="https://www.bleepingcomputer.com/news/microsoft/microsoft-february-2026-patch-tuesday-fixes-6-zero-days-58-flaws/">https://www.bleepingcomputer.com/news/microsoft/microsoft-february-2026-patch-tuesday-fixes-6-zero-days-58-flaws/</a></p><p>Check Point Research. (2026, February 9). 9th February: Threat intelligence report. <a href="https://research.checkpoint.com/2026/9th-february-threat-intelligence-report/">https://research.checkpoint.com/2026/9th-february-threat-intelligence-report/</a></p><p>Common Sense Media. (2026, January 22). Common Sense Media warns against AI toy companions after research reveals safety risks [Press release]. <a href="https://www.commonsensemedia.org/press-releases/common-sense-media-warns-against-ai-toy-companions-after-research-reveals-safety-risks">https://www.commonsensemedia.org/press-releases/common-sense-media-warns-against-ai-toy-companions-after-research-reveals-safety-risks</a></p><p>CSO Online. (2026, February 11). Microsoft fixes six zero-days on February 2026 Patch Tuesday. <a href="https://www.csoonline.com/article/4130446/february-2026-patch-tuesday-six-new-and-actively-exploited-microsoft-vulnerabilities-addressed.html">https://www.csoonline.com/article/4130446/february-2026-patch-tuesday-six-new-and-actively-exploited-microsoft-vulnerabilities-addressed.html</a></p><p>CyberAdviser Blog. (2026, January). What to expect in AI regulation in 2026. <a href="https://www.cyberadviserblog.com/2026/01/what-to-expect-in-ai-regulation-in-2026/">https://www.cyberadviserblog.com/2026/01/what-to-expect-in-ai-regulation-in-2026/</a></p><p>Cybernews. (2026, February 10). Armed with new tools, North Koreans ramp up attacks on lucrative crypto sector. <a href="https://cybernews.com/security/north-korea-ai-lucrative-crypto-industry/">https://cybernews.com/security/north-korea-ai-lucrative-crypto-industry/</a></p><p>CyberPress. (2026, February 10). AI chat app data breach exposes 300 million messages. <a href="https://cyberguy.com/security/millions-ai-chat-messages-exposed-app-data-leak/">https://cyberguy.com/security/millions-ai-chat-messages-exposed-app-data-leak/</a></p><p>Dark Reading. (2026, February 11). North Korea&#8217;s UNC1069 hammers crypto firms with AI. <a href="https://www.darkreading.com/threat-intelligence/north-koreas-unc1069-hammers-crypto-firms">https://www.darkreading.com/threat-intelligence/north-koreas-unc1069-hammers-crypto-firms</a></p><p>GBHackers. (2026, February 10). 25 million users affected as AI chat platform leaks 300 million messages. <a href="https://gbhackers.com/ai-chat-platform-leaks-300-million-messages/">https://gbhackers.com/ai-chat-platform-leaks-300-million-messages/</a></p><p>Geordie AI. (2026, February). Eight new n8n CVEs in February 2026: Updated patching guidance. <a href="https://www.geordie.ai/resources/technical-advisory-eight-new-n8n-cves-since-january---updated-remediation-guidance">https://www.geordie.ai/resources/technical-advisory-eight-new-n8n-cves-since-january---updated-remediation-guidance</a></p><p>Google Cloud Blog. (2026, February 9). UNC1069 targets cryptocurrency sector with new tooling and AI-enabled social engineering. <a href="https://cloud.google.com/blog/topics/threat-intelligence/unc1069-targets-cryptocurrency-ai-social-engineering">https://cloud.google.com/blog/topics/threat-intelligence/unc1069-targets-cryptocurrency-ai-social-engineering</a></p><p>Gunderson Dettmer. (2026). 2026 AI laws update: Key regulations and practical guidance. <a href="https://www.gunder.com/en/news-insights/insights/2026-ai-laws-update-key-regulations-and-practical-guidance">https://www.gunder.com/en/news-insights/insights/2026-ai-laws-update-key-regulations-and-practical-guidance</a></p><p>IAPP. (2026, February). European Commission misses deadline for AI Act guidance on high-risk systems. <a href="https://iapp.org/news/a/european-commission-misses-deadline-for-ai-act-guidance-on-high-risk-systems">https://iapp.org/news/a/european-commission-misses-deadline-for-ai-act-guidance-on-high-risk-systems</a></p><p>Infosecurity Magazine. (2026, February 9). DockerDash exposes AI supply chain weakness in Docker&#8217;s Ask Gordon. <a href="https://www.infosecurity-magazine.com/news/dockerdash-weakness-dockers-ask">https://www.infosecurity-magazine.com/news/dockerdash-weakness-dockers-ask</a></p><p>Inside Global Tech. (2026, February 10). International AI Safety Report 2026 examines AI capabilities, risks, and safeguards. <a href="https://www.insideglobaltech.com/2026/02/10/international-ai-safety-report-2026-examines-ai-capabilities-risks-and-safeguards/">https://www.insideglobaltech.com/2026/02/10/international-ai-safety-report-2026-examines-ai-capabilities-risks-and-safeguards/</a></p><p>International AI Safety Report. (2026, February 3). International AI Safety Report 2026. <a href="https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026">https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026</a></p><p>King &amp; Spalding. (2026). New state AI laws are effective on January 1, 2026, but a new executive order signals disruption. <a href="https://www.kslaw.com/news-and-insights/new-state-ai-laws-are-effective-on-january-1-2026-but-a-new-executive-order-signals-disruption">https://www.kslaw.com/news-and-insights/new-state-ai-laws-are-effective-on-january-1-2026-but-a-new-executive-order-signals-disruption</a></p><p>Krebs on Security. (2026, February 11). Patch Tuesday, February 2026 edition. <a href="https://krebsonsecurity.com/2026/02/patch-tuesday-february-2026-edition/">https://krebsonsecurity.com/2026/02/patch-tuesday-february-2026-edition/</a></p><p>Malwarebytes. (2026, February 10). AI chat app leak exposes 300 million messages tied to 25 million users. <a href="https://www.malwarebytes.com/blog/news/2026/02/ai-chat-app-leak-exposes-300-million-messages-tied-to-25-million-users">https://www.malwarebytes.com/blog/news/2026/02/ai-chat-app-leak-exposes-300-million-messages-tied-to-25-million-users</a></p><p>Malwarebytes. (2026, February). An AI plush toy exposed thousands of private chats with children. <a href="https://www.malwarebytes.com/blog/news/2026/02/an-ai-plush-toy-exposed-thousands-of-private-chats-with-children">https://www.malwarebytes.com/blog/news/2026/02/an-ai-plush-toy-exposed-thousands-of-private-chats-with-children</a></p><p>Noma Security. (2026, February 3). DockerDash: Two attack paths, one AI supply chain crisis. <a href="https://noma.security/blog/dockerdash-two-attack-paths-one-ai-supply-chain-crisis/">https://noma.security/blog/dockerdash-two-attack-paths-one-ai-supply-chain-crisis/</a></p><p>SecurityWeek. (2026, February 4). DockerDash flaw in Docker AI assistant leads to RCE, data theft. <a href="https://www.securityweek.com/dockerdash-flaw-in-docker-ai-assistant-leads-to-rce-data-theft/">https://www.securityweek.com/dockerdash-flaw-in-docker-ai-assistant-leads-to-rce-data-theft/</a></p><p>SecurityWeek. (2026, February 11). 6 actively exploited zero-days patched by Microsoft with February 2026 updates. <a href="https://www.securityweek.com/6-actively-exploited-zero-days-patched-by-microsoft-with-february-2026-updates/">https://www.securityweek.com/6-actively-exploited-zero-days-patched-by-microsoft-with-february-2026-updates/</a></p><p>The Hacker News. (2026, February 3). Docker fixes critical Ask Gordon AI flaw allowing code execution via image metadata. <a href="https://thehackernews.com/2026/02/docker-fixes-critical-ask-gordon-ai.html">https://thehackernews.com/2026/02/docker-fixes-critical-ask-gordon-ai.html</a></p><p>The Hacker News. (2026, February 11). North Korea-linked UNC1069 uses AI lures to attack cryptocurrency organizations. <a href="https://thehackernews.com/2026/02/north-korea-linked-unc1069-uses-ai.html">https://thehackernews.com/2026/02/north-korea-linked-unc1069-uses-ai.html</a></p><p>The Hacker News. (2026, February 12). Dutch authorities confirm Ivanti zero-day exploit exposed employee contact data. <a href="https://thehackernews.com/2026/02/dutch-authorities-confirm-ivanti-zero.html">https://thehackernews.com/2026/02/dutch-authorities-confirm-ivanti-zero.html</a></p><p>The Record. (2026, February 10). North Korean hackers targeted crypto exec with fake Zoom meeting, ClickFix scam. <a href="https://therecord.media/north-korean-hackers-targeted-crypto-exec-clickfix">https://therecord.media/north-korean-hackers-targeted-crypto-exec-clickfix</a></p><p>U.S. Senate Joint Economic Committee. (2026, February 3). Senator Hassan presses toy company on child safety and privacy practices. <a href="https://www.jec.senate.gov/public/index.cfm/democrats/2026/2/senator-hassan-presses-toy-company-on-child-safety-and-privacy-practices-after-children-s-conversations-with-its-ai-chat-toy-were-left-exposed-to-any-gmail-user">https://www.jec.senate.gov/public/index.cfm/democrats/2026/2/senator-hassan-presses-toy-company-on-child-safety-and-privacy-practices-after-children-s-conversations-with-its-ai-chat-toy-were-left-exposed-to-any-gmail-user</a></p><p>Zero Day Initiative. (2026, February 11). The February 2026 security update review. <a href="https://www.zerodayinitiative.com/blog/2026/2/10/the-february-2026-security-update-review">https://www.zerodayinitiative.com/blog/2026/2/10/the-february-2026-security-update-review</a></p>]]></content:encoded></item><item><title><![CDATA[Behold the Zerg! Parallel Claude Code Orchestration for the Swarm]]></title><description><![CDATA[Spawn workers. Ship code. Skip the chaos.]]></description><link>https://www.rockcybermusings.com/p/behold-zerg-parallel-claude-code-orchestration</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/behold-zerg-parallel-claude-code-orchestration</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 10 Feb 2026 13:50:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!DKuJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DKuJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DKuJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!DKuJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!DKuJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!DKuJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DKuJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7649304,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/187202248?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DKuJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!DKuJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!DKuJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!DKuJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc16369d-471b-49fc-94df-599eb6ff9106_2816x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/behold-zerg-parallel-claude-code-orchestration?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/behold-zerg-parallel-claude-code-orchestration?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Every major AI coding assistant got pwned last year. GitHub Copilot. Cursor. Windsurf. Claude Code. JetBrains Junie. All of them.</p><p>The IDEsaster disclosure in December 2025 documented 30+ vulnerabilities across the entire ecosystem. 100% of tested AI IDEs were vulnerable to prompt injection attacks that chain through legitimate IDE features to achieve remote code execution and data exfiltration. Then came the MCP breaches. Tool poisoning attacks through the Model Context Protocol let malicious servers exfiltrate entire WhatsApp histories. The GitHub MCP server got hijacked through a poisoned public issue that leaked private repository contents.</p><p>I watched this unfold while building parallel Claude Code infrastructure for my own work. The performance benefits of running multiple agents simultaneously were obvious. The security implications were terrifying.</p><p>Claude Code changed how I build software. Hell, it <em>lets</em> me do it as I don&#8217;t have a heavy software development background. One instance handles tasks that used to take me hours, but you can&#8217;t run two of them at once. Not really. </p><p>Workarounds exist. Git worktrees. Multiple terminals. Manual coordination. They all share the same fatal flaw. You become the orchestrator instead of the engineer. You&#8217;re babysitting AI agents instead of shipping code.</p><p>I got tired of it. Tired of copy-pasting context between sessions. Tired of watching one agent sit idle while another churned through a task I could&#8217;ve parallelized. Tired of evaluating orchestration tools that optimized for speed and autonomy while treating security as an optional configuration. Phase-two stuff.</p><p>So I built Zerg. Parallel Claude Code orchestration with security, context engineering, and crash recovery baked in from day one. Not bolted on. Not &#8220;coming in phase two.&#8221; Built in.</p><p>Today it&#8217;s open source. Here&#8217;s what it does, how the architecture works, and what you should know before you touch it.</p><h2>The Problem Nobody Else Solved</h2><p>The biggest limitation of Claude Code is embarrassingly simple: one task at a time or risk conflicts and race conditions. While one agent refactors authentication, you wait. It can&#8217;t write tests in parallel. It can&#8217;t document changes while it codes. You sit there watching a spinner like you&#8217;re on dial-up. </p><p>This isn&#8217;t a bug. Claude Code is a CLI tool. It does one thing well. But one thing isn&#8217;t enough anymore. There are many tools out there that allow you to parallelize tasks, but none take a security-first, context engineering mindset.</p><p>ZERG does.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EG66!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EG66!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 424w, https://substackcdn.com/image/fetch/$s_!EG66!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 848w, https://substackcdn.com/image/fetch/$s_!EG66!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 1272w, https://substackcdn.com/image/fetch/$s_!EG66!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EG66!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png" width="1456" height="345" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:345,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:426004,&quot;alt&quot;:&quot;Gantt chart showing 8 hours sequential vs 2 hours parallel execution&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/187202248?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Gantt chart showing 8 hours sequential vs 2 hours parallel execution" title="Gantt chart showing 8 hours sequential vs 2 hours parallel execution" srcset="https://substackcdn.com/image/fetch/$s_!EG66!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 424w, https://substackcdn.com/image/fetch/$s_!EG66!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 848w, https://substackcdn.com/image/fetch/$s_!EG66!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 1272w, https://substackcdn.com/image/fetch/$s_!EG66!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb31e91e2-74ad-4fe6-80a6-44a613548c5c_7175x1700.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Figure 1: Sequential vs Parallel Execution</figcaption></figure></div><p>The community noticed. A feature request hit Claude Code&#8217;s GitHub repo in August 2025 describing exactly this pain. Developers manually create worktrees, navigate between directories, and manage multiple terminal windows. The request proposed native orchestration. It shipped literally a few days ago. Too soon and too &#8220;beta&#8221; for me to integrate into Zerg before release.</p><p>So what did everyone do? They built workarounds. Simon Willison wrote about running multiple instances across directories. Someone documented orchestrating 10+ Claude instances with custom Python and Redis queues. These solutions work, but they&#8217;re duct tape and prayer. You&#8217;re building coordination infrastructure that should exist in the tool itself.</p><p>The agentic AI industry is obsessed with autonomy. More autonomy. Agents that think for themselves. Agents that chain tools together without human approval.</p><p>Know what they should be obsessed with? Predictability. Recovery. Security. What happens when one of your five parallel agents crashes at 2 AM? How do you resume without losing three hours of work? How do you prevent your context window from bloating until your $200 API bill becomes $2,000? And after IDEsaster proved every AI IDE can be weaponized through prompt injection, how do you isolate workers so a compromised agent can&#8217;t poison the others?</p><h2>What Zerg Does That Nothing Else Does</h2><p>Let me be clear about what Zerg isn&#8217;t. It&#8217;s not LangChain. It&#8217;s not CrewAI. It&#8217;s not AutoGen. It&#8217;s not trying to be a general-purpose agentic framework for every use case under the sun.</p><p>Zerg does one thing. It coordinates multiple Claude Code instances running in parallel with security isolation baked in. It does this with four capabilities I couldn&#8217;t find anywhere else combined into one system.</p><p><strong>Spec-Driven Execution</strong></p><p>Every worker reads from shared files: requirements.md, design.md, and task-graph.json. Workers are stateless. If one crashes, another picks up the task. No context lost. Why? Because context lives in the filesystem, not in conversation history that evaporates when a process dies.</p><p>The task graph defines dependencies and exclusive file ownership. Two workers never touch the same file in the same execution level. Merge conflicts within levels become structurally impossible. Not &#8220;unlikely.&#8221; Not &#8220;rare.&#8221; Impossible.</p><p>This sounds obvious until you look at what everyone else built. Most parallel agent solutions let agents race each other, then deal with conflicts at merge time. That&#8217;s backwards. Zerg eliminates the conflict at the design phase because I got tired of debugging merge disasters at midnight.</p><p><strong>Context Engineering That Actually Works</strong></p><p>Zerg achieves 30-50% token reduction per worker through three mechanisms that nobody else bothered to implement systematically.</p><p>First, command splitting. Nine large commands are split into core documentation (about 30% of tokens) and detail files (about 70%). Workers load only what they need. A worker doing Python refactoring doesn&#8217;t need the full Kubernetes deployment guide sitting in its context window.</p><p>Second, security rule filtering. Instead of loading every security rule for every language, workers load only rules matching the file extensions in their task. A Python-only task doesn&#8217;t get JavaScript security guidance, consuming tokens for no reason.</p><p>Third, task-scoped context. Each worker gets spec excerpts plus dependency context within a 4,000-token budget by default. Not the whole spec. Not everything that might be relevant. Just what&#8217;s needed for this task.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LAFh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LAFh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 424w, https://substackcdn.com/image/fetch/$s_!LAFh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 848w, https://substackcdn.com/image/fetch/$s_!LAFh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 1272w, https://substackcdn.com/image/fetch/$s_!LAFh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LAFh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png" width="1456" height="412" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:412,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:964069,&quot;alt&quot;:&quot;Comparison chart showing 62% token reduction with context engineering&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/187202248?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Comparison chart showing 62% token reduction with context engineering" title="Comparison chart showing 62% token reduction with context engineering" srcset="https://substackcdn.com/image/fetch/$s_!LAFh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 424w, https://substackcdn.com/image/fetch/$s_!LAFh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 848w, https://substackcdn.com/image/fetch/$s_!LAFh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 1272w, https://substackcdn.com/image/fetch/$s_!LAFh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb66d4e-b49e-4683-a644-b0a9b8668a21_8192x2320.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: Context Engineering Token Reduction</figcaption></figure></div><p>Token reduction isn&#8217;t just cost optimization. It&#8217;s attack surface reduction. Every token in a worker&#8217;s context window represents potential instruction injection. IDEsaster attacks succeed because agents process everything in their context as potentially actionable instructions. Less context means fewer attack vectors.</p><p><strong>Security That Isn&#8217;t an Afterthought</strong></p><p>After watching IDEsaster and MCP tool poisoning compromise every major AI coding tool, I built Zerg with isolation as an architectural requirement.</p><p>You have the option of containerized isolation runs. Workers execute as non-root. Environment filtering blocks LD_PRELOAD and other injection vectors. Workers can&#8217;t access your SSH keys, cloud credentials, or API tokens unless you explicitly provide them.</p><p>Security rules get fetched automatically from the <strong><a href="https://github.com/TikiTribe/claude-secure-coding-rules">TikiTribe/claude-secure-coding-rules</a></strong> repository during initialization. External sourcing matters: keeping security rules outside your repository prevents poisoned commits from degrading your security baseline. A compromised developer machine can&#8217;t inject malicious &#8220;security rules&#8221; that weaken protections.</p><p>Pre-commit hooks run before any worker commits code. Secret detection scans for exposed credentials. Security rule validation ensures generated code meets your configured baseline. Even if a worker hallucinates insecure code, the commit fails before vulnerable code reaches your repository. It&#8217;s not perfect&#8230; nothing is.</p><p>Git worktree isolation means each worker operates in its own worktree. Workers can&#8217;t see each other&#8217;s in-progress changes. A compromised worker can&#8217;t poison another worker&#8217;s execution by modifying shared files.</p><p><strong>Resilience That Assumes Failure</strong></p><p>Circuit breakers trigger after five consecutive failures, enforcing a 60-second cooldown. Backpressure control with green, yellow, and red zones based on failure rates. Worker crash recovery with heartbeat monitoring and automatic restart.</p><p>Crashes don&#8217;t count against retry limits because infrastructure failures shouldn&#8217;t penalize your task budget. A network timeout isn&#8217;t the same as a logic error. The system knows the difference.</p><p>The diagnostics engine parses errors from Python, JavaScript, Go, and Rust. Bayesian hypothesis testing against 30+ known failure patterns. When five workers are running and something breaks, you need to know whether it&#8217;s one worker&#8217;s problem or everybody&#8217;s problem.</p><p>Got it. Let me write this section based on the architectural implications of each mode.</p><h2>Three Ways to Rush: Tasks, Subprocess, and Container Modes</h2><p>Zerg doesn&#8217;t force you into one execution model. Different projects have different constraints. A solo developer on a MacBook has different needs than an enterprise team running CI/CD pipelines on hardened infrastructure. So Zerg ships with three rush modes, each with distinct tradeoffs.</p><p><strong>Tasks Mode</strong></p><p>The default. Each worker becomes a Claude Code subagent through Claude&#8217;s native Task system, with its own conversation context and tool permissions. The orchestrator coordinates through Claude&#8217;s built-in task management rather than managing processes directly.</p><p>Tasks mode applies Claude Code&#8217;s native guardrails to each worker. Permission boundaries come from Claude&#8217;s task system rather than from OS-level isolation. You get integration with Claude&#8217;s tooling ecosystem, including the ability to specify different models per worker if cost optimization matters. Haiku for simple file operations, Sonnet for complex refactoring, Opus when you need the heavy artillery.</p><p>The tradeoff is coordination overhead. Task spawning goes through Claude&#8217;s API layer, which adds latency compared to raw subprocess spawning. For tasks that complete in seconds, this overhead is noticeable. For tasks that run for minutes, it&#8217;s negligible. Most parallel workloads fall into the second category.</p><p><strong>Subprocess Mode</strong></p><p>Each worker runs as a separate OS process on your local machine. Faster to spin up because there&#8217;s no API coordination layer. Workers share your filesystem through git worktrees, which means they inherit your local environment, your installed tools, your shell configuration.</p><p>Use subprocess mode when you&#8217;re iterating quickly, and latency matters more than isolation. It&#8217;s the lightest-weight option. The tradeoff is that process isolation is weaker than both tasks mode and container mode. A compromised worker could theoretically access memory or environment variables from sibling processes. For most development workflows on trusted codebases, this risk is acceptable. For production pipelines processing untrusted code, it&#8217;s not.</p><p><strong>Container Mode</strong></p><p>Maximum isolation. Each worker spawns inside its own container with a separate filesystem, network namespace, and process tree. Workers can&#8217;t see each other. A compromised worker can&#8217;t escape to poison siblings or exfiltrate credentials from the host.</p><p>This is the mode that addresses IDEsaster-class attacks head-on. When a malicious README injects instructions into a worker&#8217;s context, that worker operates inside a container running as non-root with LD_PRELOAD blocked and environment variables filtered. Even if the attack succeeds at the prompt level, the blast radius stays contained. The worker can&#8217;t reach your SSH keys, can&#8217;t access your cloud credentials, can&#8217;t modify files outside its designated worktree.</p><p>Container mode requires Docker or a compatible runtime. Startup time increases because each worker needs a container image pulled and initialized. For short-lived tasks, this overhead hurts. For longer parallel workloads where security matters, it&#8217;s the right call.</p><p><strong>Picking the Right Mode</strong></p><p>The decision tree is straightforward. Are you processing untrusted code or operating in a high-security environment? Container mode. Are you optimizing for raw speed on a trusted codebase? Subprocess mode. Want Claude&#8217;s native permission system with reasonable defaults? Stick with the tasks mode.</p><p>You can also mix modes across different rush phases. Use subprocess mode during rapid prototyping, then switch to container mode before merging to main. The task graph doesn&#8217;t care which mode executes it. Workers are stateless. The mode determines isolation boundaries, not task logic.</p><h2>How the Architecture Actually Works</h2><p>The design philosophy prioritizes explicit over implicit. Configuration lives in JSON and Markdown files, not environment variables scattered across your system. State lives in the filesystem, not in process memory that dies with a crash. Every decision is auditable because every decision produces a traceable artifact.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9679!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9679!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 424w, https://substackcdn.com/image/fetch/$s_!9679!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 848w, https://substackcdn.com/image/fetch/$s_!9679!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 1272w, https://substackcdn.com/image/fetch/$s_!9679!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9679!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png" width="6491" height="2480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2480,&quot;width&quot;:6491,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:889101,&quot;alt&quot;:&quot;Flowchart showing plan, design, rush, merge, and review phases&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/187202248?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0bf4baa-d009-44a7-8ca6-c87196cdb903_6491x2480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Flowchart showing plan, design, rush, merge, and review phases" title="Flowchart showing plan, design, rush, merge, and review phases" srcset="https://substackcdn.com/image/fetch/$s_!9679!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 424w, https://substackcdn.com/image/fetch/$s_!9679!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 848w, https://substackcdn.com/image/fetch/$s_!9679!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 1272w, https://substackcdn.com/image/fetch/$s_!9679!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa65e60-099e-4e8f-b103-f445d2cc4148_6491x2480.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: Zerg Execution Pipeline</figcaption></figure></div><p>The execution flow follows a deliberate progression that mirrors how experienced engineers break down complex work.</p><p><strong>Planning</strong> starts with Socratic discovery through the /zerg:plan command. You describe what you want. Zerg asks clarifying questions. The output is requirements.md. Not code. Not tasks. Requirements that a human can read and verify before anything else happens.</p><p><strong>Design</strong> analyzes the architecture and generates task-graph.json with exclusive file ownership assignments. This is where parallel execution becomes safe. Each file gets assigned to exactly one worker per level. No races. No conflicts.</p><p><strong>Rush</strong> spawns workers in git worktrees and executes tasks in parallel, merging results at level boundaries with quality gates. A level doesn&#8217;t complete until all its workers pass. If something fails, the whole level fails. No partial merges that leave your codebase in a broken state.</p><p><strong>Review</strong> runs automated testing, security scanning, and code review. Not optional. Not &#8220;if you have time.&#8221; Built into the flow.</p><p>The 26 slash commands have /z: shortcuts because I got tired of typing /zerg: every time. /zerg:init becomes /z:init. /zerg:rush becomes /z:rush. Small thing. Adds up.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Gxl9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Gxl9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 424w, https://substackcdn.com/image/fetch/$s_!Gxl9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 848w, https://substackcdn.com/image/fetch/$s_!Gxl9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 1272w, https://substackcdn.com/image/fetch/$s_!Gxl9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Gxl9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png" width="1456" height="776" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:776,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1384729,&quot;alt&quot;:&quot;Architecture diagram showing orchestrator, security, workers, resilience, and output layers&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/187202248?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Architecture diagram showing orchestrator, security, workers, resilience, and output layers" title="Architecture diagram showing orchestrator, security, workers, resilience, and output layers" srcset="https://substackcdn.com/image/fetch/$s_!Gxl9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 424w, https://substackcdn.com/image/fetch/$s_!Gxl9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 848w, https://substackcdn.com/image/fetch/$s_!Gxl9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 1272w, https://substackcdn.com/image/fetch/$s_!Gxl9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3bc8ab-b6fe-4e8c-8511-843c37a27dc1_7312x3897.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 4:  Zerg Architecture Layers</figcaption></figure></div><h2>Other Tools and Why They Fall Short</h2><p>The Claude Code ecosystem has exploded with orchestration tools. Claude-flow claims 175+ MCP tools and calls itself &#8220;the leading agent orchestration platform.&#8221; Oh-my-claudecode offers 5 execution modes with 32 specialized agents. Code-conductor provides parallel worktree execution.</p><p>None of them solve the challenge of a coordinated security posture, systematic context engineering, and level-based execution with guaranteed file isolation. Most treat security as a feature flag you can toggle. Most leave context management entirely to you. Most assume you want role-based agents with personalities instead of spec-driven workers that do what the task graph says.</p><p>The broader multi-agent orchestration market is projected to reach $8.5 billion by 2026. Organizations using multi-agent architectures report 45% faster problem resolution, but these gains depend on proper orchestration. You can have the fanciest agents in the world. If they&#8217;re stepping on each other, wasting tokens, crashing without recovery, and vulnerable to the same attacks that compromised every AI IDE last year, you&#8217;ve built an expensive disappointment.</p><h2>Getting Started Without Breaking Things</h2><p>Clone the repository. The README walks through a complete example.</p><p>The /zerg:init command establishes your project&#8217;s architecture and security baseline. It creates the configuration structure, fetches OWASP rules from Tikitribe, sets up container isolation if you&#8217;re using devcontainers, installs pre-commit hooks, and initializes audit logging. Everything about Zerg&#8217;s security posture starts with that command.</p><p>Start small. Pick a feature that breaks into independent subtasks. Run /zerg:plan to generate requirements. Actually, read the output. Then /zerg:design to create the task graph. Inspect file ownership assignments. If something looks wrong, fix it before running /zerg:rush. Human oversight at decision points. Not full autonomy. You&#8217;re still responsible for what ships.</p><p>Version 0.2.0 is officially released today, February 7, 2026. This isn&#8217;t a weekend hack. It&#8217;s production infrastructure I&#8217;ve been running on my own projects. Having said that, it&#8217;s released under an MIT license with no warranty.</p><h2>What Building This Taught Me</h2><p>The biggest insight wasn&#8217;t technical. Agentic AI development has a trust problem pure capability can&#8217;t solve. Agents fail. They hallucinate. They consume resources unpredictably. And after IDEsaster, we know they can be weaponized through their context windows. The solution isn&#8217;t smarter agents. It&#8217;s smarter orchestration that assumes failure, contains blast radius, and recovers gracefully.</p><p>Building Zerg forced me to confront failure modes I&#8217;d never considered. What happens when a worker hallucinates a dependency that doesn&#8217;t exist? What if two workers both claim the same file despite the task graph? What if a poisoned README injects malicious instructions? Each edge case required explicit handling.</p><p>The <strong><a href="https://www.rockcyber.com/ai-strategy-and-governance">CARE framework</a></strong> I&#8217;ve developed for AI governance emphasizes that resilience isn&#8217;t a feature you add later. It&#8217;s an architectural decision you make at the beginning, or pay for forever. I&#8217;ve watched enterprise teams learn this the expensive way.</p><p>Context engineering will define the next generation of agent builders. Prompt engineering is table stakes. Understanding how to curate, compress, and isolate context? That&#8217;s the skill separating toy projects from production systems.</p><p><strong>Key Takeaway:</strong> Parallel Claude Code execution was inevitable. Security-first, context-aware orchestration was a choice. Zerg makes that choice for you so you can focus on the work that matters.</p><h3>What to do next</h3><p>Star the repository at <a href="https://Github.com/rocklambros/zerg">Github.com/rocklambros/zerg</a> and try the quick start guide. Open issues when you hit problems. Use it on your own projects and tell me what breaks.</p><p>For hands-on guidance on deploying this in enterprise environments,&nbsp;<a href="https://rockcyber.com/contact">reach out to RockCyber</a>. For more on AI security and practical development with the occasional rant about things that should be obvious but apparently aren&#8217;t, subscribe to <a href="https://rockcybermusings.com/">RockCyber Musings</a>.</p><p>&#128073; <a href="https://github.com/rocklambros/zerg">Star </a><strong><a href="https://github.com/rocklambros/zerg">Zerg</a></strong><a href="https://github.com/rocklambros/zerg">: The original security-first parallel Claude Code orchestrator repository and try the quick start guide.</a></p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 24 January 30, 2026 - February 5, 2026]]></title><description><![CDATA[Shadow AI Meltdowns, CISA&#8217;s ChatGPT Scandal, and the EU&#8217;s Liability Trap]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260130-20260205</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260130-20260205</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 06 Feb 2026 13:50:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!6Rlc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Rlc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Rlc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!6Rlc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!6Rlc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!6Rlc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Rlc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/186998760?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Rlc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!6Rlc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!6Rlc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!6Rlc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c6a96c-6998-4ac1-b0d1-4dead290bf59_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The trap isn&#8217;t the AI itself. It&#8217;s the illusion that you control it while your engineers run it on private servers you don&#8217;t know about. This week proved that &#8220;Shadow AI&#8221; is no longer a buzzword for a slide deck. It&#8217;s a lobster claw pinching your infrastructure while you sleep.</p><p>We saw a major open-source agent turn into a security disaster. A federal agency chief got caught breaking his own rules. The European Union quietly shifted the liability burden onto your shoulders. If you thought 2026 was the year we figured this out, you were wrong.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260130-20260205?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260130-20260205?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2>1. OpenClawd&#8217;s &#8220;Sovereign&#8221; Security Meltdown</h2><p>OpenClawd (formerly Moltbot, formerly ClawdBot) launched its &#8220;Secure Hosted Platform&#8221; on January 31, followed by a framework integration announcement on February 5. They promised &#8220;Sovereign AI&#8221; that runs on private infrastructure. Security researchers spent the week tearing it apart. Reports surfaced of thousands of &#8220;Clawdbot&#8221; agents running with open ports and no authentication.</p><p><strong>Why it matters</strong></p><ul><li><p>You cannot have &#8220;sovereign&#8221; AI that relies on a central provider&#8217;s management plane. That&#8217;s SaaS with extra steps and more liability.</p></li><li><p>Researchers demonstrated that a simple email sent to an OpenClawd agent could trick it into exfiltrating local files. If your agent reads your email, your email attacks your agent.</p></li><li><p>Token Security reports that 22% of their customers had employees running these agents on corporate networks.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit agent permissions. If you have developers running local agent frameworks, block their egress at the firewall immediately.</p></li><li><p>Isolate the executors. Never run an agent on a machine with production credentials. Use ephemeral sandboxes that die after one task.</p></li><li><p>Ignore the label. Treat &#8220;sovereign&#8221; marketing claims as a warning sign.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I love the irony. Almost poetic. A company sells you &#8220;sovereignty,&#8221; the digital equivalent of a cabin in the woods, by asking you to trust their hosted control plane. It&#8217;s like buying a generator for the apocalypse that requires a Wi-Fi connection to the power company to start.</p><p>The flaw here isn&#8217;t in the code. It&#8217;s in the philosophy. We&#8217;re witnessing the &#8220;SaaS-ification&#8221; of open source, where vendors wrap dangerous, powerful tools in a slick UI and sell them to developers who don&#8217;t want to read the documentation. We&#8217;re giving autonomous shell access to software that can be hypnotized by a phishing email. Think about that. We spent the last twenty years firing system administrators who ran scripts they didn&#8217;t understand. If a junior admin ran <code>curl | bash</code> from a suspicious URL, we&#8217;d walk them out of the building.</p><p>Now? Now we&#8217;re building billion-dollar businesses on bots that don&#8217;t even understand the scripts <em>they</em> write. We&#8217;re deploying agents that have the authority of a senior engineer but the judgment of a toddler. The &#8220;sovereign&#8221; label is dangerous because it lowers your guard. It suggests that because the bits live on your hard drive, the risk is contained. But when that local agent has an open port and a directive to &#8220;read my emails and be helpful,&#8221; it doesn&#8217;t matter where the server is rack-mounted. You haven&#8217;t built a sovereign fortress. You&#8217;ve installed a persistent, intelligent backdoor and handed the keys to anyone who can write a clever prompt.</p><h2>2. CISA Director Caught in ChatGPT Scandal</h2><p>Acting CISA Director Madhu Gottumukkala admitted to uploading sensitive contracting documents to a public instance of ChatGPT. The agency issued guidance on insider threats Friday, January 30. That awkwardly coincided with the internal fallout. The guidance warns against the very behavior their chief exhibited.</p><p><strong>Why it matters</strong></p><ul><li><p>You cannot enforce policy when the person at the top ignores it.</p></li><li><p>Even the agency responsible for critical infrastructure struggles to keep data out of public models.</p></li><li><p>The &#8220;short-term exception&#8221; excuse undermines every zero-trust architecture CISA promotes.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit executives. Run a specific check on executive accounts for AI usage. They&#8217;re your highest risk users.</p></li><li><p>Update your AUP. Explicitly define what &#8220;sensitive&#8221; means for LLMs. &#8220;For Official Use Only&#8221; isn&#8217;t enough.</p></li><li><p>Deploy DLP. Static policy documents don&#8217;t stop uploads. You need browser-level enforcement.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the classic security hypocrite move, and it&#8217;s the single greatest destroyer of security culture in any organization. The Director gets a &#8220;temporary exception&#8221; to violate the rules because he&#8217;s &#8220;important&#8221; and &#8220;busy.&#8221; You get a lecture on insider threats and a mandatory 45-minute training video.</p><p>This dynamic is why nobody listens to security teams. We&#8217;re viewed as the &#8220;Department of No&#8221; for the peasants, while the aristocracy does whatever they want. Let&#8217;s look deeper&#8230; Why did he do it? Because the tool is useful. That&#8217;s the uncomfortable truth we have to face. He didn&#8217;t upload those contracts maliciously. He did it because ChatGPT could summarize them in ten seconds, and he didn&#8217;t have an internal tool that could do the same.</p><p>Shadow AI isn&#8217;t driven by malice. It&#8217;s driven by friction. If your secure, approved internal AI takes five minutes to load, requires a VPN, and hallucinates half the time, your executives are going to use ChatGPT. They will paste that top-secret strategy document into the public web because their desire to get the job done outweighs their fear of a theoretical leak. You cannot policy your way out of a utility problem. If you want your engineers and executives to follow the rules, you have to build tools that are better than the contraband. Until then, you&#8217;re shouting into the void while your boss pastes the roadmap into OpenAI.</p><h2>3. Match Group&#8217;s Identity Failure</h2><p>Match Group (parent of Tinder and Hinge) confirmed a data breach on January 30 involving the theft of user data. The threat actor group, ShinyHunters, claimed to have stolen 10 million records. The attack vector wasn&#8217;t a sophisticated zero-day in the AI models. It was a social engineering campaign targeting the company&#8217;s Okta environment.</p><p><strong>Why it matters</strong></p><ul><li><p>Identity is the perimeter. Attackers didn&#8217;t break the encryption. They tricked the admin.</p></li><li><p>The use of voice phishing to compromise SSO credentials is becoming the standard entry point for major breaches.</p></li><li><p>For a company built on privacy, a breach of this magnitude is a catastrophic failure.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Enforce FIDO2. Move privileged access to hardware keys. Phone-based 2FA is dead.</p></li><li><p>Monitor SSO. Alert on impossible travel and device mismatches for all administrative accounts.</p></li><li><p>Drill the help desk. Your support team is the target. Test them on vishing attempts regularly.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>We spend millions securing neural networks, buying AI-powered anomaly detection tools, and hardening our Kubernetes clusters. We spend zero on the guy answering the phone at the IT help desk.</p><p>ShinyHunters didn&#8217;t need a GPU cluster to crack this. They didn&#8217;t need a sophisticated prompt injection attack or a zero-day exploit in the kernel. They needed a convincing story and a phone number. This is the &#8220;Identity Crisis&#8221; of the AI era. As our technical defenses get better, as firewalls get smarter and endpoint protection gets ruthless, attackers pivot to the one operating system that cannot be patched: the human brain.</p><p>You&#8217;re building a castle on a swamp if your AI security strategy doesn&#8217;t start with &#8220;fix the identity provider.&#8221; We&#8217;re seeing a resurgence of old-school con artistry, supercharged by modern tools. Why do attackers need to hack the server when they can just log right on in? This has been happening since the early days of cyber&#8230;They&#8217;re calling your support team, pretending to be the new VP of Marketing who lost their phone, and getting a bypass code. It&#8217;s embarrassing. We talk about &#8220;adversarial machine learning&#8221; while our front door opens with a polite request. Stop buying AI security tools until you&#8217;ve distributed YubiKeys to everyone with admin access. Hardware doesn&#8217;t have feelings, and it can&#8217;t be charmed by a smooth talker.</p><h2>4. Ex-Google Engineer Convicted for Theft</h2><p>A federal jury convicted Linwei Ding on Friday, January 30. The former Google engineer stole over 500 files containing trade secrets about AI supercomputing chips. He was building a rival startup in China while still collecting a paycheck from Google.</p><p><strong>Why it matters</strong></p><ul><li><p>This confirms that the biggest risk to your IP is the employee with a badge.</p></li><li><p>This wasn&#8217;t a side hustle. It was a coordinated effort to transfer capability to a foreign adversary.</p></li><li><p>He uploaded files to his personal Google Cloud account. It took months to catch him.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Monitor egress. Watch for large uploads to personal cloud storage.</p></li><li><p>Review access logs. Ding had access to files he didn&#8217;t need. Implement least privilege.</p></li><li><p>Background checks. Re-screen employees with access to critical IP.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Google has one of the best security teams in the world. They invented BeyondCorp. They have zero-trust down to a science. And they <em>still</em> missed this for months. Linwei Ding wasn&#8217;t using a sophisticated rootkit. He was copying files to his personal notes and uploading them to the cloud.</p><p>This destroys the ego of every CISO reading this. If Google can&#8217;t stop a determined insider from walking out the door with the crown jewels, neither can you. We pretend that our NDAs are magical force fields. We pretend that background checks done five years ago still matter. They don&#8217;t.</p><p>The uncomfortable reality is that the modern tech worker is mobile, opportunistic, and often feels no loyalty to the logo on their paycheck. Combine that with the geopolitical gold rush for AI dominance, and you have a recipe for disaster. Your source code, your model weights, your chip designs are liquid assets now. You need technical controls that scream when someone moves a terabyte of data to an unmanaged device. You need to stop looking for &#8220;hackers&#8221; in hoodies and start watching the quiet engineer who logs in at 2 AM and starts &#8220;backing up&#8221; his work. Trust is not a control. Trust is a vulnerability.</p><h2>5. OT Attacks Surge as Hackers Weaponize AI</h2><p>Forescout released a report on February 4 showing a massive spike in attacks on Operational Technology. The data shows attackers use AI to analyze industrial protocols and find weaknesses. They&#8217;re not breaking IT networks. They&#8217;re coming for the factory floor.</p><p><strong>Why it matters</strong></p><ul><li><p>OT attacks shut down power plants and manufacturing lines.</p></li><li><p>AI lowers the skill required to understand complex industrial protocols like Modbus.</p></li><li><p>Attackers use AI to map networks faster than defenders can patch them.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Segment OT. Air gaps are a myth. Use strict network segmentation.</p></li><li><p>Monitor protocols. Standard IDS signatures miss AI-generated anomalies in industrial traffic.</p></li><li><p>Patch PLCs. I know it&#8217;s hard. Do it anyway.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I spent years telling people that obscurity is not security. The counter-argument from the OT world was always, &#8220;But Rock, nobody understands our proprietary 1990s protocol! It&#8217;s too weird to hack!&#8221;</p><p>Well, guess what? Now an AI can read your obscure, dusty documentation, analyze your proprietary protocol, and write a Python script to exploit it in five minutes. The &#8220;security by obscurity&#8221; defense is dead. Large Language Models murdered it.</p><p>We&#8217;re entering a terrifying phase where the digital barrier to physical destruction is crumbling. It used to take a nation-state team of experts months to figure out how to spin a centrifuge too fast or shut down a power grid switch. Now, a script kiddie with a customized LLM can parse the traffic and find the &#8220;off&#8221; command. If your factory runs on Windows 95 and hopes/prayers, you&#8217;re in trouble. The industrial world has been lazy, relying on the fact that their systems were too boring and complex to attack. AI doesn&#8217;t get bored, and it loves complexity. Wake up and segment your networks before your assembly line holds you for ransom.</p><h2>6. UNICEF Reports 1.2 Million Deepfake Victims</h2><p>UNICEF released a horrifying report on February 4. At least 1.2 million children have had their images manipulated into sexually explicit deepfakes in the past year. The barrier to creating this content has dropped to zero.</p><p><strong>Why it matters</strong></p><ul><li><p>This is the darkest side of generative AI.</p></li><li><p>Platforms that host or enable this content will face massive regulatory backlash.</p></li><li><p>Your employees are parents. This issue affects them personally and will bleed into the workplace.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Block generation sites. Create distinct categories for &#8220;AI generation&#8221; in your web filter.</p></li><li><p>Support employees. Offer resources for staff dealing with digital harassment.</p></li><li><p>Audit your brand. Ensure your own marketing materials aren&#8217;t being scraped for these datasets.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This makes me sick. Genuinely turns my stomach. We sit in conference rooms and debate &#8220;AI safety&#8221; in abstract terms, discussing alignment theory, the singularity, and paperclip maximizers. Real human beings, real children, are getting hurt right now.</p><p>The tech industry moved too fast and broke the wrong things. We released powerful image generation tools into the wild with zero safeguards, shrugging our shoulders and saying, &#8220;We can&#8217;t control how people use it.&#8221; That&#8217;s a lie. We prioritize growth over safety every single time.</p><p>This isn&#8217;t a &#8220;society&#8221; problem. It&#8217;s a corporate problem. Your employees are parents. When they come to work terrified because their child is being targeted by classmates using an app you might have invested in, or an open-source model your team is using, that impacts your business. If you build these tools, you have a moral obligation to lock them down. And if you&#8217;re a security leader, you need to be the voice in the room asking, &#8220;How can this be abused?&#8221; before the product ships. We&#8217;re failing the most vulnerable people on the planet because we&#8217;re too enamored with our own cleverness.</p><h2>7. Snyk Launches AI Security Fabric</h2><p>Snyk announced a new &#8220;AI Security Fabric&#8221; on February 3. The tool claims to unify visibility across the software development lifecycle. It focuses on finding vulnerabilities in AI-generated code and the models themselves.</p><p><strong>Why it matters</strong></p><ul><li><p>We&#8217;re finally seeing vendors move beyond point solutions for AI security.</p></li><li><p>Catching AI vulnerabilities in the IDE is cheaper than fixing them in production.</p></li><li><p>You cannot manually review every line of code Copilot writes. You need automated guardrails.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Evaluate your stack. If you use Snyk, turn this feature on.</p></li><li><p>Scan generated code. Treat AI-written code as untrusted input. Scan it for hardcoded secrets.</p></li><li><p>Enforce policy. Block code commits that fail the AI security scan.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I ignore most vendor press releases. They&#8217;re fluff and buzzwords. But this one is different because it tacitly admits something important: The AI tools we sold you are making your code worse.</p><p>Think about it. Snyk is selling you a vacuum cleaner to clean up the mess that your other AI tools (like Copilot and ChatGPT) are making. Developers are using AI to write code faster, but that code is often insecure, bloated, or hallucinatory. So now we need <em>another</em> AI to watch the first AI and tell us where it screwed up.</p><p>It&#8217;s a racket. A brilliant, necessary racket. We&#8217;re entering the era of &#8220;Machine-Assisted Insecurity,&#8221; where we generate vulnerabilities at machine speed. Naturally, the only solution is to buy remediation at machine speed. I don&#8217;t blame Snyk. They&#8217;re filling a void. But let&#8217;s be clear about what&#8217;s happening: we&#8217;re automating the creation of technical debt. If you don&#8217;t have a &#8220;fabric&#8221; or a &#8220;platform&#8221; or whatever we&#8217;re calling it this week to catch this stuff, you&#8217;re going to drown in a sea of mediocrity generated by a stochastic parrot.</p><h2>8. Thailand Warnings on Deepfake Fraud</h2><p>Thai authorities issued a warning on February 3 about a surge in deepfake fraud targeting working-age professionals. The scams have caused over 23 billion baht in losses. This isn&#8217;t elderly people getting tricked. Savvy professionals are falling for AI-generated video calls.</p><p><strong>Why it matters</strong></p><ul><li><p>Scammers are moving upstream to high-value corporate targets.</p></li><li><p>Video and voice are no longer proof of identity.</p></li><li><p>Thailand is the canary in the coal mine. This tactic is coming to a finance department near you.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Kill voice auth. Stop using voice recognition for password resets.</p></li><li><p>Implement challenge phrases. Establish a verbal code word for authorizing money transfers.</p></li><li><p>Train finance teams. Show them what a high-quality deepfake looks like.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>I&#8217;ve verified this myself, and it&#8217;s terrifying. The tools are no longer &#8220;research previews.&#8221; They&#8217;re commodities. You can clone a voice with three seconds of audio. You can face-swap a video call in real-time with a gaming laptop.</p><p>The warning from Thailand is significant because it destroys the myth that &#8220;only Grandma falls for scams.&#8221; These are working-age professionals, finance officers, and managers getting duped. Why? Because our brains are hardwired to trust our senses. If I see your face and hear your voice, my brain says, &#8220;That&#8217;s you.&#8221;</p><p>We have to retrain a million years of human evolution in about six months. We have to bring back old-school paranoia. Digital trust is broken. If your CFO calls you on WhatsApp and asks for a wire transfer, you have to assume it&#8217;s a lie. Hang up. Call them back on their internal line. Walk down the hall to their office. We need to implement &#8220;Challenge Phrases,&#8221; like a safe word for corporate finance. It sounds ridiculous, but we&#8217;re back to the days of spycraft because the digital medium can no longer be trusted. If you believe your eyes and ears, you will lose your budget.</p><h2>9. Global Push for US AI Standards</h2><p>The White House announced on January 30 a plan to advance US AI cybersecurity standards globally. The goal is to get allies to adopt the NIST frameworks and lock out adversaries. This is soft power with a hard edge.</p><p><strong>Why it matters</strong></p><ul><li><p>A fragmented regulatory environment hurts everyone. This might bring some consistency.</p></li><li><p>If you want to sell AI to US allies, you&#8217;ll need to meet these standards.</p></li><li><p>This draws a clear line between &#8220;Western&#8221; AI governance and the rest of the world.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Align with NIST. If you&#8217;re still using a custom framework, stop. Map everything to the NIST AI RMF.</p></li><li><p>Watch the EU. See how this conflicts with the AI Act. You&#8217;ll likely have to comply with both.</p></li><li><p>Prepare for audits. Global standards mean global certification requirements.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>About time. We&#8217;ve been letting every country, county, and city council invent their own rules for AI safety. It&#8217;s a mess. The internet works because we all agreed on TCP/IP. We didn&#8217;t have a &#8220;French Internet Protocol&#8221; and a &#8220;German Internet Protocol.&#8221;</p><p>AI needs the same thing. I don&#8217;t even care if the NIST standard is perfect. I care that it&#8217;s <em>common</em>. If we can get the G7 countries to agree on a baseline for what &#8220;secure AI&#8221; looks like, we can stop wasting time mapping controls between twelve different spreadsheets.</p><p>Let&#8217;s not be naive. This is also a trade war weapon. By pushing US standards, the White House is trying to ensure that American tech giants write the rules of the road for the next century. If you&#8217;re a CISO, this simplifies your life in the long run but complicates it in the short term. You&#8217;re going to be the pawn in a regulatory chess match between Washington and Brussels. My advice? Pick NIST as your north star. It&#8217;s the most practical, and it&#8217;s the one the guys with the aircraft carriers are backing.</p><h2>10. CISA Plans for an AI-ISAC</h2><p>On February 3, CISA official Nick Andersen outlined plans to replace the disbanded Critical Infrastructure Council with a new structure. This includes a dedicated focus on AI threat sharing, effectively an &#8220;AI-ISAC&#8221; (Information Sharing and Analysis Center).</p><p><strong>Why it matters</strong></p><ul><li><p>We currently rely on Twitter threads for AI threat intel. An ISAC brings structure and verification (but we&#8217;ll see at what speed).</p></li><li><p>When CISA builds an ISAC, regulation follows.</p></li><li><p>The focus on industrial control systems suggests the government is worried about AI impacting physical infrastructure.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Prepare legal. Start the conversation with your legal team about the liability protections required for sharing adversarial AI data.</p></li><li><p>Audit intel feeds. If you don&#8217;t consume threat data related to AI models, budget for it.</p></li><li><p>Volunteer early. If you have the maturity, join the working groups.</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>We have ISACs for everything. We have an ISAC for water, for automotive, for aviation, for space. But for the technology rewriting the entire global economy, we&#8217;ve been relying on random Substack posts and Twitter threads.</p><p>It&#8217;s absurd that I find out about major jailbreaks from a 19-year-old on X before I hear about it from a government agency. An AI-ISAC is the grown-up table. It&#8217;s the move from &#8220;AI is a hobby&#8221; to &#8220;AI is critical infrastructure.&#8221;</p><p>If you want to know whether that weird prompt injection hitting your chatbot is a random troll or a targeted campaign by a persistent threat actor, you need this data. You need to know what other companies are seeing. Here&#8217;s the catch: ISACs only work if you share back. And right now, most companies are terrified to admit they have an AI security problem. We need to get over the shame. If your model gets tricked, share the prompt. It&#8217;s the only way we build herd immunity.</p><h2>The One Thing You Won&#8217;t Hear About But You Need To: The EU AI Act&#8217;s &#8220;Self-Assessment&#8221; Trap</h2><p>While everyone was watching the OpenClawd disaster, a critical change to the EU AI Act went into effect on February 1. The media ignored it. You shouldn&#8217;t. The EU has quietly shifted the compliance model for &#8220;high-risk&#8221; AI systems. Instead of a mandatory external audit by a regulator, the new rule allows &#8220;conformity self-assessment&#8221; for a wide range of applications.</p><p>This sounds like a win. It isn&#8217;t. It&#8217;s a trap.</p><p>By removing the external gatekeeper, the EU has shifted 100% of the liability onto you. There&#8217;s no longer a regulator to blame if things go wrong. If your self-assessment is found lacking after an incident, the fines are astronomical. They handed you the rope and told you to tie the knot yourself.</p><p><strong>Why it matters</strong></p><ul><li><p>You&#8217;re now the judge and jury of your own compliance. If you&#8217;re wrong, you&#8217;re also the victim.</p></li><li><p>Directors can no longer point to a &#8220;passed&#8221; regulatory audit as a shield.</p></li><li><p>Internal teams will pressure you to sign off on self-assessments to speed up deployment.</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Refuse to sign. Don&#8217;t sign a self-assessment without a third-party review. Hire an external firm to &#8220;shadow&#8221; audit you.</p></li><li><p>Update risk registers. Mark every &#8220;self-assessed&#8221; system as high residual risk.</p></li><li><p>Train legal teams. Make sure they understand that &#8220;self-assessment&#8221; doesn&#8217;t mean &#8220;optional compliance.&#8221;</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Regulators are smart. Lazy, but smart. They realized about six months ago that they don&#8217;t have the staff, the budget, or the technical talent to audit every AI model in Europe. It&#8217;s mathematically impossible. So what did they do? Did they reduce the regulations? No.</p><p>They outsourced the enforcement to <em>you</em>.</p><p>This is the ultimate &#8220;cover your ass&#8221; move by the EU. Now, when an AI discriminates against a customer or leaks medical data, the regulator can stand in front of the cameras and say, &#8220;We had strict rules! This company certified that they followed them! They lied to us!&#8221;</p><p>It absolves the government of failure and places the entire burden on your specific signature. Don&#8217;t fall for it. Don&#8217;t let your product managers bully you into signing a self-assessment to hit a launch date. Treat a self-assessment with <em>more</em> rigor than a government audit, because in a government audit, if they miss something, it&#8217;s partly their fault. In a self-assessment, the penalty for lying to yourself is bankruptcy.</p><p>If you found this analysis useful, subscribe at <strong><a href="https://rockcybermusings.com/">rockcybermusings.com</a></strong> for weekly intelligence on AI security developments.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Australian Cyber Security Magazine. (2026, February 2). <em>Cybercriminals hijack AI hosting service to compromise users.</em> https://australiancybersecuritymagazine.com.au/cybercriminals-hijack-ai-hosting-service</p><p>Bitdefender. (2026, January 30). <em>Breach at Tinder, Hinge and OkCupid parent Match Group exposes user data.</em> https://www.bitdefender.com/en-us/blog/hotforsecurity/breach-tinder-hinge-okcupid-match-group-exposes-user-data</p><p>CyberScoop. (2026, February 3). <em>What&#8217;s next for DHS&#8217;s forthcoming replacement critical infrastructure protection panel.</em> https://cyberscoop.com/dhs-critical-infrastructure-panel-replacement</p><p>Digital Bricks. (2026, February 1). <em>The change to the EU AI Act that no one is talking about.</em> https://digitalbricks.eu/eu-ai-act-self-assessment</p><p>IT Security Guru. (2026, February 4). <em>OT attacks surge as threat actors embrace cloud and AI.</em> https://www.itsecurityguru.org/ot-attacks-surge-ai-cloud</p><p>Markets Financial Content. (2026, February 5). <em>Openclawd integrates Openclaw: Scaling sovereign AI in the cloud.</em> https://markets.financialcontent.com/openclawd-openclaw-sovereign-ai</p><p>Markets Business Insider. (2026, February 3). <em>Snyk unveils the AI Security Fabric.</em> https://markets.businessinsider.com/snyk-ai-security-fabric</p><p>Nation Thailand. (2026, February 3). <em>Working-age people targeted by AI deepfake scams, warns AOC 1441.</em> https://www.nationthailand.com/deepfake-scams-working-age</p><p>Newsfile Corp. (2026, January 31). <em>OpenClaw introduces secure hosted Clawdbot platform.</em> https://newsfilecorp.com/openclaw-clawdbot-platform</p><p>SC Media. (2026, January 30). <em>CISA issues insider threat guidance amidst AI misuse concerns.</em> https://www.scworld.com/cisa-insider-threat-guidance-ai</p><p>SC Media. (2026, January 30). <em>Global adoption of US&#8217;s AI cybersecurity standards advanced by Trump admin.</em> https://www.scworld.com/us-ai-cybersecurity-standards-global</p><p>The Hacker News. (2026, January 30). <em>Ex-Google engineer convicted for stealing AI secrets for China startup.</em> https://thehackernews.com/google-engineer-convicted-ai-secrets-china</p><p>UNICEF. (2026, February 4). <em>Deepfake abuse is abuse.</em> https://www.unicef.org/reports/deepfake-abuse</p><p>.</p>]]></content:encoded></item><item><title><![CDATA[NIST Proposed an AI Standards Evaluation Framework That Pretends Attackers Don’t Exist]]></title><description><![CDATA[I submitted 33 comments to NIST GCR 26-069. The proposed AI standards evaluation framework ignores adversarial environments and will fail for security standards.]]></description><link>https://www.rockcybermusings.com/p/nist-ai-standards-framework-ignores-attackers</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/nist-ai-standards-framework-ignores-attackers</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Tue, 03 Feb 2026 13:50:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!acvk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!acvk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!acvk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!acvk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!acvk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!acvk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!acvk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7626398,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185814202?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!acvk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!acvk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!acvk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!acvk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31da3d0e-6f77-47ae-821a-b2303e5d880d_2048x2048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/nist-ai-standards-framework-ignores-attackers?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/nist-ai-standards-framework-ignores-attackers?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>NIST released &#8220;A Possible Approach for Evaluating AI Standards Development&#8221; (<strong><a href="https://www.nist.gov/artificial-intelligence/ai-standards">GCR 26-069</a></strong>) on January 15, 2026. I read all 30 pages. Then I read them again, looking for the part about security. It&#8217;s not there.</p><p>The document proposes a theory-of-change methodology to measure whether AI standards achieve their goals. Sounds reasonable. The problem? The entire framework assumes a world where nobody is actively trying to break things. No adversaries. No attackers. No threat actors probing for weaknesses. Just cooperative stakeholders holding hands and measuring outcomes together.</p><p>I submitted 33 formal comments. Some of them were polite. Most of them pointed out that you can&#8217;t evaluate security standards using a methodology designed for data formatting standards. The two operate in fundamentally different realities.</p><p>This is my second formal NIST response this month. Two weeks ago, I submitted comments to the <a href="https://www.rockcybermusings.com/p/nist-ai-agent-rfi-2025-0035-human-oversight-wrong-fix">CAISI Request for Information on AI Agent Security</a>, arguing that authorization scope matters more than human oversight for bounding agent risk. I&#8217;m starting to see a pattern. NIST asks good questions. Then, NIST proposes frameworks that miss how adversarial environments actually work. It&#8217;s exhausting.</p><h2>Why This Document Made Me Write 33 Comments</h2><p>I&#8217;ve spent the past year and a half contributing to the OWASP GenAI Security Project and the OWASP AI Exchange. When you&#8217;ve helped catalog threat categories for AI systems and map mitigations for each, you notice when a &#8220;comprehensive&#8221; evaluation framework ignores all of them. The OWASP Top 10 for Agentic Applications came out on December 10, 2025. It covers goal hijacking, tool misuse, identity abuse, memory poisoning, cascading failures, and rogue agent behavior. Real risks. Documented incidents.</p><p>GCR 26-069 mentions none of this. Zero references to autonomous agents. Zero references to agentic AI. Zero references to multi-agent systems, tool use, or agent coordination. No MITRE ATLAS. No OWASP. No adversarial ML evaluation methods.</p><p>You know where cybersecurity experts appear in this document? Once. Buried in a list of stakeholders who might help &#8220;reduce the risks of reidentification harm.&#8221; That&#8217;s a privacy concern. Not a security concern. One mention. As an afterthought.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t-aA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t-aA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 424w, https://substackcdn.com/image/fetch/$s_!t-aA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 848w, https://substackcdn.com/image/fetch/$s_!t-aA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 1272w, https://substackcdn.com/image/fetch/$s_!t-aA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t-aA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png" width="1456" height="961" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:961,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:122726,&quot;alt&quot;:&quot;Table showing count of security-related terms in NIST GCR 26-069 document revealing critical omissions&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185814202?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Table showing count of security-related terms in NIST GCR 26-069 document revealing critical omissions" title="Table showing count of security-related terms in NIST GCR 26-069 document revealing critical omissions" srcset="https://substackcdn.com/image/fetch/$s_!t-aA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 424w, https://substackcdn.com/image/fetch/$s_!t-aA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 848w, https://substackcdn.com/image/fetch/$s_!t-aA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 1272w, https://substackcdn.com/image/fetch/$s_!t-aA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2abe22d-cc96-438a-bdff-befc38e38a0a_1795x1185.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 1: Security-Related Terms in NIST GCR 26-069</figcaption></figure></div><p>NIST AI 100-5, the Plan for Global Engagement on AI Standards, identifies &#8220;security and privacy&#8221; as one of six priority standardization areas. GCR 26-069 is supposed to tell us how to evaluate whether those priority standards work. A methodology that ignores adversarial environments can&#8217;t evaluate security effectiveness. It will produce conclusions that sound rigorous and mean nothing.</p><p>The timing makes this worse. EU AI Act conformity assessment kicks in August 2026. prEN 18286, the first harmonized standard for AI, entered public enquiry in October 2025. These standards will shape global AI governance. We need evaluation frameworks that actually work for security. This one doesn&#8217;t.</p><h2>The Counterfactual Fantasy</h2><p>The document loves counterfactual analysis. Box 5 asks: &#8220;What would have happened in the alternative state of the world?&#8221; For data integration standards, the example used throughout the entire document, this makes sense. You compare outcomes with and without the standard. The underlying process is stable. Apples to apples.</p><p>Security doesn&#8217;t work that way. Attack rates depend on attacker motivation. Attacker motivation changes when defenses change. You reduce vulnerability in Area A; attackers move to Area B. This is called attack displacement. It&#8217;s been documented for decades. Any security professional who&#8217;s worked an incident knows this.</p><p>Traditional &#8220;treatment vs. control&#8221; experimental designs fail for security because the control group isn&#8217;t static. The adversary adapts. That&#8217;s what adversaries do. It&#8217;s literally in the name.</p><p>Let me make this concrete. You want to evaluate whether an AI security standard reduced incidents. You compare incident rates before and after adoption. Incidents dropped 40%. Victory? Maybe. Or maybe attackers moved to organizations that didn&#8217;t adopt the standard. Or they pivoted to attack vectors that the standard doesn&#8217;t cover. Or they&#8217;re waiting. Attackers are patient when they need to be.</p><p>Measuring defender outcomes in adversarial environments produces garbage data dressed up as insight. The framework should measure attacker cost instead. Time-to-compromise increases. Skill threshold required to succeed. Attack surface reduction. Exploit chain complexity. These metrics sidestep the adaptation problem. Even if attackers shift targets, a standard that increases the cost to attackers has demonstrable value.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YBsZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YBsZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 424w, https://substackcdn.com/image/fetch/$s_!YBsZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 848w, https://substackcdn.com/image/fetch/$s_!YBsZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 1272w, https://substackcdn.com/image/fetch/$s_!YBsZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YBsZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png" width="1456" height="844" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:844,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93404,&quot;alt&quot;:&quot;Comparison bar chart showing why attacker cost metrics outperform defender outcome metrics for security standards evaluation&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185814202?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Comparison bar chart showing why attacker cost metrics outperform defender outcome metrics for security standards evaluation" title="Comparison bar chart showing why attacker cost metrics outperform defender outcome metrics for security standards evaluation" srcset="https://substackcdn.com/image/fetch/$s_!YBsZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 424w, https://substackcdn.com/image/fetch/$s_!YBsZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 848w, https://substackcdn.com/image/fetch/$s_!YBsZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 1272w, https://substackcdn.com/image/fetch/$s_!YBsZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4021a768-b063-4076-ac50-da89f9afada6_1777x1030.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 2: Security Standards Evaluation: Defender Outcomes vs. Attacker Cost </figcaption></figure></div><p></p><p>I proposed adding &#8220;Box 5a: Security Counterfactual Considerations.&#8221; Apparently, we need to spell out that a security evaluation requires considering attackers.</p><h2>Process Theater vs. Actual Effectiveness</h2><p>The document draws a pretty arrow from standards development activities to societal goals. Inputs lead to activities. Activities produce outputs. Outputs generate outcomes. Outcomes achieve goals. It&#8217;s a nice diagram. It&#8217;s also conflating two completely different questions.</p><p><strong>Question one:</strong> Did the standards development organization run a good process? Were inputs adequate? Were activities effective? Did outputs ship on time? These evaluate process.</p><p><strong>Question two:</strong> Does the standard actually work when deployed? Does it reduce harm? Does it resist adversarial exploitation? Does it hold up when someone motivated tries to break it?</p><p>These require different methodologies. Different data. Different expertise. Different timeframes. The document mostly focuses on question one while claiming to address question two.</p><p>A standard can emerge from a flawless SDO process and still fail operationally. The threat landscape evolved. The attack techniques changed. The assumptions embedded in the standard no longer match reality. Process success doesn&#8217;t guarantee effectiveness. Anyone who&#8217;s watched a beautifully documented policy get shredded by a real incident understands this.</p><p>Security effectiveness requires continuous measurement of deployed systems. Runtime outcomes. Incident response metrics. Vulnerability management data. Threat detection rates. The static theory-of-change model can&#8217;t capture any of this. It&#8217;s a snapshot methodology applied to a moving target.</p><p>I proposed adding an &#8220;Operational Feedback Loop&#8221; connecting runtime observations back to standards development. Make it iterative instead of linear. Acknowledge that the world changes between when you write a standard and when you evaluate it.</p><h2>Where&#8217;s the Red Team?</h2><p>The evaluation methodology relies on observational studies, stakeholder surveys, and outcome measurement. For security standards, this is like checking whether a lock works by asking people if they feel safe.</p><p>You can survey organizations about their security posture. You can measure incident rates. Neither tells you whether the standard defends against the threats it claims to address. You&#8217;re measuring perception and lagging indicators. You&#8217;re not measuring actual resistance to attack.</p><p>Adversarial testing fills this gap. Red teaming. Penetration testing. Adversarial ML evaluation. These methods actively probe for weaknesses rather than waiting for someone to exploit them. The ICLR 2025 research on AI safeguards found that defenses &#8220;often fail against novel attack methods not seen during development.&#8221; Minor changes to attack parameters produce dramatically different success rates. Static evaluation misses all of this.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rfFe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rfFe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 424w, https://substackcdn.com/image/fetch/$s_!rfFe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 848w, https://substackcdn.com/image/fetch/$s_!rfFe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 1272w, https://substackcdn.com/image/fetch/$s_!rfFe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rfFe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png" width="1456" height="968" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:968,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:124288,&quot;alt&quot;:&quot;Horizontal bar chart comparing effectiveness of different evaluation methods for AI security standards&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185814202?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Horizontal bar chart comparing effectiveness of different evaluation methods for AI security standards" title="Horizontal bar chart comparing effectiveness of different evaluation methods for AI security standards" srcset="https://substackcdn.com/image/fetch/$s_!rfFe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 424w, https://substackcdn.com/image/fetch/$s_!rfFe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 848w, https://substackcdn.com/image/fetch/$s_!rfFe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 1272w, https://substackcdn.com/image/fetch/$s_!rfFe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b64c745-9280-49db-8e2a-3c22f0dea48c_1779x1183.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 3: Evaluation Methods for AI Security Standards</figcaption></figure></div><p>For security-related standards, evaluation should include a red-team assessment of implementations, penetration testing against compliant systems, adversarial ML evaluation using MITRE ATLAS techniques, and comparisons between compliant and non-compliant implementations. This provides direct evidence. Not surveys. Not feelings. Evidence.</p><p>The OWASP Top 10 for Agentic Applications offers a ready-made threat taxonomy. Goal hijacking resistance (ASI01). Tool misuse prevention (ASI02). Identity and privilege abuse controls (ASI03). Supply chain integrity (ASI04). Memory poisoning resistance (ASI06). Ten documented risk categories with real incidents behind them. Standards can be evaluated against these attack patterns. We don&#8217;t have to wait for the next breach to find out if the standard works.</p><h2>Validity Problems Nobody Mentioned</h2><p>Section 2.2 discusses internal validity, construct validity, self-selection bias, and external validity. Standard methodology considerations. All framed for benign interventions where nobody is trying to make you fail.</p><p>Security standards face validity threats that the document doesn&#8217;t acknowledge&#8230;</p><p>Adversarial adaptation breaks your baseline. The baseline isn&#8217;t stable because attackers respond to defenses. Every security measurement includes this noise.</p><p>Security metrics are garbage. Organizations detect and report a fraction of actual intrusions. Your denominator is wrong. Your numerator is wrong. Your confidence interval should be a shrug emoji.</p><p>The counterfactual is unobservable. &#8220;What would attackers have done?&#8221; isn&#8217;t something you can measure. You can&#8217;t run a controlled experiment on motivated adversaries. They don&#8217;t fill out consent forms.</p><p>Temporal validity fails fast. A standard effective against 2024 attack techniques may be useless against 2026 techniques. MITRE ATLAS regularly adds new adversarial ML techniques. How long does a security standard remain effective? The framework doesn&#8217;t ask.</p><p>These aren&#8217;t minor gaps. There are fundamental problems with applying this methodology to security standards. The framework needs security-specific validity guidance, or it will produce findings that look rigorous and mislead everyone who reads them.</p><h2>What I Told NIST</h2><p>My 33 comments include specific recommendations. Not just complaints. Fixes.</p><p>First, add a security-focused illustrative example. The document uses data integration throughout. Fine. Add a parallel example, such as &#8220;Autonomous Agent Credential Management,&#8221; that walks through the same structure with security-centric measures. Show how security inputs differ (threat intelligence, red team findings, incident data). Show security-specific activities (adversarial testing, penetration testing). Show security outputs (threat coverage matrices, attack resistance specifications). Show security outcomes (attacker cost increase, exploit chain complexity). Show security goals (adversarial robustness, resilience under attack). Make the framework demonstrate that it can handle security. Don&#8217;t just assume it.</p><p>Second, add Adversarial Evaluators as a stakeholder category. Traditional stakeholder engagement assumes cooperative participants seeking shared outcomes. Security evaluation requires people who deliberately adopt attacker mindsets. Red team operators. Penetration testers. Adversarial ML researchers. Threat intelligence analysts. Their value comes from challenging consensus, not building it. That&#8217;s how you formally integrate adversarial perspectives.</p><p>Third, distinguish voluntary from mandatory conformity outcomes. The EU AI Act creates a binary measure: whether an AI standard meets the &#8220;presumption of conformity&#8221; under Article 40. This should be an explicit evaluation criterion. Voluntary conformity exhibits self-selection bias. Mandatory conformity creates natural experiments. Different mechanisms. Different evaluation approaches.</p><p>Fourth, specify measurement infrastructure requirements. Security evaluation requires telemetry from deployed systems, threat intelligence feeds, vulnerability databases, and incident reporting mechanisms. Without this infrastructure, security outcomes are unmeasurable. The document should specify which data-collection capabilities are prerequisites for meaningful evaluation. Otherwise, people will try to evaluate security standards without the data needed to do it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-1em!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-1em!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 424w, https://substackcdn.com/image/fetch/$s_!-1em!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 848w, https://substackcdn.com/image/fetch/$s_!-1em!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 1272w, https://substackcdn.com/image/fetch/$s_!-1em!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-1em!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png" width="1456" height="832" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:125959,&quot;alt&quot;:&quot;Bar chart showing coverage  in NIST GCR 26-069 for security standards evaluation criteria&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/185814202?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar chart showing coverage  in NIST GCR 26-069 for security standards evaluation criteria" title="Bar chart showing coverage  in NIST GCR 26-069 for security standards evaluation criteria" srcset="https://substackcdn.com/image/fetch/$s_!-1em!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 424w, https://substackcdn.com/image/fetch/$s_!-1em!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 848w, https://substackcdn.com/image/fetch/$s_!-1em!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 1272w, https://substackcdn.com/image/fetch/$s_!-1em!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe68327f9-04cc-43f9-9e82-0bcf0ebacb3b_2077x1187.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure 4: Framework Gaps for Security Standards Evaluation</figcaption></figure></div><p><strong>Key Takeaway:</strong> This framework will produce misleading conclusions about security standards because it ignores adversarial adaptation, conflates process with effectiveness, and lacks a methodology for measuring what actually matters: whether standards increase the cost to attackers.</p><h3>What to do next</h3><p>If you&#8217;re responsible for AI security standards adoption, don&#8217;t confuse compliance with security. A standard that emerged from a rigorous SDO process can still fail operationally if threats evolved since publication. Build your own threat-informed evaluation. Don&#8217;t trust process-focused assessments to tell you whether you&#8217;re actually protected.</p><p>For organizations building AI governance programs, the <a href="https://www.rockcyber.com/ai-strategy-and-governance">CARE Framework</a> provides structured risk assessment that accounts for adversarial considerations. The <a href="https://www.rockcyber.com/ai-strategy-and-governance">RISE Framework</a> addresses organizational readiness for the continuous evaluation that AI security demands.</p><p>NIST welcomes feedback on GCR 26-069 via email to <strong><a href="mailto:ai-standards@nist.gov">ai-standards@nist.gov</a></strong>. They&#8217;re planning an online event to discuss the approach. The CAISI AI Agent RFI comment period runs through March 9, 2026. If you haven&#8217;t read my analysis of why <a href="https://www.rockcybermusings.com/p/nist-ai-agent-rfi-2025-0035-human-oversight-wrong-fix">authorization scope beats human oversight for agent security</a>, the arguments connect to what I&#8217;ve outlined here. Both documents share the same blind spot.</p><p>More security practitioners need to engage with these processes. Otherwise, NIST guidance will reflect the views of people who&#8217;ve never had to defend a production system against a motivated attacker.</p><p>&#128073; Subscribe for more AI security and governance insights with the occasional rant.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Weekly Musings Top 10 AI Security Wrapup: Issue 23 January 23, 2026 - January 29, 2026]]></title><description><![CDATA[Fortinet Zero-Days, Moltbot's Shadow IT Crisis, and DeepSeek's Million-Record Leak]]></description><link>https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260123-20260129</link><guid isPermaLink="false">https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260123-20260129</guid><dc:creator><![CDATA[Rock Lambros]]></dc:creator><pubDate>Fri, 30 Jan 2026 13:50:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Vmvc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vmvc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vmvc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Vmvc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Vmvc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Vmvc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vmvc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/feac34d9-824c-405a-b35b-324a578d4817_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1233556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.rockcybermusings.com/i/186282636?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vmvc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Vmvc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Vmvc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Vmvc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeac34d9-824c-405a-b35b-324a578d4817_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260123-20260129?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/p/weekly-musings-top-10-ai-security-20260123-20260129?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Another week, another Fortinet zero-day. At this point, I&#8217;m starting to wonder if Fortinet&#8217;s security team schedules their patch releases around my newsletter deadlines. But this week brought more than the usual perimeter carnage. 22% of enterprises have employees running Moltbot without IT approval, and the tool&#8217;s security architecture is a mess of exposed admin ports and plaintext credentials. North Korean hackers deployed what appears to be the first AI-generated APT malware caught in the wild. And DeepSeek left over a million user conversations sitting in a publicly accessible database. If you&#8217;re still wondering whether AI security belongs on your board&#8217;s agenda, stop wondering.</p><p>The through-line connecting this week&#8217;s chaos is execution speed. Attackers are automating faster than defenders can patch. AI coding assistants are spreading faster than security teams can evaluate them. Criminal infrastructure now operates at scales that require coordinated industry response. The gap between vulnerability disclosure and weaponization continues shrinking toward zero.</p><p>For CISOs, the implications are clear: your security program&#8217;s velocity determines your exposure window. Patch management isn&#8217;t a quarterly exercise anymore. Neither is shadow IT discovery. Both are competitive advantages.</p><h3>1. Fortinet&#8217;s 14th Zero-Day in Four Years Proves Perimeter Security is a Leaky Boat</h3><p>Fortinet disclosed CVE-2026-24858, a critical authentication bypass vulnerability in FortiCloud SSO affecting FortiOS, FortiManager, FortiAnalyzer, FortiProxy, and FortiWeb. The flaw earned a CVSS score of 9.4. Attackers exploited it in the wild before disclosure, creating backdoor admin accounts and exfiltrating device configurations. CISA added it to the Known Exploited Vulnerabilities catalog on January 27 with a February 13 remediation deadline. Arctic Wolf observed attacks executing within seconds of identifying vulnerable targets (BleepingComputer). Coalition Insurance noted that this is Fortinet&#8217;s 14th zero-day advisory in less than 4 years (CyberScoop).</p><p><strong>Why it matters</strong></p><ul><li><p>Attackers targeted fully-patched systems, meaning regular patching alone provided no protection</p></li><li><p>Automated exploitation windows have collapsed to seconds, not hours or days</p></li><li><p>Configuration exfiltration enables follow-on attacks even after patching</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Apply the January 28 patches immediately if you haven&#8217;t already</p></li><li><p>Audit all Fortinet admin accounts created since January 20 for unauthorized entries</p></li><li><p>Review VPN configurations for unexpected changes and rotate credentials for affected devices</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Fourteen zero-days in four years. Let that sink in. At some point, we need to have an honest conversation about whether network perimeter appliances create more attack surface than they protect. These devices sit at the edge of your network with broad privileges, and every vendor in this space has gotten hammered.</p><p>I&#8217;m not saying ditch your firewall tomorrow, but if your security strategy depends on the assumption that your perimeter devices are secure, you&#8217;re building on sand. The attackers who exploited this flaw created backdoor accounts and exfiltrated configs before Fortinet knew the vulnerability existed. Your zero-trust architecture shouldn&#8217;t be a slide deck. It should be your insurance policy for exactly this scenario. For more on building resilient security architecture, visit <a href="https://www.rockcyber.com/">RockCyber</a>.</p><h3>2. Moltbot&#8217;s Exploding Security Crisis Threatens Every Organization Using AI Coding Assistants</h3><p>While the fake VS Code extension grabbed headlines, the deeper story involves Moltbot&#8217;s own security architecture. Security researchers have found hundreds of exposed Moltbot instances running with unauthenticated admin ports accessible from the internet. Credentials and API keys appear in plaintext in configuration files. Token Security found that 22% of their enterprise customers have employees running Moltbot without IT knowledge or approval (The Register). A proof-of-concept supply chain attack via MoltHub achieved 4,000 downloads in under eight hours. Google VP of Security Engineering Heather Adkins publicly warned: &#8220;Don&#8217;t run Clawdbot.&#8221;</p><p><strong>Why it matters</strong></p><ul><li><p>Moltbot has direct access to source code, development infrastructure, and secrets</p></li><li><p>Shadow IT adoption has outpaced security evaluation for these tools</p></li><li><p>The attack surface includes not just the tool itself but the entire ecosystem of extensions and integrations</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Conduct an immediate audit of Moltbot usage across your organization including personal devices</p></li><li><p>If Moltbot is approved for use, ensure instances run behind authentication and network segmentation</p></li><li><p>Establish formal evaluation and approval processes for AI coding assistants before developers adopt them independently</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This is the story that should terrify every CISO, and it&#8217;s barely making news outside security circles.</p><p>Moltbot and similar AI coding assistants represent a category of tooling that didn&#8217;t exist two years ago. They&#8217;re now running on developer machines across your organization, often without security review, often with access to your most sensitive assets: source code, API keys, production credentials, deployment pipelines.</p><p>The 22% figure from Token Security means nearly a quarter of enterprise employees are running AI coding tools their IT departments don&#8217;t know about. These tools ask for broad file system access, network connectivity, and permission to execute code. When one of them is compromised or misconfigured, attackers get everything.</p><p>We&#8217;ve spent years hardening our software supply chains against malicious packages and compromised libraries. AI coding assistants represent the same class of risk but with broader access and less scrutiny. If you don&#8217;t have visibility into what AI tools your developers are running, you don&#8217;t understand your attack surface. Fix that before you become this year&#8217;s case study.</p><h3>3. Sandworm Drops DynoWiper on Poland&#8217;s Power Grid, Marking Decade Anniversary of Ukraine Blackout</h3><p>Russia&#8217;s GRU-linked Sandworm group deployed a new wiper malware called DynoWiper against Poland&#8217;s power grid on December 29-30, 2025. ESET publicly attributed the attack on January 24, 2026. The attack targeted two heat-and-power plants plus renewable energy management systems. Polish authorities thwarted the attack before it caused disruption, but stated it could have affected 500,000 people (ESET). The timing coincided with the 10-year anniversary of Sandworm&#8217;s 2015 BlackEnergy attack, which caused Ukraine&#8217;s first malware-induced blackout. Dragos reported that some equipment was damaged beyond repair, despite the defense's operational success (The Register).</p><p><strong>Why it matters</strong></p><ul><li><p>This is the first destructive cyberattack against a NATO member&#8217;s critical infrastructure attributed to a nation-state</p></li><li><p>Sandworm&#8217;s operational tempo and capability continue escalating despite years of sanctions and indictments</p></li><li><p>Even unsuccessful attacks can cause physical equipment damage</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Critical infrastructure operators should review network segmentation between IT and OT environments</p></li><li><p>Implement anomaly detection specifically for wiper malware behaviors, including mass file deletion and MBR overwrites</p></li><li><p>Coordinate with national CERTs and ISACs for threat intelligence on Sandworm TTPs</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Poland won this fight. That&#8217;s worth acknowledging. Their &#8364;1 billion cybersecurity budget and daily experience fending off 20-50 attacks created the defensive muscle memory needed to stop Sandworm&#8217;s wiper before it caused blackouts. Most countries facing this threat don&#8217;t have that operational maturity.</p><p>The anniversary timing isn&#8217;t a coincidence. It&#8217;s psychological warfare. 10 years after proving they could cut off Ukraine&#8217;s power, Sandworm wanted to demonstrate they could reach NATO infrastructure. They failed operationally but succeeded at the message: we&#8217;re here, we&#8217;re capable, and your critical infrastructure isn&#8217;t safe. If you run OT environments, assume you&#8217;re already a target and plan your defenses accordingly.</p><h3>4. DeepSeek Leaves Over a Million User Conversations in Publicly Accessible Database</h3><p>Wiz Research discovered on January 29 that the Chinese AI company DeepSeek had left a ClickHouse database publicly accessible and unauthenticated. The exposed data included over one million records containing chat histories, API keys, backend logs, and system metadata (Wiz). The database was secured after Wiz&#8217;s disclosure, but the exposure duration remains unknown. This discovery came amid ongoing regulatory scrutiny of DeepSeek&#8217;s data practices, including an Italian investigation into GDPR compliance and government bans in Australia, Taiwan, and South Korea (The Hacker News). Cisco security testing found DeepSeek&#8217;s R1 model failed to block any jailbreak attempts, suggesting broader security architecture issues.</p><p><strong>Why it matters</strong></p><ul><li><p>Chat histories with an AI assistant can contain sensitive personal, business, and technical information</p></li><li><p>API key exposure enables unauthorized access to paid services and potential impersonation</p></li><li><p>The combination of weak application security and aggressive data collection creates compounding risks</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit your organization for unauthorized DeepSeek usage, particularly among technical staff</p></li><li><p>If you&#8217;ve used DeepSeek, assume any data shared with it may have been exposed, and rotate relevant credentials</p></li><li><p>Update acceptable use policies to address emerging AI services and their data handling practices</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>DeepSeek burst onto the scene as the cheap, capable Chinese alternative to OpenAI. Security researchers have been poking at it for weeks, and every examination reveals new problems. Cisco found it fails 100% of jailbreak tests. KELA demonstrated it can be manipulated to produce dangerous outputs. Now, Wiz shows they left the database unlocked.</p><p>This isn&#8217;t a case of sophisticated attackers finding obscure flaws. This is a basic operational security failure. If DeepSeek can&#8217;t keep its own database locked, why would you trust it with your organization&#8217;s conversations? I get the appeal of cost-effective AI tools. But cheap becomes expensive fast when your data ends up in an open database. For guidance on evaluating AI tool risks, check out <a href="https://rockcybermusings.com/">Rock&#8217;s Cyber Musings</a>.</p><h3>5. Google Dismantles IPIDEA Proxy Network Used by 550+ Threat Groups</h3><p>Google&#8217;s Threat Intelligence Group announced on January 29 the disruption of IPIDEA, one of the world&#8217;s largest residential proxy networks. In a single week in January 2026, Google observed more than 550 threat groups from China, DPRK, Iran, and Russia using IPIDEA exit nodes (Google Cloud Blog). The operation involved legal action against command-and-control domains, coordination with Cloudflare for DNS disruption, and Google Play Protect blocking infected apps. Google identified 600+ trojanized Android apps and 3,000+ Windows binaries distributing IPIDEA&#8217;s proxy software. The disruption reduced IPIDEA&#8217;s available device pool by millions (The Register).</p><p><strong>Why it matters</strong></p><ul><li><p>Residential proxies allow attackers to hide malicious traffic behind legitimate consumer IP addresses, evading geographic and reputation-based blocking</p></li><li><p>The scale of 550+ threat groups using a single infrastructure reveals how criminals share operational resources</p></li><li><p>Users whose devices were enrolled became unwitting participants in attacks and exposed their own networks to compromise</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Block known IPIDEA IP ranges at your network perimeter where feasible</p></li><li><p>Implement network traffic analysis capable of detecting residential proxy behavior patterns</p></li><li><p>Educate users about the risks of apps promising to &#8220;monetize&#8221; their bandwidth</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>550 threat groups. One proxy network. One week. That&#8217;s the scale of the shared infrastructure criminals operate today.</p><p>Google deserves credit for coordinating this disruption. Taking down distributed infrastructure requires legal action, technical coordination across multiple companies, and sustained effort. But IPIDEA isn&#8217;t unique. It operated 19 residential proxy brands under centralized control. When this one gets degraded, criminals will migrate to competitors.</p><p>The deeper problem is the business model: pay app developers to embed proxy SDKs, enroll millions of devices without clear user consent, then sell access to anyone willing to pay. Until we address that economic model, these networks will keep regenerating. Your defense can&#8217;t depend on Google&#8217;s enforcement efforts. Assume attackers will always have residential IP access and design your detection accordingly.</p><h3>6. North Korean Konni Group Deploys AI-Generated Malware Against Blockchain Developers</h3><p>Check Point Research reported on January 23 that the North Korean Konni group (also known as Opal Sleet/TA406) is using AI-generated PowerShell malware to target blockchain developers in Japan, Australia, and India. The malware exhibits clear signs of LLM-assisted development, including structured documentation, modular code layout, and placeholder comments like &#8220;# &#8592; your permanent project UUID&#8221; that are characteristic of AI-generated code (BleepingComputer). Attacks begin with Discord-hosted phishing links delivering ZIP archives containing malicious LNK shortcuts. This campaign marks a shift from Konni&#8217;s traditional focus on South Korean diplomatic targets toward APAC blockchain and cryptocurrency developers (Check Point).</p><p><strong>Why it matters</strong></p><ul><li><p>This is among the first documented cases of APT groups using AI to accelerate malware development</p></li><li><p>AI-assisted malware can iterate faster and customize more easily, challenging signature-based detection</p></li><li><p>The targeting shift toward blockchain developers indicates North Korea&#8217;s continued focus on cryptocurrency theft</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Brief blockchain and cryptocurrency development teams on targeted spear-phishing tactics</p></li><li><p>Block Discord CDN URLs at the perimeter or implement additional inspection for files downloaded from Discord</p></li><li><p>Update endpoint detection rules to identify PowerShell behaviors associated with AI-generated code patterns</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>We&#8217;ve been warning about AI-assisted malware as a future threat. The future arrived this week.</p><p>What makes this significant isn&#8217;t that the malware is dramatically more sophisticated. It&#8217;s that the development process accelerated. Clean documentation. Modular structure. Placeholder comments explaining customization. The Konni operators didn&#8217;t become better programmers. They got faster.</p><p>For defenders, the implication is uncomfortable. Threat actors who previously took weeks to develop custom tooling can now iterate in days or hours. The asymmetry that favored attackers just got worse. Your threat models need updating. Your detection capabilities need to assume a higher adversary velocity. And your developers working on anything cryptocurrency-related need to understand they&#8217;re targets, not just builders.</p><h3>7. CISA Adds VMware vCenter Vulnerability to KEV Catalog After Active Exploitation Confirmed</h3><p>CISA added CVE-2024-37079, a critical heap overflow vulnerability in VMware vCenter Server, to its Known Exploited Vulnerabilities catalog on January 23, 2026. The vulnerability carries a CVSS score of 9.8 and allows unauthenticated remote code execution via specially crafted network packets (CISA). Broadcom updated its security advisory on January 23 to confirm active exploitation in the wild, seven months after patches were first released in June 2024. Federal civilian agencies must remediate by February 13, 2026 (BleepingComputer). The vulnerability affects the DCE/RPC protocol implementation in vCenter Server.</p><p><strong>Why it matters</strong></p><ul><li><p>vCenter Server is the management plane for VMware environments, making it a high-value target</p></li><li><p>The seven-month gap between patch availability and confirmed exploitation highlights the cost of delayed patching</p></li><li><p>Unauthenticated RCE vulnerabilities in the management infrastructure enable complete environment compromise</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Verify all vCenter Server instances are patched to versions released after June 2024</p></li><li><p>Restrict network access to vCenter management interfaces to authorized administrator networks only</p></li><li><p>Review vCenter audit logs for evidence of exploitation attempts since June 2024</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>This vulnerability was patched seven months ago. Let that be your reminder that attackers don&#8217;t follow your quarterly patch cycle.</p><p>The pattern is familiar: critical vulnerability disclosed, patch released, organizations deprioritize because &#8220;we haven&#8217;t seen exploitation yet,&#8221; then CISA adds it to the KEV catalog confirming it&#8217;s being actively exploited. The time to patch was June 2024. The second-best time is right now.</p><p>vCenter compromise gives attackers access to your entire virtualized infrastructure. They can access VMs, exfiltrate data, and move laterally without touching the network. If you&#8217;ve been treating vCenter patching as lower priority because it&#8217;s &#8220;just management infrastructure,&#8221; you&#8217;ve had the risk assessment backwards.</p><h3>8. Fake AI Coding Assistant Drops Remote Access Malware on VS Code Users</h3><p>Aikido Security detected on January 27 a malicious Visual Studio Code extension named &#8220;ClawdBot Agent - AI Coding Assistant&#8221; mimicking the legitimate Moltbot coding tool. The extension dropped ConnectWise ScreenConnect for remote access on victims&#8217; machines (The Hacker News). Microsoft removed the extension from the VS Code Marketplace after notification. The attack capitalized on Moltbot&#8217;s popularity, which has over 85,000 GitHub stars. This incident follows broader concerns about Moltbot&#8217;s security posture, including hundreds of exposed instances with unauthenticated admin ports and API keys stored in plaintext (BleepingComputer).</p><p><strong>Why it matters</strong></p><ul><li><p>Developer tools represent high-value targets with access to source code, secrets, and infrastructure credentials</p></li><li><p>The VS Code Marketplace&#8217;s extension approval process allowed a malicious extension to reach users</p></li><li><p>Supply chain attacks targeting developers can compromise entire software ecosystems</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Audit VS Code extensions installed across your development environment for suspicious entries</p></li><li><p>Implement allowlisting for approved extensions rather than relying on marketplace vetting alone</p></li><li><p>Scan for ConnectWise ScreenConnect installations that weren&#8217;t deployed by IT</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Developers downloading coding assistants from official marketplaces expect some baseline vetting. That expectation is increasingly misplaced. The VS Code Marketplace, npm, PyPI, and every other package repository has become an attack surface.</p><p>This particular attack got caught quickly because Aikido was watching. How many similar attacks have succeeded without detection? The extension was named to exploit confusion with a popular legitimate tool. The payload was commodity remote access software. This isn&#8217;t sophisticated. It&#8217;s opportunistic, scalable, and apparently effective enough to keep happening.</p><p>If your developers install their own tools without security review, you have supply chain exposure you&#8217;re not measuring. The answer isn&#8217;t locking everything down and killing productivity. It&#8217;s building detection capabilities that catch malicious tooling when it inevitably slips through.</p><h3>9. WEF Report: 94% of Leaders Expect AI to Dominate Cybersecurity in 2026</h3><p>The World Economic Forum released its Global Cybersecurity Outlook 2026 report in January, surveying executives and security leaders worldwide. Key findings: 94% expect AI to be the most consequential force shaping cybersecurity this year. 87% reported experiencing rising AI-related vulnerabilities in 2025. Cyber-enabled fraud has overtaken ransomware as CEOs&#8217; top concern. 31% of respondents expressed low confidence in their nation&#8217;s ability to respond to critical infrastructure attacks, up from 26% the previous year (World Economic Forum). The report noted that 64% of organizations now factor geopolitically motivated attacks into their risk strategies.</p><p><strong>Why it matters</strong></p><ul><li><p>The perception gap between AI as an opportunity and AI as a threat continues widening</p></li><li><p>CEO concern shifting from ransomware to fraud reflects changing attacker economics</p></li><li><p>Declining confidence in national cyber preparedness signals growing infrastructure vulnerability</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Update board reporting to address AI-specific security risks alongside traditional threat categories</p></li><li><p>Review fraud detection capabilities given the shifting attacker focus toward BEC and social engineering</p></li><li><p>Incorporate geopolitical threat intelligence into risk assessments for organizations with international exposure</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Survey says: everyone thinks AI will change everything. This tracks with what I hear from every CISO I talk to. The challenge is translating that awareness into operational reality.</p><p>What caught my attention is the shift from fraud to ransomware. Ransomware gets the headlines and the policy attention. But fraud, phishing, and BEC attacks now worry CEOs more. Why? Because they&#8217;re working. Ransomware payments are declining as organizations improve backups and refuse to pay. Fraud attacks are getting more convincing thanks to deepfakes and AI-generated content.</p><p>The 31% expressing low confidence in national cyber preparedness should concern policymakers. If a third of security leaders don&#8217;t trust their country&#8217;s ability to respond to critical infrastructure attacks, that&#8217;s a vote of no confidence in government cyber capabilities. Poland just demonstrated that good national defense is possible. Most countries haven&#8217;t made those investments.</p><h3>10. EU AI Act High-Risk Deadlines Loom as Code of Practice Deadline Passes</h3><p>The EU AI Office&#8217;s first draft Code of Practice on AI-generated content transparency closed for feedback on January 23, 2026. High-risk AI system provisions take effect August 2, 2026, giving organizations roughly six months to achieve compliance. The EU AI Office gains full enforcement authority on that date, with fine authority reaching 3% of global turnover (EU AI Act). The European Commission&#8217;s Digital Omnibus proposal aims to streamline compliance requirements across overlapping regulations, including the AI Act, GDPR, and Digital Services Act. Each member state must establish at least one regulatory sandbox by August 2, 2026 (K&amp;L Gates).</p><p><strong>Why it matters</strong></p><ul><li><p>Six months to compliance is insufficient for organizations that haven&#8217;t started preparation</p></li><li><p>3% global turnover fines rival GDPR enforcement exposure for major violations</p></li><li><p>Regulatory sandbox requirements signal an EU intent to enable compliant AI development, not just prohibit violations</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Inventory AI systems and classify against EU AI Act risk categories immediately if you haven&#8217;t already</p></li><li><p>Engage legal counsel with EU AI Act expertise to assess compliance gaps for high-risk systems</p></li><li><p>Monitor member state sandbox programs for compliance pathways relevant to your AI use cases</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>Six months until the high-risk provisions take effect. If your organization uses AI in healthcare, education, employment, or critical infrastructure, and you&#8217;re reading about EU AI Act compliance for the first time, you&#8217;re behind.</p><p>I&#8217;ve talked to organizations that assume they can treat AI compliance like early GDPR: wait for enforcement actions, learn from others&#8217; mistakes, then respond. That worked poorly for GDPR. It will work less well here because the AI Act&#8217;s technical requirements require architectural changes, not just policy updates.</p><p>The silver lining is regulatory sandboxes. The EU is signaling that it wants to enable compliant AI development, not just punish violations. If you&#8217;re building AI systems that might face scrutiny, engaging with sandbox programs early gives you a compliance pathway and a regulator relationship that pure avoidance strategies lack.</p><h3>The One Thing You Won&#8217;t Hear About But You Need To</h3><p><strong>WhatsApp Launches Lockdown Mode for High-Risk Users</strong></p><p>Meta announced WhatsApp&#8217;s new &#8220;Strict Account Settings&#8221; feature on January 27, providing a one-click security mode for journalists, activists, and other high-risk users. The feature blocks media from unknown senders, disables link previews, silences calls from unknown contacts, and enables two-step verification by default (BleepingComputer). The announcement follows similar offerings from Apple (Lockdown Mode, 2022) and Google (Android Advanced Protection Mode, 2025). Citizen Lab researcher John Scott-Railton called the announcement &#8220;a very welcome development&#8221; (TechRepublic). The rollout comes days after Meta faced a lawsuit alleging false privacy claims, which WhatsApp head Will Cathcart called &#8220;a no-merit, headline-seeking lawsuit.&#8221;</p><p><strong>Why it matters</strong></p><ul><li><p>WhatsApp&#8217;s 2+ billion users include many potential targets of state-sponsored surveillance</p></li><li><p>Zero-click exploits often leverage media processing and link previews as attack vectors</p></li><li><p>Industry standardization of &#8220;lockdown modes&#8221; creates consistent expectations for high-risk user protection</p></li></ul><p><strong>What to do about it</strong></p><ul><li><p>Identify employees in high-risk roles and recommend enabling Strict Account Settings</p></li><li><p>Update security awareness training to cover lockdown features across platforms</p></li><li><p>Review organizational policies for communication tools used by executives and other potential targets</p></li></ul><p><strong>Rock&#8217;s Musings</strong></p><p>WhatsApp, Apple, and Google now all offer lockdown modes for high-risk users. The security industry has normalized the idea that some people face threats requiring extreme countermeasures. That&#8217;s progress.</p><p>The features themselves are common sense: block unknown attachments, disable link previews, mute unknown callers. These are attack vectors that spyware vendors have exploited for years. What took so long? Better late than never, I suppose.</p><p>If you have employees who might be targeted by nation-states or sophisticated criminals, talk to them about these features. Executives, journalists, researchers, and anyone handling sensitive information. The threat isn&#8217;t theoretical. Pegasus infections happen. These tools aren&#8217;t perfect protection, but they meaningfully shrink the attack surface.</p><p>If you found this analysis useful, subscribe at <a href="https://rockcybermusings.com/">rockcybermusings.com</a> for weekly intelligence on AI security developments.</p><p>&#128073; Visit <strong><a href="https://www.rockcyber.com/">RockCyber.com</a></strong> to learn more about how we can help you in your traditional Cybersecurity and AI Security and Governance Journey</p><p>&#128073; Want to save a quick $100K? Check out our AI Governance Tools at <strong><a href="https://aigovernancetoolkit.com/">AIGovernanceToolkit.com</a></strong></p><p>&#128073; Subscribe for more AI and cyber insights with the occasional rant.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading RockCyber Musings! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share RockCyber Musings&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.rockcybermusings.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share RockCyber Musings</span></a></p><h2>References</h2><p>Aikido Security. (2026, January 27). <em>Malicious VS Code extension analysis</em>. https://www.aikido.dev</p><p>BleepingComputer. (2026, January 23). <em>Konni hackers target blockchain engineers with AI-built malware</em>. https://www.bleepingcomputer.com/news/security/konni-hackers-target-blockchain-engineers-with-ai-built-malware/</p><p>BleepingComputer. (2026, January 24). <em>CISA adds actively exploited VMware vCenter flaw to KEV catalog</em>. https://www.bleepingcomputer.com/news/security/cisa-warns-of-actively-exploited-vmware-vcenter-flaw/</p><p>BleepingComputer. (2026, January 27). <em>Fortinet warns of new zero-day actively exploited in attacks</em>. https://www.bleepingcomputer.com/news/security/fortinet-warns-of-new-zero-day-actively-exploited-in-attacks/</p><p>BleepingComputer. (2026, January 28). <em>New WhatsApp lockdown feature protects high-risk users from hackers</em>. https://www.bleepingcomputer.com/news/security/whatsapp-gets-new-lockdown-feature-that-blocks-cyberattacks/</p><p>BleepingComputer. (2026, January 29). <em>Google disrupts IPIDEA proxy network abused by criminals</em>. https://www.bleepingcomputer.com/news/security/google-disrupts-ipidea-proxy-network-abused-by-criminals/</p><p>Broadcom. (2026, January 23). <em>VMSA-2024-0012.1: VMware vCenter Server security advisory update</em>. https://support.broadcom.com/web/ecx/support-content-notification/-/external/content/SecurityAdvisories/0/24453</p><p>Check Point Research. (2026, January 23). <em>AI-powered KONNI malware targets developers</em>. https://blog.checkpoint.com/research/ai-powered-north-korean-konni-malware-targets-developers</p><p>CISA. (2026, January 23). <em>CISA adds one known exploited vulnerability to catalog</em>. https://www.cisa.gov/news-events/alerts/2026/01/23/cisa-adds-one-known-exploited-vulnerability-catalog</p><p>CISA. (2026, January 27). <em>CISA adds Fortinet vulnerability to KEV catalog</em>. https://www.cisa.gov/known-exploited-vulnerabilities-catalog</p><p>CyberScoop. (2026, January 28). <em>Coalition Insurance issues 14th Fortinet zero-day advisory</em>. https://cyberscoop.com/fortinet-zero-day-coalition-insurance-advisory/</p><p>ESET. (2026, January 24). <em>ESET Research: Sandworm behind cyberattack on Poland&#8217;s power grid in late 2025</em>. https://www.welivesecurity.com/en/eset-research/eset-research-sandworm-cyberattack-poland-power-grid-late-2025/</p><p>EU AI Act. (2026). <em>High-risk AI system requirements and enforcement timeline</em>. https://artificialintelligenceact.eu/</p><p>Fortinet. (2026, January 27). <em>FortiCloud SSO authentication bypass vulnerability advisory</em>. https://www.fortinet.com/psirt</p><p>Google Cloud Blog. (2026, January 29). <em>Disrupting IPIDEA: Taking action against residential proxy abuse</em>. https://cloud.google.com/blog/topics/threat-intelligence/disrupting-ipidea-residential-proxy</p><p>Help Net Security. (2026, January 26). <em>ESET attributes DynoWiper-powered attack on Poland&#8217;s power grid to Russia-aligned Sandworm group</em>. https://www.helpnetsecurity.com/2026/01/26/poland-energy-malware-attack/</p><p>K&amp;L Gates. (2026, January). <em>EU AI Act compliance update: High-risk deadlines and regulatory sandboxes</em>. https://www.klgates.com/</p><p>TechRepublic. (2026, January 28). <em>WhatsApp adds one-tap security settings for added privacy</em>. https://www.techrepublic.com/article/news-whatsapp-strict-account-settings-lockdown-security-mode/</p><p>The Hacker News. (2026, January 26). <em>Konni hackers deploy AI-generated PowerShell backdoor</em>. https://thehackernews.com/2026/01/konni-hackers-deploy-ai-generated.html</p><p>The Hacker News. (2026, January 28). <em>Fake Moltbot VS Code extension drops remote access malware</em>. https://thehackernews.com/2026/01/fake-moltbot-vscode-extension.html</p><p>The Hacker News. (2026, January 30). <em>Italy blocks DeepSeek over data privacy concerns</em>. https://thehackernews.com/2026/01/italy-blocks-deepseek-privacy.html</p><p>The Register. (2026, January 26). <em>ESET: Russia likely behind Poland power grid attack</em>. https://www.theregister.com/2026/01/26/moscow_likely_behind_wiper_attack/</p><p>The Register. (2026, January 27). <em>Fake AI coding assistant malware hits VS Code marketplace</em>. https://www.theregister.com/2026/01/27/moltbot_vscode_malware/</p><p>The Register. (2026, January 29). <em>Google cripples IPIDEA proxy network abused by crims</em>. https://www.theregister.com/2026/01/29/google_ipidea_crime_network/</p><p>Wiz Research. (2026, January 29). <em>DeepSeek database exposure analysis</em>. https://www.wiz.io/blog/deepseek-database-exposed</p><p>World Economic Forum. (2026, January 12). <em>Global Cybersecurity Outlook 2026</em>. https://reports.weforum.org/docs/WEF_Global_Cybersecurity_Outlook_2026.pdf</p><p>Zetter, K. (2026, January 28). <em>Cyberattack targeting Poland&#8217;s energy grid used a wiper</em>. Zero Day. https://www.zetter-zeroday.com/cyberattack-targeting-polands-energy-grid-used-a-wiper/</p>]]></content:encoded></item></channel></rss>