Friday 8 May 2026

When Your AI Coding Agent Becomes the Attacker: The TrustFall Vulnerability

AI coding agents are becoming a supply chain attack vector — and the tools you trust most may be the easiest to compromise.

Lead story

When Your AI Coding Agent Becomes the Attacker: The TrustFall Vulnerability

Researchers have disclosed a new class of attack called "TrustFall" that turns AI coding agents — Claude Code, Cursor CLI, Gemini CLI, GitHub Copilot CLI — into unwitting participants in supply chain compromises. The technique exploits a simple but uncomfortable truth: these tools are designed to be helpful, and helpfulness can be weaponised.

Here's how it works. An attacker embeds malicious instructions inside a repository — in a README, a config file, a comment, anywhere the agent will read. When a developer opens that project and the AI agent parses the files, the hidden instructions trigger code execution or redirect the agent's actions without the user ever knowing. The warning dialogs these tools display are, according to researchers at Adversa AI, too vague and too easily dismissed. Anthropic's official response — essentially "users shouldn't click OK without reading" — is technically true and practically useless.

The deeper problem here is architectural. AI coding agents are built to ingest context from their environment, which is exactly what makes them valuable. They read your codebase, understand your dependencies, follow your project conventions. But that same appetite for context is what makes them susceptible to prompt injection at scale. A malicious open-source package, a compromised repo, a poisoned template — any of these can become the instruction set for an agent that has broad file system access, shell execution rights, and often OAuth tokens connected to your SaaS stack.

That last point is where it gets serious. Separately, researchers at Mitiga found that Claude Code's OAuth tokens can be silently hijacked through malicious Model Context Protocol (MCP) server configurations. An attacker who redirects MCP traffic doesn't just get code execution — they can maintain persistent access to every SaaS platform the developer's Claude session was connected to. GitHub, Jira, Slack, cloud provider consoles. The token theft is passive and leaves almost no trace.

Why this matters now: AI coding agents have gone from curiosity to critical infrastructure in about 18 months. Anthropic just raised Claude Code's usage limits after signing a new deal with SpaceX, and the tool is now deeply embedded in enterprise development workflows alongside competitors from Google, Microsoft, and Cursor. The attack surface these agents represent has scaled with their adoption — but the security model largely hasn't.

For Australian organisations, the exposure is real. Claude Code, Copilot, and Gemini CLI are all widely deployed across Australian tech teams and enterprise development shops. Under the Privacy Act and SOCI Act obligations, an OAuth token compromise that grants access to cloud infrastructure or customer data systems isn't just a developer problem — it's a notifiable incident. Security teams that haven't yet reviewed what permissions their AI coding agents hold, or what repositories they're being pointed at, should treat this as a prompt.

What to watch: Whether the major vendors — Anthropic, Google, Microsoft — respond with substantive changes to their agent permission models, or whether we get another round of "use it responsibly" guidance. The TrustFall researchers argue the fix has to be at the tooling level, not the user behaviour level. They're right.

Also today

Australia's ACSC Warns of ClickFix Campaign Spreading Vidar Stealer

The Australian Cyber Security Centre has issued an active warning about a social engineering campaign using a technique called ClickFix — where users are tricked into running malicious commands by fake browser error pop-ups — to deliver the Vidar information-stealing malware. Vidar is a well-established infostealer capable of harvesting passwords, browser cookies, cryptocurrency wallets, and two-factor authentication data. The ACSC warning signals the campaign has meaningful reach inside Australia, making this one for security teams and IT admins to brief staff on immediately, particularly in organisations where employees access sensitive systems via browser-based tools.

Bleeping Computer ↗

Ivanti EPMM Zero-Day Grants Admin-Level Remote Code Execution

Ivanti is warning customers of active exploitation of a freshly disclosed vulnerability in its Endpoint Manager Mobile product — CVE-2026-6973, rated 7.2 CVSS. The flaw allows a remotely authenticated attacker with admin credentials to achieve full remote code execution, affecting EPMM versions prior to 12.6.1.1, 12.7.0.1, and 12.8.0.1. EPMM is widely deployed in enterprise and government environments to manage mobile device fleets, including in Australian federal and state agencies. Ivanti has had a rough run of high-severity vulnerabilities in its enterprise products over the past two years — organisations still running unpatched versions should treat this as urgent.

Bleeping Computer ↗

PCPJack Worm Exploits Five CVEs to Evict Rival Hackers From Cloud Systems

A newly disclosed credential-theft framework called PCPJack is doing something unusual: it's actively hunting down and removing infections left by a competing threat actor, TeamPCP, while simultaneously harvesting credentials from the same compromised cloud environments. The toolset exploits five known CVEs to spread worm-like across exposed cloud infrastructure, then exfiltrates credentials for cloud platforms, container registries, developer tools, productivity suites, and financial services through attacker-controlled channels. Think of it as a hostile cloud takeover — one criminal crew evicting another and quietly looting the place on the way through. A reminder that unpatched cloud exposure doesn't attract just one attacker.

The Hacker News ↗

Palo Alto Zero-Day Tied to Chinese State Hacking Campaign

New details have emerged on the PAN-OS firewall zero-day we covered Thursday — Palo Alto Networks now says exploitation traces back to at least April 9, and the campaign bears hallmarks consistent with Chinese state-sponsored activity, though the company has stopped short of a formal attribution. The CVE-2026-0300 buffer overflow allows unauthenticated remote code execution via the User-ID Authentication Portal. Given the target profile — government and critical infrastructure networks — and the espionage tradecraft observed, this is shaping up as a significant nation-state intrusion campaign. Australian organisations using Palo Alto appliances should verify patch status immediately.

SecurityWeek ↗

Anthropic's Mythos AI Found 271 Real Bugs in Firefox — Almost No False Positives

Mozilla says it has "completely bought in" on AI-assisted vulnerability discovery after Anthropic's Mythos system surfaced 271 security flaws in Firefox — with a false positive rate that researchers described as nearly zero. That's a remarkable signal-to-noise ratio for automated bug finding, which has historically been plagued by false positives that burn analyst time. Mozilla says the findings span high-severity issues, and the programme has fundamentally changed how the organisation approaches security testing. This is a genuine inflection point for AI-assisted security research, and other browser vendors and open-source projects will be watching closely.

Ars Technica ↗

OpenAI Starts Testing Ads in ChatGPT

OpenAI has confirmed it is testing advertisements inside ChatGPT, marking a significant strategic shift for a product that has so far been funded entirely by subscriptions and API revenue. The company says ads will be clearly labelled, won't influence the answers ChatGPT gives, and will come with privacy protections and user controls. Sceptics will note that "answers won't be influenced" is a promise that's easy to make and hard to verify — especially as ad revenue creates structural pressure over time. For the broader AI industry, this is a meaningful moment: it suggests even the most well-capitalised AI labs are looking at traditional internet monetisation models to sustain free-tier access.

OpenAI Blog ↗

Thousands of Vibe-Coded Apps Are Leaking Sensitive Data in Public

A WIRED investigation found that thousands of web applications built using AI-assisted "vibe coding" platforms — Lovable, Base44, Replit, Netlify and others — are exposing corporate and personal data on the open internet. The tools let anyone spin up a functional web app in minutes without writing code, but the ease of creation doesn't come with guardrails around data handling. Exposed data includes credentials, customer records, and internal business information. This is the supply chain risk of democratised software development: the attack surface expands faster than any security team can monitor it. Australian businesses using these platforms to build internal tools should audit what data those apps touch.

WIRED Security ↗

EU Strikes Deal to Simplify AI Act, Bans Deepfake Nudification Tools

European leaders have agreed on a package to streamline the AI Act, pushing back enforcement of rules covering high-risk AI systems — including those involving biometrics, employment decisions, and critical infrastructure — to December 2027. The deal also introduces a specific ban on AI-powered nudification tools, which generate non-consensual intimate imagery. The simplification responds to industry pressure that the original framework was too burdensome for smaller developers. For Australia, the contrast is notable: the EU is actively legislating AI harms while Australia's AI governance framework remains largely voluntary. The Online Safety Act covers some deepfake content, but a specific nudification tool ban doesn't yet exist.

The Record ↗

Australia Still Lacks Modern Electronic Surveillance Laws — Seven Years After the Last Review

A sharp piece in The Mandarin notes that Australia still has no modern electronic surveillance legislation, despite multiple reviews over the past seven years warning that the existing framework is outdated, fragmented, and increasingly ill-equipped to handle contemporary intelligence and law enforcement needs. The gap is particularly pointed given the government's simultaneous push to expand digital capabilities across agencies. The article lands the week after Australia launched its new Cyber Incident Review Board — a moment of institutional ambition that sits in odd tension with this foundational legislative paralysis. For privacy advocates and civil liberties groups, the delay represents compounding risk.

The Mandarin ↗

Real-Time Deepfake Software Is Powering an Emerging Wave of Scams

404 Media obtained and analysed "Haotian AI", a Chinese-developed real-time deepfake platform actively marketed to scammers. The software can substitute a fraudster's face and voice with anyone else's during live video calls on WhatsApp, Zoom, and Microsoft Teams — in real time, with convincing results. The investigation found it being used in romance scams, business email compromise fraud, and identity verification bypass. This is the commercial productisation of deepfake fraud, and it's moving faster than most verification systems can adapt. Australia's ACSC has previously flagged deepfake-enabled fraud as an emerging threat vector for financial institutions and government services.

404 Media ↗

ShinyHunters Claims Second Instructure Hack, Defaces School Login Pages

The cybercrime group ShinyHunters has claimed a second breach of Instructure, the company behind the Canvas learning management system used across thousands of universities and schools globally. As proof, the group defaced login pages of several Instructure customer institutions with an extortion message. Instructure's Canvas platform is widely deployed in Australian universities and TAFEs — if a significant data breach is confirmed, affected institutions could face notification obligations under the Privacy Act's Notifiable Data Breach scheme. The incident also raises broader questions about how deeply educational institutions have assessed the vendor risk of their core learning infrastructure.

TechCrunch ↗

Sources consulted