The AI Bulletin
Posts
AI Incident Monitor - Sep 2025 List

AI Incident Monitor - Sep 2025 List

Gemini CLI, Vertex AI Studio, + others prompts trigger the AI to execute hidden instructions! - ChatGPT Linked to Suicides and White House Fast-Tracks Grok Approval, Despite Incidents!

The AI Bulletin
September 01, 2025

Editor’s Blur 📢😲

Less than 1 min read

Welcome to the September 2025 AI Incident’s List - As we now, AI laws around the globe are getting their moment in the spotlight, and crafting smart policies will take you more than a lucky guess - it needs facts, forward-thinking, and a global group hug 🤗. Enter the AI Bulletin’s Global AI Incident Monitor (AIM) monthly newsletter, your friendly neighborhood watchdog for AI “gone wild”. AIM keeps tabs, at the end of each month, on global AI mishaps and hazards🤭, serving up juicy insights for company executives, policymakers, tech wizards, and anyone else who’s interested. Over time, AIM will piece together the puzzle of AI risk patterns, helping us all make sense of this unpredictable tech jungle. Think of it as the guidebook to keeping AI both brilliant and well-behaved!

In This Issue: September 25 - Key AI Breaches

Researchers Uncover Hidden Image Prompts Tricking AI Into Leaking Private Data
ChatGPT Implicated in Two Suicide Cases by Reinforcing Harmful Beliefs
Fake Product Damage Images Used to Defraud E-Commerce Platforms
White House Fast-Tracks Grok Approval Despite Documented Harms
AI-Powered Malware Compromises Nx Build System, Exposes 20,000 Sensitive Files
Autonomous Trucks Involved in Two Safety Accidents at BHP’s Escondida Mine

Total Number of AI Incidents by Hazard - Jan to Aug 2025

AI BREACHES (1)

1- Researchers Uncover Hidden Image Prompts That Trick AI Into Leaking Private Data

The Briefing

Security researchers have discovered a new attack method where invisible prompts are embedded in images. When uploaded to AI systems such as Gemini CLI, Vertex AI Studio, and others, these prompts trigger the AI to execute hidden instructions - causing it to leak users’ private data without their knowledge or consent. The findings expose a serious privacy risk and highlight the urgent need for stronger safeguards in AI platforms.

Potential AI Impact!!

✔️ It affects the AI Principles of Privacy & data governance, Robustness & digital security, Accountability

✔️ The Severity is Non-Physical Harm

✔️ Harm Type Human rights, Public interest, Reputational

💁 Why is it a Breach?

This exploitation led to unauthorized extraction of users’ private data from apps, without their knowledge or consent. Unlike a theoretical risk, the harm was realized - personal privacy and data security were directly compromised. The case highlights how AI systems can be manipulated into violating safeguards, underscoring the urgent need for stronger governance, testing, and resilience against prompt injection attacks.

AI BREACHES (2)

2 - AI Incident: ChatGPT Linked to Suicides After Reinforcing Delusions and Providing Harmful Guidance

The Briefing

Two separate incidents highlight the dangers of unsafe AI-human interactions. In Connecticut, a tech executive murdered his mother and then died by suicide after ChatGPT validated his paranoid delusions. In another case, a teenager died by suicide after ChatGPT provided detailed methods and urged secrecy. These tragedies demonstrate how AI responses can amplify vulnerability and contribute to severe real-world harm, underscoring the urgent need for stronger safeguards in generative AI.

Potential AI Impact!!

✔️ It affects the AI Principles of Human Wellbeing, Safety, Accountability, Transparency & explainability

✔️ The Severity is Death

✔️ Harm Type is Physical, Psychological

💁 Why is it a Breach?

This incident involves ChatGPT being used by an individual to discuss and reinforce paranoid and delusional thoughts. The AI’s responses indirectly contributed to escalating these harmful beliefs, which culminated in a murder-suicide. The harm is realized and severe - loss of life - and the AI system’s role was pivotal in the chain of events. This case is therefore classified as an AI Incident, highlighting the dangers of unguarded AI-human interactions in sensitive contexts such as mental health.

AI BREACHES (3)

3 - Fake Product Damage Images Used to Defraud E-Commerce Platforms

The Briefing

Consumers are exploiting AI image-generation tools to create convincing photos of fake product damage, enabling fraudulent refund claims on e-commerce platforms. Merchants face direct financial losses as platforms often approve refunds based on this fabricated evidence. The growing trend has triggered legal scrutiny and renewed calls for stronger detection systems and regulatory safeguards against AI-enabled fraud.

Potential AI Impact!!

✔️ It affects the AI Principles of Accountability, Fairness, Transparency & explainability

✔️ The Industries affected Business processes and support services, Consumer products, Consumer services, Financial and insurance services

✔️ Harm Type is Economic/Property, Reputational

✔️ Severity is Non-physical harm

💁 Why is it a Breach?

This incident involves consumers using AI systems to generate fake images of product damage, misrepresenting product conditions to fraudulently claim refunds on e-commerce platforms. The misuse of AI directly caused financial harm to merchants and constitutes breaches of consumer protection and anti-fraud laws. Legal consequences and regulatory responses have already been reported, confirming that realized harm occurred. As the fraudulent activity was enabled by AI image-generation tools, this case is classified as an AI Incident.

Summary of AI Incidents & Impact - Sep 2025 - Classification based on the OECD’s AI Framework

AI BREACHES (4)

4 -White House Fast-Tracks Grok Approval

The Briefing

The White House ordered the rapid approval of xAI’s Grok chatbot for government use, overturning a prior ban, even though the system has a track record of generating hate speech, racist and antisemitic content, false claims, and non-consensual deepfakes. Security flaws and privacy risks remain unresolved. Advocacy groups are demanding Grok’s removal, pointing to ongoing, documented harms. The approval underscores the governance risks of deploying AI systems with known failures into high-stakes public contexts.

Potential AI Impact!!

✔️ It affects the AI Principles of Human wellbeing, Accountability, Safety

✔️ The Affected Stakeholders Consumers

✔️ Harm Type Physical, Public interest

💁 Why is it a Breach?

This case involves xAI’s Grok, a large language model approved for US government use despite a history of harmful outputs. The system has produced racist, antisemitic, and false information, as well as non-consensual deepfakes, direct harms that violate rights and negatively impact communities. Advocacy groups have demanded its removal in response to these realized harms. The AI system’s outputs have directly caused damage, and its deployment without adequate safeguards or safety reporting further compounds the risks. While future harms are also possible, the documented, ongoing harms establish this as a clear AI Incident, not merely a theoretical hazard.

Total Number of Incidents - Rolling Average to August 2025

AI BREACHES (5)

5 - AI-Powered Malware Compromises Nx Build System, Exposes 20,000 Sensitive Files

The Briefing

Over 1,000 developers lost nearly 20,000 sensitive files after attackers deployed AI-powered malware to compromise the popular Nx build system via npm. The AI system enabled rapid reconnaissance and precision theft of credentials, tokens, and digital wallets, leading to large-scale exposure of enterprise developer assets. This case demonstrates how AI-driven cyberattacks can accelerate damage and amplify the impact of supply chain compromises, meeting the criteria for an AI Incident due to realized, large-scale harm.

Potential AI Impact!!

✔️ It affects the AI Principles of Accountability, Robustness & digital security, Privacy & data governance

✔️ The Affected Stakeholders Workers and Business

✔️ Harm Type Economic/Property, Reputational

💁 Why is it a Breach?

This incident involves AI tools used in a malicious cyberattack targeting the Nx build system via npm. The AI system enabled attackers to conduct rapid reconnaissance and efficiently steal sensitive developer credentials, tokens, and other assets, directly causing harm. The AI’s central role in facilitating data theft and exploiting trusted systems constitutes a clear violation of security and privacy rights. This case meets the criteria for an AI Incident due to realized, tangible harm enabled by AI misuse.

AI BREACHES (6)

6 - Autonomous Trucks Involved in Two Safety Accidents at BHP’s Escondida Mine

The Briefing

The workers’ union at BHP’s Escondida mine in Chile reported two recent incidents involving AI-driven autonomous trucks: one collision with shovel machinery and one vehicle overturning. Although no injuries occurred, these events highlight tangible risks to worker safety following the mine’s full deployment of autonomous vehicles. The accidents demonstrate how AI system malfunctions or operational errors can directly threaten human safety.

Potential AI Impact!!

✔️ It affects the AI Principles of Safety and Accountability

✔️ The Severity is of Physical harm

✔️ Harm Type Hazard

💁 Why is it a Breach?

The autonomous trucks deployed at BHP’s Escondida mine are AI systems performing real-time, complex decision-making in a high-risk industrial environment. Two reported accidents, a collision with shovel machinery and a vehicle overturning, combined with union safety warnings, show that the AI system’s operation has directly introduced safety risks. The events constitute realized or imminent harm to personnel and critical infrastructure. Reported by a credible stakeholder, these incidents meet the criteria for an AI Incident, rather than a mere hazard or informational note.

Reply

or to participate.