Do you manage security risks when adopting AI solutions?

Loading last updated info...

AI tools are powerful, but they introduce attack surfaces that didn't exist before. It has never been easier to write malicious code, trick systems into leaking data, or poison software supply chains - and AI is at the centre of all of it. The threat landscape is evolving fast, with real breaches happening now.

You should understand the specific risks and have concrete defences in place.

Prompt Injection - The #1 LLM Risk

Prompt injection is the most critical risk in AI applications (OWASP LLM01:2025). It works by embedding malicious instructions into content that an LLM processes.

There are two forms:

Direct - A user crafts input that overrides the system prompt (e.g. "ignore all previous instructions and...")
Indirect - Malicious instructions are hidden in emails, documents, web pages, or code comments that the AI processes without the user ever seeing them

This is not theoretical. In 2025, a zero-click vulnerability in Microsoft 365 Copilot (CVE-2025-32711, known as "EchoLeak") allowed an attacker to send a single crafted email that, when Copilot summarized it, silently exfiltrated OneDrive files, SharePoint content, and Teams messages to an attacker-controlled server. No user interaction required.

Similarly, a GitHub Copilot RCE vulnerability (CVE-2025-53773) used invisible Unicode characters in public repo code comments to tell Copilot to enable auto-approve mode and then execute arbitrary shell commands.

Mitigations:

Treat all external data (emails, documents, web pages) entering an LLM context as untrusted
Implement privilege separation between AI processing and action execution
Never enable auto-approval/"YOLO" modes in AI coding tools on untrusted codebases
Deploy prompt injection detection (but be aware that classifiers can be bypassed)

AI-Powered Supply Chain Attacks

AI code-generation tools hallucinate package names - and attackers exploit this. The attack is called slopsquatting: an LLM suggests a plausible-sounding but nonexistent package, an attacker registers that name on npm/PyPI with a malicious payload, and the next developer who follows the AI's suggestion installs malware.

Research across 16 code-generation models and 756,000 code samples found that almost 20% recommended nonexistent packages. When a hallucination occurred, 43% of the time the same fake package name was suggested consistently, making it a reliable attack vector. One hallucinated package, "huggingface-cli", was uploaded as a placeholder and received over 30,000 downloads in three months.

Malicious package uploads to open-source repositories jumped 156% year-over-year in 2025. In one incident, the legitimate @solana/web3.js library was compromised - versions 1.95.6 and 1.95.7 contained code that stole users' private keys and drained cryptocurrency wallets.

Mitigations:

Verify every package name before installing - especially when the suggestion comes from an AI
Wait 7-14 days after a new package release before adopting it
Use lockfiles, pin dependency versions, and maintain SBOMs
Use tools like Socket.dev or Snyk to detect suspicious packages

MCP and Tool-Use Risks - When LLMs Are the Glue

The Model Context Protocol (MCP) and similar tool-use frameworks let LLMs connect to external APIs, databases, and services. This is powerful, but it means the LLM becomes an intermediary that trusts tool descriptions and executes actions with real credentials. If any part of that chain is compromised, the blast radius is enormous.

Tool poisoning is a real attack: malicious instructions are embedded in MCP tool descriptions (metadata the LLM reads but users typically don't see). Invariant Labs demonstrated that a malicious MCP server posing as a trivia game could silently exfiltrate a user's entire WhatsApp message history by hijacking a legitimate WhatsApp MCP server running in the same agent context.

Cursor IDE had two critical vulnerabilities in 2025:

CurXecute (CVE-2025-54135) - A crafted Slack message, when summarized by Cursor's AI, rewrote MCP configs and executed arbitrary commands
MCPoison (CVE-2025-54136) - An attacker commits a benign MCP config to a shared repo, teammates approve it, then the attacker silently modifies it to execute backdoor commands

Over 1,800 MCP servers were found publicly accessible without authentication. An unofficial Postmark MCP server with 1,500 weekly downloads was modified to silently BCC all emails to the attacker's address.

Mitigations:

Audit MCP server tool descriptions for hidden instructions (use tools like mcp-scan)
Require re-approval when MCP configurations change (not just on first use)
Run MCP servers in sandboxed environments with minimal permissions
Never grant MCP servers blanket access to credentials or sensitive data
Monitor MCP tool invocations for unexpected parameter values

AI-Generated Malware

AI has dramatically lowered the barrier to writing sophisticated malware. In January 2026, Check Point Research identified VoidLink - the first documented advanced malware framework authored almost entirely by AI. A single threat actor used an AI coding assistant to produce 88,000 lines of code in under a week, including custom loaders, rootkits, and modular plugins targeting cloud environments (AWS, GCP, Azure).

AI also enables polymorphic malware that rewrites itself at runtime. The proof-of-concept "BlackMamba" reaches out to an LLM API during execution to synthesize unique keylogging code on the fly - every run produces a different payload, defeating signature-based detection. 76% of detected malware now exhibits AI-driven polymorphism.

This means traditional antivirus is increasingly ineffective. You need behavioral analysis, not just signature matching.

Mitigations:

Move from signature-based detection to behavioural analysis and anomaly detection (EDR)
Monitor for executables making runtime API calls to LLM services
Implement application allowlisting
Use AI-powered security tools that can detect AI-generated threats

Shadow AI and Data Leakage

Shadow AI - employees using unauthorized AI tools - is now the #1 data exfiltration channel, responsible for 32% of all unauthorized corporate data movement. 71% of office workers admit to using AI tools without IT approval, and 77% of employees paste data into AI tools, with 82% of that activity coming from personal (unmanaged) accounts.

The consequences are real. Samsung had to ban all employee use of generative AI after employees pasted proprietary semiconductor source code into ChatGPT on three separate occasions. AI-associated data breaches cost an average of $4.80 million per incident.

Mitigations:

Provide sanctioned, enterprise-managed AI tools (e.g. Azure OpenAI, enterprise GitHub Copilot) so employees don't resort to personal accounts
Deploy DLP tools that monitor AI tool interactions (copy/paste, file uploads)
Establish and enforce clear AI acceptable use policies
Block unauthorized AI service domains at the network level
Ensure enterprise AI tools are opted out of model training

Agentic AI and OpenClaw - Autonomous Agents Are High-Risk

AI agents that autonomously use tools, access file systems, call APIs, and send messages represent the highest-risk category. When an agent gets compromised via prompt injection, every action it takes passes security checks because it's using legitimate credentials.

OpenClaw is a cautionary tale. This open-source AI assistant gained over 145,000 GitHub stars in two weeks, with 100,000+ users granting it autonomous access to their operating systems, messaging platforms, and corporate services. A critical vulnerability (CVE-2026-25253) enabled one-click remote code execution through token exfiltration. Thousands of exposed control panels were found leaking API keys and private messages.

OWASP released a dedicated Top 10 for Agentic Applications in late 2025, highlighting risks like agent goal hijacking, tool misuse, privilege abuse, and rogue agents.

Mitigations:

Apply principle of least privilege - scope agent permissions narrowly
Require human-in-the-loop approval for destructive or sensitive operations
Use time-limited, scoped tokens rather than long-lived credentials
Monitor agent activity for anomalous behaviour
Never give a single agent broad cross-system access

Summary

The OWASP Top 10 for LLM Applications (2025) provides a comprehensive framework for understanding LLM-specific risks:

Prompt Injection - Manipulating LLM behaviour via crafted inputs (covered above)
Sensitive Information Disclosure - LLMs revealing confidential data in responses, e.g. training data extraction or system prompt leakage through clever questioning
Supply Chain Vulnerabilities - Compromised third-party components, poisoned training data, and malicious plugins/extensions (covered above)
Data and Model Poisoning - Tampering with training data or fine-tuning to introduce backdoors, biases, or targeted misbehaviour into models
Improper Output Handling - Failing to validate or sanitize LLM outputs before passing them to downstream systems (e.g. an LLM generating SQL that gets executed directly, or HTML that gets rendered without escaping)
Excessive Agency - Granting LLMs too many capabilities, permissions, or autonomy without adequate guardrails (covered above under Agentic AI)
System Prompt Leakage - Exposing internal system prompts that reveal architecture details, security controls, filtering logic, or access to other tools
Vector and Embedding Weaknesses - Vulnerabilities in RAG pipelines and embedding databases, such as poisoning vector stores with adversarial documents that get retrieved and influence responses
Misinformation - LLMs generating false or misleading information presented as fact, including hallucinated citations, fabricated statistics, and confident but wrong technical advice
Unbounded Consumption - Resource exhaustion through excessive token/compute usage, either via denial-of-service attacks or poorly constrained agent loops that run up costs

AI adoption is essential, but it requires treating AI tools with the same rigour as any other attack surface - arguably more, because these tools are actively processing your data, writing your code, and increasingly acting autonomously on your behalf. Follow general security best practices and layer on the AI-specific controls above.

Do you manage security risks when adopting AI solutions?

Prompt Injection - The #1 LLM Risk

AI-Powered Supply Chain Attacks

MCP and Tool-Use Risks - When LLMs Are the Glue

AI-Generated Malware

Shadow AI and Data Leakage

Agentic AI and OpenClaw - Autonomous Agents Are High-Risk

Summary

Categories

Authors

Need help?

Do you manage security risks when adopting AI solutions?

Prompt Injection - The #1 LLM Risk

AI-Powered Supply Chain Attacks

MCP and Tool-Use Risks - When LLMs Are the Glue

AI-Generated Malware

Shadow AI and Data Leakage

Agentic AI and OpenClaw - Autonomous Agents Are High-Risk

Summary

Categories

Authors

Need help?