AI tools are powerful, but they introduce attack surfaces that didn't exist before. It has never been easier to write malicious code, trick systems into leaking data, or poison software supply chains - and AI is at the centre of all of it. The threat landscape is evolving fast, with real breaches happening now.
You should understand the specific risks and have concrete defences in place.
Prompt injection is the most critical risk in AI applications (OWASP LLM01:2025). It works by embedding malicious instructions into content that an LLM processes.
There are two forms:
This is not theoretical. In 2025, a zero-click vulnerability in Microsoft 365 Copilot (CVE-2025-32711, known as "EchoLeak") allowed an attacker to send a single crafted email that, when Copilot summarized it, silently exfiltrated OneDrive files, SharePoint content, and Teams messages to an attacker-controlled server. No user interaction required.
Similarly, a GitHub Copilot RCE vulnerability (CVE-2025-53773) used invisible Unicode characters in public repo code comments to tell Copilot to enable auto-approve mode and then execute arbitrary shell commands.
Mitigations:
AI code-generation tools hallucinate package names - and attackers exploit this. The attack is called slopsquatting: an LLM suggests a plausible-sounding but nonexistent package, an attacker registers that name on npm/PyPI with a malicious payload, and the next developer who follows the AI's suggestion installs malware.
Research across 16 code-generation models and 756,000 code samples found that almost 20% recommended nonexistent packages. When a hallucination occurred, 43% of the time the same fake package name was suggested consistently, making it a reliable attack vector. One hallucinated package, "huggingface-cli", was uploaded as a placeholder and received over 30,000 downloads in three months.
Malicious package uploads to open-source repositories jumped 156% year-over-year in 2025. In one incident, the legitimate @solana/web3.js library was compromised - versions 1.95.6 and 1.95.7 contained code that stole users' private keys and drained cryptocurrency wallets.
Mitigations:
The Model Context Protocol (MCP) and similar tool-use frameworks let LLMs connect to external APIs, databases, and services. This is powerful, but it means the LLM becomes an intermediary that trusts tool descriptions and executes actions with real credentials. If any part of that chain is compromised, the blast radius is enormous.
Tool poisoning is a real attack: malicious instructions are embedded in MCP tool descriptions (metadata the LLM reads but users typically don't see). Invariant Labs demonstrated that a malicious MCP server posing as a trivia game could silently exfiltrate a user's entire WhatsApp message history by hijacking a legitimate WhatsApp MCP server running in the same agent context.
Cursor IDE had two critical vulnerabilities in 2025:
Over 1,800 MCP servers were found publicly accessible without authentication. An unofficial Postmark MCP server with 1,500 weekly downloads was modified to silently BCC all emails to the attacker's address.
Mitigations:
AI has dramatically lowered the barrier to writing sophisticated malware. In January 2026, Check Point Research identified VoidLink - the first documented advanced malware framework authored almost entirely by AI. A single threat actor used an AI coding assistant to produce 88,000 lines of code in under a week, including custom loaders, rootkits, and modular plugins targeting cloud environments (AWS, GCP, Azure).
AI also enables polymorphic malware that rewrites itself at runtime. The proof-of-concept "BlackMamba" reaches out to an LLM API during execution to synthesize unique keylogging code on the fly - every run produces a different payload, defeating signature-based detection. 76% of detected malware now exhibits AI-driven polymorphism.
This means traditional antivirus is increasingly ineffective. You need behavioral analysis, not just signature matching.
Mitigations:
Shadow AI - employees using unauthorized AI tools - is now the #1 data exfiltration channel, responsible for 32% of all unauthorized corporate data movement. 71% of office workers admit to using AI tools without IT approval, and 77% of employees paste data into AI tools, with 82% of that activity coming from personal (unmanaged) accounts.
The consequences are real. Samsung had to ban all employee use of generative AI after employees pasted proprietary semiconductor source code into ChatGPT on three separate occasions. AI-associated data breaches cost an average of $4.80 million per incident.
Mitigations:
AI agents that autonomously use tools, access file systems, call APIs, and send messages represent the highest-risk category. When an agent gets compromised via prompt injection, every action it takes passes security checks because it's using legitimate credentials.
OpenClaw is a cautionary tale. This open-source AI assistant gained over 145,000 GitHub stars in two weeks, with 100,000+ users granting it autonomous access to their operating systems, messaging platforms, and corporate services. A critical vulnerability (CVE-2026-25253) enabled one-click remote code execution through token exfiltration. Thousands of exposed control panels were found leaking API keys and private messages.
OWASP released a dedicated Top 10 for Agentic Applications in late 2025, highlighting risks like agent goal hijacking, tool misuse, privilege abuse, and rogue agents.
Mitigations:
The OWASP Top 10 for LLM Applications (2025) provides a comprehensive framework for understanding LLM-specific risks:
AI adoption is essential, but it requires treating AI tools with the same rigour as any other attack surface - arguably more, because these tools are actively processing your data, writing your code, and increasingly acting autonomously on your behalf. Follow general security best practices and layer on the AI-specific controls above.