Article summary
The recent rise of AI agents has been accelerated by the use of model context protocol (MCP) servers. MCP has enabled complex integrations into AI workflows. It feels like every major service provider is racing to release MCP compatibility at the moment – some implementations more useful than others.
MCP’s high power integrations compounded with the ease of configuring them with tools like Claude have supercharged adoption of this new technology. With that has come a broad new attack surface. You hear reports of novel vulnerabilities and security risks associated with MCP regularly. Here, I’ll outline a few of these risks, including real-world examples in well-known services. Then we’ll discuss how you can continue to make use of MCP while protecting yourself.
Risks
Prompt Injection
Prompt injection has been a known issue with LLMs since their origination. Prior to mass-adoption of MCP servers, these exploits were largely limited to getting ChatBots to say inappropriate things or expose internal knowledge. Now that these ChatBots are orchestrating complex integrations between different service providers, the security risks have become more insidious.
In May 2025, Invariant Labs reported on a GitHub MCP vulnerability that risked exposing private repository data publicly. In this exploit an attacker could open an issue on a public repository with a malicious prompt injection payload. When an unknowing developer asked an MCP enabled agent to parse their public issues, the prompt injection would overwrite the original instructions to prompt the agent to open a new PR exposing info from any private repositories the developer had given access to. Notably, any developer with any amount of access to a private repo could have exposed their entire organization to this flaw simply by playing with GitHub’s official MCP server in their own “safe” local sandbox.
Bait and Switch
MCP also exposes a class of risks I refer to as bait-and-switch. This encompasses attacks you may have heard of like tool poisoning and rug pulls. The primary risk in these attacks is the lack of visibility a user has into the actual actions happening on the server-side. MCP users only configure their agent using MCP clients from services like NPM or UV. They may be presented a sanitized view of the server’s capabilities, but nothing prevents a malicious MCP server from performing nefarious actions contrary to the posted descriptions.
In my experience, the client package versioning schemes present a false sense of security. Installing a specific client version does not guarantee that behavior won’t change without a version upgrade. This has always been the case with client-side packages, but the problem is amplified considering that developers aren’t responsible for which APIs are being used in the workflow.
Broad Attack Surface
A more generic risk factor to be aware of with MCP is how broad the attack surface can be in some cases. As services scramble for acceptance into the MCP ecosystem, they’ve exposed large swaths of functionality to agents. Considering the maturity level of MCP, the unknown risk factors crossed with a broad attack surface can leave developers unsure of what is even possible.
In another recent vulnerability announcement, the browser-use MCP tool left agents free to browse the web beyond specified domain allow lists. When exposing an MCP tool to the open internet, prompt injection is an immediate concern. In this case, prompt injection could be used to redirect an agent to browsing sites that it thought were in the domain allow list, but were only masquerading as such. I don’t feel it is necessary to explain the inherent risk in allowing an agent free rein within your browser.
Protecting Yourself
I don’t write this post to discourage anyone from trying out MCP integrations with LLMs. In fact, I use these tools all the time in many of my side projects. Instead I hope to encourage an attitude of cautious curiosity. As discussed above, it’s not safe to simply trust big name tech companies to get this right on the first iteration. Here’s some tips for protecting yourself while experimenting with MCP.
Human in the Loop
It’s fun to build agents that have full autonomy over their decisions and actions. However, this poses security risks if not properly constrained. When building, agents keep the human in the loop before the agent can make any constructive or destructive actions. If your agent platform doesn’t offer human-in-the-loop interruptions, consider a new tool.
Principle of Least Privilege
The principle of least privilege is an old-school security concept, but it’s more relevant than ever in an extremely interconnected internet. When creating tokens or keys to authorize MCP servers access to third-party tools, be careful with what privileges you grant. GitHub has an interface for generating access tokens with tightly scoped permissions. Carefully considering the permissions granted could have prevented devs from being impacted by the vulnerability discussed above.
Trusted Sources
If you have spent any time recently on tech adjacent sub-reddits or Hacker News, you’ll know that there is a non-stop stream of vibe-coded MCP servers being announced. A majority of these are likely harmless, but it can be extremely hard to vet the creators and their intentions. As I’ve stated, I wouldn’t blindly trust the big-name tech firms to produce safe MCP servers, but at the very least you should be able to trust that they aren’t being actively malicious.
Observability
Agent observability allows you to keep a close eye on what outputs and actions your agents are using, even the intermediate steps of workflows. This can be handy when building agents locally, but it’s critical if you’re deploying an agent into a production system. LangSmith is a great platform if you’re in the LangChain ecosystem. Otherwise, I’ve experimented with LangFuse as an open-source alternative.
Looking to the Future
The AI ecosystem is moving fast. I anticipate MCP, and other protocols like it, will continue to evolve rapidly. In fact, while I was writing this post an MCP spec update was released with an eye toward security improvements. I’ve yet to see anything that addresses the fundamental issues I’ve described here, though.
I plan to continue experimenting with LLMs and MCP with cautious curiosity. I think there’s a lot of value to be unlocked in this space. But that’s only meaningful if we can operate in the space safely.
Very helpful MCP server consultation post. The content is structured logically and easy to understand.
If anyone wants to dive deeper into MCP server development consultation, this guide is excellent: https://mobisoftinfotech.com/services/mcp-server-development-consultation
It gave me new strategies to apply right away. Very useful resource.