GitLab Duo Vulnerability Allowed Exploitation of AI Responses through Concealed Prompts

164

Publish date: 23.05.2025

Blog

164

23.05.2025

Cybersecurity researchers have identified a vulnerability in GitLab’s AI assistant, Duo, specifically an indirect prompt injection flaw that could enable attackers to exfiltrate source code and inject untrusted HTML into responses, potentially redirecting victims to malicious sites.

GitLab Duo, an AI-powered coding assistant leveraging Anthropic’s Claude models, was introduced in June 2023 to assist users in writing, reviewing, and editing code. However, according to findings from Legit Security, GitLab Duo Chat has vulnerabilities that allow attackers to compromise private projects, manipulate code suggestions presented to other users, and even exfiltrate undisclosed zero-day vulnerabilities.

Prompt injection is a common class of vulnerabilities in AI systems that can be exploited by malicious actors to manipulate language model behavior, leading to unwanted outcomes. Indirect prompt injections are particularly stealthy; they embed rogue instructions within other contexts, such as documents or web pages that the model processes.

Recent studies have revealed that language models (LLMs) are also susceptible to jailbreak techniques. These attacks manipulate the model into producing harmful information that disregards established ethical guidelines, instead of requiring prompt engineering. Furthermore, techniques like Prompt Leakage (PLeak) may inadvertently expose critical preset instructions to attackers, which they can then exploit.

For organizations, the ramifications are significant. Sensitive internal information, such as operational guidelines, filtering parameters, and user roles, can leak, potentially resulting in data breaches, trade secret disclosures, or regulatory infractions.

The research indicates that it is possible to insert hidden comments in GitLab’s merge request descriptions, commit messages, or source code that can lead to data leaks or HTML injection into Duo’s responses. Techniques such as Base16 encoding, Unicode smuggling, and rendering in concealed text can further obscure these concealed prompts, which GitLab did not sufficiently scrutinize, allowing insertion across its site.

The consequence is that Duo interprets the entire context of a page, making it vulnerable to instructions embedded in comments or code. An attacker could trick the AI system into generating malicious JavaScript or redirecting users to phishing sites that capture credentials.

In addition to the above, token exploits leverage Duo’s access to merge request details to insert hidden prompts into project descriptions, enabling attackers to exfiltrate private source code to their own servers. This vulnerability arises from the assistant’s markdown rendering capabilities, which process and execute HTML code in the user’s browser.

Following responsible disclosure by Legit Security on February 12, 2025, GitLab has addressed these vulnerabilities. This incident underscores the dual nature of AI assistants: while they enhance productivity, they also introduce substantial risk when embedded deeply into development workflows.

Legit Security’s findings emphasize that hidden instructions can be embedded in seemingly harmless project content to manipulate behavior and exfiltrate sensitive information, demonstrating the unintended consequences AI systems can generate.

The security community remains vigilant, highlighting similar vulnerabilities found in other AI tools, such as Microsoft Copilot for SharePoint, which also poses risks of unauthorized access to sensitive information.

AI frameworks, such as ElizaOS, which enable decentralized operations within Web3, may also be manipulated through injected instructions, leading to significant security challenges. The cascading effects of a single breach underscore the necessity for comprehensive detection and mitigation strategies throughout the entire ecosystem.

Prompt injections and other vulnerabilities aside, hallucination remains another pressing issue affecting LLMs. This occurs when a model generates inaccurate or fabricated responses, often exacerbated by directives for conciseness, which may impose restrictions on the model’s ability to provide accurate or helpful answers.

Overall, organizations must prioritize the strengthening of AI security measures, undertake thorough security assessments, and incorporate stringent input sanitization protocols to mitigate potential risks and safeguard sensitive data against emerging threats.