AI Agents and the Non-Human Identity Challenge: Strategies for Secure Large-Scale AI Deployment


Artificial intelligence is transforming enterprise productivity, from code completion tools to chatbots that efficiently retrieve information from internal knowledge bases. However, as these AI agents operate across various corporate services, they contribute to the increasing number of non-human identities (NHIs) present in cloud environments.

Currently, organizations face the challenge of managing a staggering ratio of approximately 45 machine identities for every human user. Service accounts, CI/CD bots, containers, and AI agents all require secure methods of authentication, typically utilizing API keys, tokens, or certificates. According to GitGuardian’s recent findings, over 23.7 million secrets were exposed on public GitHub in 2024 alone. Notably, repositories employing Copilot exhibited secret leaks 40 percent more frequently.

Unlike human users, who are subject to security policies governing credential rotation and permission management, NHIs frequently lack such oversight. This absence of management creates a tangled web of connections that attackers can exploit, often long after the secrets were generated. The rising adoption of AI technologies, particularly large language models (LLMs) and retrieval-augmented generation (RAG), has considerably accelerated this risky sprawl.

An illustrative scenario involves an internal support chatbot powered by an LLM. Asked how to access a development environment, the bot might faithfully repeat valid credentials it found on a Confluence page. That is a serious exposure, especially where internal documentation still instructs developers to use plaintext credentials.

Despite these risks, organizations can enhance their security posture by implementing effective governance frameworks around NHIs and secret management. The following actionable controls can help mitigate risks associated with AI-driven NHIs:

1. Audit and Clean Up Data Sources
LLMs initially relied on well-defined datasets, but the advent of RAG lets them pull from a far wider range of internal data. That expanded reach raises the risk of exposing secrets that sit in those data sources. Platforms like Jira, Slack, and Confluence were never designed with AI access, or the possibility of secrets exposure, in mind. To guard against leakage, organizations should find and remove sensitive information from these sources before AI models can read them; secrets detection tools such as GitGuardian can automate the discovery step, and a simple scan of an export is sketched below.
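As a rough illustration (not a replacement for a dedicated scanner such as GitGuardian), the following Python sketch walks an exported directory of wiki or ticket pages and flags a few common credential shapes. The `confluence_export` path, the `.txt` export format, and the regex patterns are assumptions made purely for the example.

```python
import re
from pathlib import Path

# A handful of common credential shapes; a maintained scanner covers far more.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"(?i)\b(api[_-]?key|token|secret)\s*[:=]\s*['\"]?[A-Za-z0-9/+_\-]{20,}"
    ),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_export(export_dir: str) -> list[tuple[str, str]]:
    """Scan exported pages (assumed to be plain-text files) for secret-like strings."""
    findings = []
    for path in Path(export_dir).rglob("*.txt"):
        text = path.read_text(errors="ignore")
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                findings.append((str(path), name))
    return findings

if __name__ == "__main__":
    # "./confluence_export" is a hypothetical export location.
    for path, kind in scan_export("./confluence_export"):
        print(f"[!] possible {kind} in {path}")
```

Anything flagged this way should be rotated and removed from the source page, not merely excluded from the AI index.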

2. Centralize Your Existing NHIs Management
Effective management of NHIs begins with comprehensive auditing. Without a complete inventory of service accounts, bots, and agents, it is challenging to establish security protocols around new NHIs associated with AI implementations. All NHIs possess authentication secrets; thus, focusing on the storage and management of these secrets is crucial. Solutions such as HashiCorp Vault and AWS Secrets Manager can facilitate central management of these identities.
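One concrete pattern is to have each agent fetch its credential from the central store at runtime instead of shipping it in code or configuration. The minimal sketch below uses AWS Secrets Manager via boto3; the secret name `ai/support-chatbot/confluence-token` is a hypothetical example, and the process is assumed to run under an IAM role that is allowed to read only that secret.

```python
import boto3

def get_agent_credential(secret_id: str, region: str = "us-east-1") -> str:
    """Fetch an NHI credential from AWS Secrets Manager at runtime."""
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_id)
    return response["SecretString"]

# The agent asks for its own scoped credential by name; nothing is hardcoded,
# and rotation happens centrally in the secrets manager.
token = get_agent_credential("ai/support-chatbot/confluence-token")
```

The same pattern applies to HashiCorp Vault or any other central store; the point is that the inventory and the rotation policy live in one place.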

3. Prevent Secrets Leaks in LLM Deployments
The Model Context Protocol (MCP) has standardized how AI agents access services, which should reduce the need to hardcode credentials during integrations. Even so, research has found that 5.2 percent of MCP servers contain hardcoded secrets, exceeding the 4.6 percent average observed in public repositories. Safeguards early in development, such as secrets detection in pre-commit hooks and CI pipelines, can keep these vulnerabilities from ever reaching production.
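For illustration, a minimal pre-commit-style check could scan the staged diff for secret-like strings and block the commit. In practice a maintained scanner (for example GitGuardian's ggshield) is preferable to the hand-rolled patterns assumed here; this sketch only shows where such a gate sits in the workflow.

```python
import re
import subprocess
import sys

# Illustrative patterns only; a real pipeline would call a dedicated scanner.
PATTERNS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
]

def staged_diff() -> str:
    """Return the diff of staged changes, which is what a pre-commit hook sees."""
    return subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout

def main() -> int:
    added = [
        line for line in staged_diff().splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
    hits = [line for line in added if any(p.search(line) for p in PATTERNS)]
    if hits:
        print("Potential secret in staged changes; commit blocked:")
        for line in hits:
            print("  ", line[:80])
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```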

4. Improve Logging Security
Because LLM outputs are probabilistic, teams log prompts, retrieved context, and responses extensively for debugging and auditing. If credentials appear in that material, a single exposure is copied into every log store, a risk compounded when logs sit in inadequately secured cloud environments. Sanitizing records before they are written sharply reduces this exposure, and redaction tooling can plug into existing logging workflows.
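A lightweight way to do this in Python is a logging filter that redacts credential-shaped strings before any handler writes the record. The patterns, the `llm.audit` logger name, and the sample log line below are illustrative assumptions, not part of any particular product.

```python
import logging
import re

# Redact common credential shapes before a prompt/response record is written.
REDACTIONS = [
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
    (re.compile(r"(?i)\b(api[_-]?key|token|secret)\s*[:=]\s*\S+"), r"\1=[REDACTED]"),
]

class SecretRedactionFilter(logging.Filter):
    """Scrub secret-like strings from log records before they reach any handler."""
    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        for pattern, replacement in REDACTIONS:
            message = pattern.sub(replacement, message)
        record.msg, record.args = message, ()
        return True

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm.audit")
logger.addFilter(SecretRedactionFilter())

# Hypothetical prompt log entry containing a leaked key; it is redacted on write.
logger.info("prompt='How do I deploy?' context='api_key=abc123def456ghi789'")
```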

5. Restrict AI Data Access
It is crucial to evaluate the access permissions granted to AI systems. While certain internal tools might benefit from broader access for efficiency, customer-facing AI applications should adhere to strict access controls. Establishing a principle of least access can prevent potential misuse and ensure secure operations.
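A deny-by-default allowlist is one simple way to encode least access for agents. The agent names and data scopes in this sketch are hypothetical; the point is that an unknown agent, or an unlisted scope, gets nothing.

```python
# Each AI agent is mapped to the data scopes it may read; everything else is denied.
AGENT_SCOPES = {
    "internal-support-bot": {"kb:engineering", "kb:it-helpdesk"},
    "customer-facing-bot": {"kb:public-docs"},
}

def can_read(agent: str, scope: str) -> bool:
    """Deny by default: unknown agents and unlisted scopes get no access."""
    return scope in AGENT_SCOPES.get(agent, set())

assert can_read("customer-facing-bot", "kb:public-docs")
assert not can_read("customer-facing-bot", "kb:engineering")   # least access
assert not can_read("unregistered-agent", "kb:public-docs")    # unknown agent denied
```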

In addition to these controls, fostering a culture of awareness among developers is essential. Adequate training and communication can help bridge the gap between AI security policies and practical implementation, ensuring teams are aligned on best practices.

Scaling AI adoption across the enterprise will depend on rigorous management of non-human identities, equipping organizations to monitor, govern, and securely grow their AI initiatives. By prioritizing security from the outset, businesses can harness the capabilities of intelligent automation while maintaining robust safeguards against potential threats.