You can't secure what you can't see. As AI agents proliferate across enterprise environments, organizations face a critical challenge: most don't know how many AI identities are operating in their systems, what those agents have access to, or who's responsible for them. This visibility gap creates Shadow AI – untracked autonomous systems that represent both security vulnerabilities and compliance risks.
The Scale of the Unknown
Enterprise environments today contain a staggering number of machine identities. Research indicates machine identities outnumber human identities by ratios of 45:1 on average, with some organizations reaching 80:1. This includes traditional service accounts, API keys, and certificates – but increasingly, it also includes AI agents, LLM integrations, and autonomous workflow systems.
The challenge is that many of these identities were created without centralized oversight. Developers spin up API keys for AI coding assistants. Marketing teams connect AI tools to CRM systems. Operations staff integrate AI monitoring agents with cloud infrastructure. Each integration creates one or more machine identities that may not appear in any inventory.
When asked "how many AI agents are operating in your environment?", most organizations can only guess. This is a fundamental governance failure – you cannot apply least privilege, detect anomalies, or ensure compliance for identities you don't know exist.
Understanding What You're Looking For
Before launching a discovery effort, clarify what types of AI and machine identities to search for:
AI Agents and Assistants
These are AI-powered tools that take autonomous actions: coding assistants that commit code, customer service bots that modify records, procurement agents that initiate purchases. They typically connect via OAuth tokens, API keys, or service accounts.
LLM API Integrations
Applications calling large language model APIs (OpenAI, Anthropic, Azure OpenAI, etc.) represent AI identities even if they don't "act autonomously." Each integration typically has an API key and may have access to sensitive data that's sent to the model.
ML/AI Pipelines
Machine learning training and inference pipelines often run under service accounts with access to data stores, compute resources, and model registries. These are AI identities that require governance.
RPA and Automation Bots
Robotic Process Automation bots, while not always "AI" in the LLM sense, are autonomous agents that interact with systems. They typically have their own credentials and permissions.
Third-Party AI Integrations
SaaS platforms increasingly embed AI features that require API access to your systems. Each third-party AI integration creates machine identities in your environment.
Discovery Strategies
Comprehensive discovery requires multiple approaches working together:
Cloud IAM Analysis
Start with your cloud providers' IAM systems – AWS IAM, Microsoft Entra ID (formerly Azure AD), GCP IAM. These platforms record every service account, role, and API credential. Query for:
- Service accounts and their associated permissions
- API keys and their age (look for long-lived keys that may be AI integrations)
- OAuth applications with API access
- Managed identities used by AI/ML services
- Cross-account access that might indicate third-party AI tools
Most cloud platforms provide APIs to enumerate these identities. Build automated discovery that runs regularly to catch new additions.
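As a minimal sketch of that automated discovery, the check below flags API keys past a rotation window. The record shape is an illustrative assumption modeled loosely on what cloud IAM APIs return (e.g., AWS IAM's ListAccessKeys), not any provider's exact schema:

```python
from datetime import datetime, timedelta, timezone

def flag_long_lived_keys(keys, max_age_days=90):
    """Return keys older than max_age_days – candidates for review
    as possible long-lived AI integrations."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [k for k in keys if k["create_date"] < cutoff]

# Hypothetical key metadata pulled from a cloud IAM enumeration run.
inventory = [
    {"key_id": "AKIA...1", "owner": "svc-openai-prod",
     "create_date": datetime(2022, 3, 1, tzinfo=timezone.utc)},
    {"key_id": "AKIA...2", "owner": "deploy-bot",
     "create_date": datetime.now(timezone.utc) - timedelta(days=10)},
]

stale = flag_long_lived_keys(inventory)
for k in stale:
    print(f"{k['key_id']} ({k['owner']}) exceeds rotation window")
```

In practice you would feed this the live output of your provider's IAM enumeration API rather than a static list.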
Secret and Credential Scanning
AI integrations typically require secrets. Scan your secrets management systems (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault) for credentials that appear to be AI-related – look for naming patterns like "openai," "anthropic," "copilot," or "agent."
Also scan code repositories for embedded secrets. Tools like GitGuardian and TruffleHog can identify API keys in code. Research shows millions of credentials are exposed in public repositories – your private repos likely contain similar issues. This scanning often reveals Shadow AI integrations that were set up outside formal processes.
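The naming-pattern filter described above can be sketched as a simple match over secret names exported from your secrets manager. The pattern list is an assumption – extend it for the AI vendors your teams actually use:

```python
import re

# Naming patterns that often indicate AI-related credentials.
AI_PATTERNS = re.compile(r"openai|anthropic|copilot|agent|llm", re.I)

def find_ai_secrets(secret_names):
    """Filter a list of secret names (e.g., enumerated from Vault or
    a cloud secrets manager) down to likely AI integrations."""
    return [name for name in secret_names if AI_PATTERNS.search(name)]

names = ["prod/db-password", "prod/OPENAI_API_KEY",
         "marketing/anthropic-key", "ci/deploy-token"]
print(find_ai_secrets(names))
```

Name matching is a heuristic – it misses generically named secrets – so treat it as one signal alongside traffic analysis, not a complete inventory.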
Network and API Traffic Analysis
Monitor network traffic and API gateway logs for calls to known AI service endpoints. If systems in your environment are calling api.openai.com, you have AI integrations to track. API gateways can log which internal identities are making these calls.
This approach catches AI integrations that might not appear in IAM inventories – for example, an application using a hardcoded API key rather than a formal service account.
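A rough version of this traffic check is a pass over gateway logs counting calls to known AI endpoints per internal identity. The JSON log fields ("host", "identity") are a hypothetical gateway schema – adapt them to whatever your gateway actually emits:

```python
import json
from collections import Counter

# Known AI service endpoints to watch for; extend as needed.
AI_ENDPOINTS = {"api.openai.com", "api.anthropic.com"}

def ai_callers(log_lines):
    """Count calls to AI endpoints per internal identity."""
    counts = Counter()
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("host") in AI_ENDPOINTS:
            counts[entry["identity"]] += 1
    return counts

# Hypothetical gateway log entries, one JSON object per line.
logs = [
    '{"identity": "billing-svc", "host": "api.openai.com"}',
    '{"identity": "billing-svc", "host": "internal.example.com"}',
    '{"identity": "ops-agent", "host": "api.anthropic.com"}',
]
calls = ai_callers(logs)
print(calls)
```

Any identity that shows up here but not in your IAM inventory is a likely hardcoded-key integration worth investigating.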
Developer and Team Surveys
Sometimes the most effective discovery is asking. Survey development and operations teams about AI tools they're using. What coding assistants are active? What AI APIs are integrated into applications? What automation or agent systems are deployed?
Frame this as an inventory exercise, not an audit, to encourage honest reporting. Many Shadow AI deployments happen because teams don't realize they need approval – once they understand the security implications, they're often willing to bring integrations into governance.
Cloud and DevOps Platform Integration
Integrate with your CI/CD systems, container orchestrators, and serverless platforms. These often have their own identity systems:
- GitHub/GitLab service accounts for AI-assisted pipelines
- Kubernetes service accounts for AI workloads
- Lambda/Cloud Functions execution roles for AI inference
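For the Kubernetes case, one low-effort starting point is filtering service account names for AI-related patterns. The sketch below operates on sample JSON shaped like `kubectl get serviceaccounts -A -o json` output (trimmed to the fields used); the account names and the pattern list are hypothetical:

```python
import json
import re

# Sample output shaped like `kubectl get serviceaccounts -A -o json`,
# trimmed to the fields used here. Names are illustrative.
KUBECTL_JSON = '''
{"items": [
  {"metadata": {"name": "llm-inference", "namespace": "ml"}},
  {"metadata": {"name": "default", "namespace": "web"}},
  {"metadata": {"name": "copilot-runner", "namespace": "ci"}}
]}
'''

AI_NAME = re.compile(r"llm|agent|copilot|inference", re.I)

def ai_service_accounts(kubectl_json):
    """Return (namespace, name) pairs for service accounts whose
    names suggest AI workloads."""
    items = json.loads(kubectl_json)["items"]
    return [(i["metadata"]["namespace"], i["metadata"]["name"])
            for i in items if AI_NAME.search(i["metadata"]["name"])]

print(ai_service_accounts(KUBECTL_JSON))
```

As with secret scanning, this is a naming heuristic; pair it with workload labels and image names for better coverage.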
Building Your AI Identity Inventory
Discovery is just the first step. The goal is a comprehensive inventory that enables ongoing governance. For each AI identity, capture:
Identity Details
- Name/ID
- Type (agent, API key, service account, etc.)
- Platform/environment
- Creation date
Access and Permissions
- What systems can this identity access?
- What permissions does it have?
- Is the access read-only, read-write, or admin?
Context
- What is the purpose of this AI identity?
- What application or workflow uses it?
- Is the AI internal or a third-party service?
Ownership
- Who created this identity?
- Who is responsible for it today?
- What team or department owns it?
Security Posture
- Are credentials rotated regularly?
- Is access scoped to least privilege?
- Is activity being logged?
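The fields above can be captured in a single inventory record. This is a minimal sketch, not a formal schema – the field names and enum-like string values are illustrative assumptions:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class AIIdentityRecord:
    # Identity details
    name: str
    identity_type: str          # "agent", "api_key", "service_account", ...
    platform: str
    created: date
    # Access and permissions
    permissions: list = field(default_factory=list)
    access_level: str = "read-only"   # or "read-write", "admin"
    # Context and ownership
    purpose: str = ""
    owner_team: str = ""
    third_party: bool = False
    # Security posture
    credentials_rotated: bool = False
    least_privilege: bool = False
    activity_logged: bool = False

# Hypothetical record for a customer service agent.
rec = AIIdentityRecord(
    name="svc-support-bot",
    identity_type="agent",
    platform="aws",
    created=date(2024, 5, 1),
    permissions=["crm:read", "crm:write"],
    access_level="read-write",
    purpose="customer service ticket triage",
    owner_team="support-engineering",
)
print(rec.name, rec.access_level)
```

Whatever storage you choose – CMDB, spreadsheet, or a dedicated identity platform – keeping the schema this explicit makes the posture fields (rotation, least privilege, logging) easy to report on.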
Addressing What You Find
Discovery typically surfaces concerning findings:
Orphaned AI Identities
Identities without clear owners – perhaps created by departed employees or for discontinued projects. These are prime targets for cleanup. Disable them if possible, or establish ownership before continuing to use them.
Overprivileged Agents
AI identities with far more access than needed for their function. An AI assistant that only needs to read documentation shouldn't have write access to production systems. Right-size these permissions immediately.
Long-Lived Credentials
API keys and secrets that haven't been rotated in months or years. These represent standing access that attackers could exploit. Implement rotation policies and move toward ephemeral credentials.
Shadow AI Deployments
AI tools operating without security oversight. These need to be brought into formal governance – assigned owners, documented purposes, and appropriate controls.
Redundant Integrations
Multiple identities for the same purpose, perhaps created by different team members. Consolidate these to reduce your identity footprint and simplify management.
Continuous Discovery and Governance
Discovery isn't a one-time project. AI agents proliferate continuously as teams adopt new tools and integrations. Implement:
Automated Scanning: Regular automated discovery that runs weekly or daily, not annually.
Integration with Provisioning: Ensure new AI identities are captured automatically when created through formal channels.
Alert on New Discoveries: When automated scanning finds identities not in your inventory, alert security teams for review.
Regular Attestation: Require identity owners to regularly confirm their AI agents are still needed and appropriately configured.
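The "alert on new discoveries" step above reduces to a set difference between each discovery run and the known inventory. A minimal sketch, with hypothetical identity names:

```python
def diff_discovery(discovered, inventory):
    """Return identities seen in the latest discovery run but
    missing from the known inventory – candidates for review."""
    return sorted(set(discovered) - set(inventory))

# Hypothetical inventory and a fresh discovery run that found
# one identity nobody registered.
inventory = {"svc-support-bot", "llm-inference", "copilot-runner"}
discovered = {"svc-support-bot", "llm-inference", "copilot-runner",
              "marketing-gpt-key"}

new = diff_discovery(discovered, inventory)
for identity in new:
    print(f"ALERT: unknown AI identity discovered: {identity}")
```

Running this on a daily or weekly schedule, with alerts routed to the security team, turns discovery from a one-time project into a continuous control.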
The Foundation for Agent Governance
Comprehensive discovery and inventory is the foundation for everything else in AI agent governance. You cannot apply least privilege to unknown identities. You cannot detect anomalous behavior without knowing what's normal. You cannot ensure compliance without knowing what agents exist.
Investing in discovery now prevents the accumulation of technical debt that becomes increasingly difficult to address. As AI adoption accelerates, organizations that establish visibility early will have a significant advantage in scaling AI safely.
Start with discovery. Everything else follows from knowing what you're protecting.
