Are you sure you want to delete this access key?
sidebar_label | image | date |
---|---|---|
Excessive agency in LLMs | /img/blog/excessive-agency/detecting-excessive-agency.svg | 2024-10-08 |
Excessive agency in LLMs is a broad security risk where AI systems can do more than they should. This happens when they're given too much access or power. There are three main types:
This is different from insecure output handling. It's about what the LLM can do, not just what it says.
Example: A customer service chatbot that can read customer info is fine. But if it can also change or delete records, that's excessive agency.
The OWASP Top 10 for LLM Apps lists this as a major concern. To fix it, developers need to carefully limit what their AI can do.
Excessive agency in LLMs often stems from well-intentioned but poorly implemented features. This vulnerability type arises when AI systems are granted broader capabilities or access than necessary for their intended functions.
Excessive agency arises in a handful of ways, but there are a few common patterns.
Overreaching tools: Most LLM providers have first-class support for "tools" or functions, which is just a fancy way of saying "APIs that are exposed to the LLM".
These APIs frequently include unnecessary functionality. For example, document summarization tool might also have edit and delete capabilities, expanding the potential attack surface.
Insecure APIs: LLMs cannot be trusted with access to backend systems. We often see IDOR-style vulnerabilities where an LLM can make arbitrary data references.
Excessive database privileges: LLM agents often connect to databases with more permissions than required.
Privileged account misuse: Using high-level credentials for routine tasks creates unnecessary exposure. A support chatbot shouldn't access sensitive employee data with admin-level permissions.
Development artifacts: Test or debug features meant may accidentally remain in production environments. Even if they're not explicitly mentioned in prompts, they can be invoked.
These scenarios share a common thread: granting LLMs more power than their core tasks demand.
Developers must carefully consider the principle of least privilege when integrating LLMs into their systems. By limiting access and functionality to only what's essential, they can significantly reduce the risk of excessive agency vulnerabilities.
Excessive agency in LLMs poses significant risks across security and business domains.
Unauthorized data access: LLMs with excessive permissions may retrieve and expose sensitive information beyond their intended scope. Depending on your application, this could include proprietary company info or documents (e.g. in a RAG system).
Remote execution: Attackers exploiting excessive agency could potentially run arbitrary functions. This risk is amplified in environments where LLMs have meaningful system access, which is more likely as companies begin to explore agentic systems.
Privacy breaches: Overly-permissioned AI assistants might inadvertently disclose private user information in responses. Such leaks not only erode user trust but can also violate data protection regulations like GDPR or CCPA.
Financial loss: LLMs with direct access to business systems could be manipulated.
Reputational damage: An AI acting inappropriately or leaking confidential information can severely impact a company's image. Rebuilding trust after breaches is costly and time-consuming.
Operational disruptions: LLMs with excessive admin privileges might alter critical configurations or delete important data.
Legal liabilities: Companies may face lawsuits if their AI systems cause harm due to excessive permissions or improper access control, especially in regulated industries like healthcare or finance.
The insidious nature of excessive agency means vulnerabilities can exist undetected for extended periods before exploitation. This underscores the importance of regular security audits and continuous monitoring of LLM activities.
Mitigating excessive agency requires work at all stages of LLM development and deployment.
Start by constraining the LLM's operational scope:
Consider an LLM-powered code assistant. Rather than granting it full repository access, create an internal API that only allows read operations on specific files or directories.
Enforce strict access management, so even when someone comes along and jailbreaks your app, they can't do any damage.
For instance, a support chatbot should operate with credentials that only permit access to data relevant to the customer it's currently assisting.
There's more you can do to limit downside risk from excessive agency vulnerabilities.
Human oversight: Require manual approval for high-stakes actions.
Throttling: Rate limit API calls to slow down exploration and attacks.
Robust monitoring: Use logging tools to detect anomalous LLM behavior patterns.
Input & output sanitization: Moderate inputs and outputs to prevent unintended actions.
These layers of protection serve as a crucial safety net, capturing issues that might evade primary defenses.
Preventing excessive agency requires constant vigilance. When expanding features or integrating new systems, always question: "What's the minimum level of access this LLM needs to fulfill its role?" Then, provide exactly that — nothing more.
Identifying excessive agency problems requires proactive monitoring and rigorous testing. Passive safeguards alone are insufficient.
Challenge your system's boundaries. Red team your system with inputs designed to push the AI beyond its intended limits. Focus on these key areas:
Unauthorized Access: Unintended data access.
Manipulation and Injection: Ways to trick the LLM into doing things.
Scope and Capability Expansion: Attempts to trigger actions beyond the LLM's intended scope.
Automated tools can stress-test your system with thousands of diverse inputs and uncover edge cases that human testers might overlook.
LLMs are particularly susceptible to injections or jailbreaks that exploit scenarios within your application-level prompt. Be sure to include social engineering-type attacks in your test suite.
Log every AI action, particularly interactions with external systems, to create a comprehensive audit trail.
Implement alerts for anomalous patterns:
Extend monitoring beyond the LLM itself. Scrutinize connected systems for unexpected data changes, atypical transaction patterns, or access attempts from unfamiliar sources.
Standard observability and security tools like Datadog and Sentry are often sufficient for this. You probably don't need a specialized AI tool for monitoring!
The future will be harder, not easier, when it comes to excessive agency. This is because generative AI applications are moving in the direction of increasing data access and agency. They'll also become more complex and prevalent in our daily lives.
If you're looking to test for excessive agency, our software can help. Check out the LLM red teaming guide to get started, or contact us for personalized assistance.
Press p or to see the previous file or, n or to see the next file
Browsing data directories saved to S3 is possible with DAGsHub. Let's configure your repository to easily display your data in the context of any commit!
promptfoo is now integrated with AWS S3!
Are you sure you want to delete this access key?
Browsing data directories saved to Google Cloud Storage is possible with DAGsHub. Let's configure your repository to easily display your data in the context of any commit!
promptfoo is now integrated with Google Cloud Storage!
Are you sure you want to delete this access key?
Browsing data directories saved to Azure Cloud Storage is possible with DAGsHub. Let's configure your repository to easily display your data in the context of any commit!
promptfoo is now integrated with Azure Cloud Storage!
Are you sure you want to delete this access key?
Browsing data directories saved to S3 compatible storage is possible with DAGsHub. Let's configure your repository to easily display your data in the context of any commit!
promptfoo is now integrated with your S3 compatible storage!
Are you sure you want to delete this access key?