AI agents, sometimes called operators, are the latest trend in AI. At their core, agents are autonomous systems that not only process natural language using Large Language Models (LLMs) but also decide when and how to act by interfacing with external tools and APIs. They combine machine learning with decision-making, which lets them perform complex tasks, interact in real time, pick the most efficient tool for each task, and even learn from their environment, all without constant human intervention.
Tool Calling Agents
LLMs on their own are still susceptible to hallucinations and mistakes, but the ability to call tools opens up a whole new world of possibilities where error-prone tasks can be offloaded to a more traditional non-AI tool, like using a calculator to do math rather than relying on the LLM for it. Similarly, tool calling allows AI agents to interface with other applications and services, like Gmail, Calendar, Slack, and Google Drive, to do more for you than just answering questions.
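To make the calculator example concrete, here is a minimal sketch of the application side of tool calling. It assumes the model emits a tool call as JSON of the form `{"name": ..., "arguments": {...}}`. That shape, the `TOOLS` registry, and the `run_tool_call` helper are all illustrative assumptions and not the API of any particular framework.

```python
import json
import operator
from typing import Callable, Dict

# Hypothetical registry mapping tool names to plain Python functions.
TOOLS: Dict[str, Callable[..., object]] = {}


def tool(name: str):
    """Register a function under a name the model can refer to."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


@tool("calculator")
def calculator(op: str, a: float, b: float) -> float:
    """Do arithmetic deterministically instead of asking the LLM to do it."""
    if op not in _OPS:
        raise ValueError(f"unsupported operation: {op}")
    return _OPS[op](a, b)


def run_tool_call(tool_call_json: str) -> str:
    """Execute one tool call requested by the model and return a JSON result.

    The request shape is an assumption made for this sketch.
    """
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return json.dumps({"error": f"unknown tool: {call['name']}"})
    try:
        return json.dumps({"result": fn(**call["arguments"])})
    except Exception as exc:
        # Report the failure back to the model instead of crashing the agent.
        return json.dumps({"error": str(exc)})
```

The result string would be handed back to the model as the tool's output, so the model phrases the answer while the arithmetic itself never depends on it.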
Tools can be broadly classified into two categories:
- Unauthenticated Tools: These are tools that don't require authentication and are generally simpler to configure, like a calculator, a weather API, or a unit converter.
- Authenticated Tools: These are tools that require authentication and are generally more complex to configure, such as an API provided by another internal application or services like Gmail, Calendar, Slack, and Google Drive.
In an enterprise environment, authenticated tools can be further classified into:
- First-Party Tools: These are services that are part of the same infrastructure and require authentication, like an API provided by another internal application or a microservice.
- Third-Party Tools: These are external services that require more complex authentication. They could be services that are authenticated using service accounts or API keys, like a payment API, or services that need to be authenticated by the end user, such as Gmail, Calendar, Slack, and Google Drive.
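One way to keep this classification visible in an agent's codebase is to tag every tool with the kind of authentication it needs. The sketch below is only an illustration: the `AuthKind` enum, the `ToolSpec` type, and the catalog entries are hypothetical names built from the examples above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AuthKind(Enum):
    NONE = "unauthenticated"
    FIRST_PARTY = "first-party"
    THIRD_PARTY_SERVICE = "third-party (API key or service account)"
    THIRD_PARTY_USER = "third-party (end-user authentication)"


@dataclass(frozen=True)
class ToolSpec:
    name: str
    auth: AuthKind


# Illustrative catalog using the examples from the text.
CATALOG = [
    ToolSpec("calculator", AuthKind.NONE),
    ToolSpec("weather", AuthKind.NONE),
    ToolSpec("internal-orders-api", AuthKind.FIRST_PARTY),
    ToolSpec("payments", AuthKind.THIRD_PARTY_SERVICE),
    ToolSpec("gmail", AuthKind.THIRD_PARTY_USER),
]


def tools_needing_user_login(catalog: List[ToolSpec]) -> List[str]:
    """Names of tools the end user must connect before the agent can use them."""
    return [spec.name for spec in catalog if spec.auth is AuthKind.THIRD_PARTY_USER]
```

Making the category explicit lets the agent decide up front which tools it can call immediately and which require the user to connect an account first.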
Tool calling provides many key advantages to an AI agent:
- Autonomous Decision-Making: AI agents analyze context and determine the next best action, reducing the need for manual oversight.
- Optimized Task Execution: They can connect to various services and APIs, enabling them to perform specialized tasks like data retrieval, scheduling, and communication.
- Scalability and Adaptability: As systems evolve, agents can easily incorporate new functionalities, adapting to changes in workflow or priorities.
- Improved User Experience: Their ability to tackle diverse tasks leads to more personalized and efficient interactions, making technology feel both intuitive and proactive.
Tool Calling Agent: A Conceptual Example
Imagine an AI personal assistant that consolidates your digital life by dynamically accessing multiple tools to help you stay organized and efficient. Here’s how it could work:
- Gmail Integration: The assistant regularly scans your inbox to generate concise summaries. It highlights urgent emails, categorizes conversations by importance, and even suggests drafts for quick replies.
- Calendar Management: By interfacing with your calendar, it can remind you of upcoming meetings, check for scheduling conflicts, and even propose the best time slots for new appointments based on your availability.
- Slack Notifications: For team communications, the assistant monitors Slack channels. It identifies key messages and creates action items, ensuring you never miss an important update from your colleagues.
- Google Drive Access: Whether you need immediate access to the latest project document or a file related to a current task, the assistant retrieves pertinent documents from Google Drive on demand. It can summarize documents and even create new ones based on your instructions.
With tool-calling capabilities, the possibilities are endless. In this conceptual scenario, the AI agent embodies a digital personal secretary—one that not only processes information but also proactively collates data from connected services to provide comprehensive task management. This level of integration not only enhances efficiency but also ushers in a new era of intelligent automation, where digital assistants serve as reliable, all-in-one solutions that tailor themselves to your personal and professional needs.
Security Challenges with Tool Calling AI Agents
Building such an assistant is not that difficult. Thanks to frameworks like LangChain, LlamaIndex, and Vercel AI, you can get started quickly. The difficult part is doing it securely so that you can protect the user's data and credentials.
Many current solutions involve storing credentials and secrets in the AI agent application’s environment or letting the agent impersonate the user. Storing credentials securely in the environment might work for APIs that require API keys or service accounts. However, this is not a good idea for cases where the end users need to authenticate to the service, as it can lead to security vulnerabilities and excessive scope and access for the AI agent.
When you build an AI agent, you also need to consider its broader security implications. You don't want an LLM to have unlimited access to your personal data like email and documents, and more importantly, you don't want to hand your credentials to the LLM so it can access these tools. Let's face it: regardless of how secure the LLM is, there is always a possibility of it being manipulated into divulging sensitive information or doing something you don't want it to do.
Tool Calling with the Help of Auth0
This is where Auth0 comes to the rescue. Auth0 is the leading identity provider (IdP) for modern applications, and our upcoming product, Auth for GenAI, provides standardized ways, built on top of OAuth and OpenID Connect, to call the APIs of tools on behalf of the end user from your AI agent.
Auth0 brokers a secure and controlled handshake between the AI agents and the services you want the agent to interact with on your behalf – in the form of scoped access tokens. This way, the agent and LLM do not have access to the credentials and can only call the tools with the permissions you have defined in Auth0. This also means your AI agent only needs to talk to Auth0 for authentication and not the tools directly, making integrations easier.
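To illustrate what "scoped" means in practice, the sketch below checks a token's granted scopes before a tool is allowed to run. It is a simplified stand-in and not Auth0's implementation: `GrantedToken`, `REQUIRED_SCOPES`, the tool names, and the scope names are all hypothetical, and a real agent would first validate the token's signature, issuer, and expiry.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class GrantedToken:
    """Stand-in for an already validated access token; only scopes matter here."""
    subject: str
    scopes: FrozenSet[str]

    @classmethod
    def from_scope_string(cls, subject: str, scope: str) -> "GrantedToken":
        # OAuth 2.0 represents scopes as one space-delimited string.
        return cls(subject, frozenset(scope.split()))


# Hypothetical mapping from tool name to the scopes it requires.
REQUIRED_SCOPES = {
    "calendar.list_events": frozenset({"calendar:read"}),
    "calendar.create_event": frozenset({"calendar:read", "calendar:write"}),
}


class InsufficientScope(PermissionError):
    """Raised when a token does not cover the requested tool."""


def authorize_tool_call(tool_name: str, token: GrantedToken) -> None:
    """Raise InsufficientScope unless the token covers everything the tool needs."""
    required = REQUIRED_SCOPES.get(tool_name)
    if required is None:
        # Deny by default: tools without a declared policy are never callable.
        raise InsufficientScope(f"tool not allowed: {tool_name}")
    missing = required - token.scopes
    if missing:
        raise InsufficientScope(
            f"missing scopes for {tool_name}: {sorted(missing)}"
        )
```

With a token that only carries `calendar:read`, the agent can list events but an attempt to create one is rejected, no matter what the LLM was prompted to do.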
Call first-party APIs on users' behalf
When your AI agent, secured with Auth0, needs to call another application also secured with Auth0, you can use standard OAuth 2.0 flows, like the Authorization Code Flow, to get an API token from that application with user consent. In this case, Auth0 will manage the retrieval of new access tokens using cached refresh tokens when the API token expires.
Call third-party APIs on users' behalf
When your AI agent, secured with Auth0, wants to call external services like Gmail, Calendar, Slack, and Google Drive, Auth0 can help the agent get access tokens for the external service on behalf of the end user. In this case, Auth0 brokers the API access token from the external service to the AI agent.
This is made possible by Federated API token exchange, which is a way to obtain an access token from an external identity provider without the need for the user to re-authenticate every time. The end user authenticates and connects the external service once, and Auth0 intermediates the authentication and authorization process and provides the API access token to the AI agent. Auth0 takes care of storing refresh tokens and getting new access tokens when the current access token expires.
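The "connect once, then reuse and renew tokens" behavior described above can be pictured with a small cache. This is a conceptual sketch only, not Auth0's implementation: `TokenBroker` is a hypothetical class, and the injected `fetch_token` callable stands in for whatever call actually obtains a fresh access token for a user and connection.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedToken:
    access_token: str
    expires_at: float  # absolute time in epoch seconds


class TokenBroker:
    """Hand out cached access tokens and fetch new ones shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[str, str], Tuple[str, float]],
        clock: Callable[[], float] = time.time,
        leeway: float = 30.0,
    ) -> None:
        # fetch_token(user_id, connection) -> (access_token, expires_in_seconds)
        self._fetch = fetch_token
        self._clock = clock
        self._leeway = leeway
        self._cache: Dict[Tuple[str, str], CachedToken] = {}

    def get_token(self, user_id: str, connection: str) -> str:
        key = (user_id, connection)
        now = self._clock()
        cached = self._cache.get(key)
        # Renew slightly early so a token never expires in the middle of a call.
        if cached is None or cached.expires_at - self._leeway <= now:
            access_token, expires_in = self._fetch(user_id, connection)
            cached = CachedToken(access_token, now + expires_in)
            self._cache[key] = cached
        return cached.access_token
```

The agent only ever sees short-lived access tokens from `get_token`. The long-lived refresh token stays behind the `fetch_token` boundary, which is the role the identity provider plays in the flow above.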
Security Best Practices for Tool Calling AI Agents
Here is a summary of security best practices to consider when building a tool-calling AI agent:
- Least Privilege: Grant the minimum permissions needed (e.g., read-only access).
- Audit Logs: Track API calls and credential usage.
- Proxying: Use an IdP like Auth0 as a broker between the AI agent and the tools to protect the user's credentials.
- Encryption: Encrypt credentials at rest and in transit (HTTPS/TLS).
- Token Rotation: Regularly rotate API keys and tokens.
- Sandboxing: Run tools in isolated environments when possible to limit damage from leaks.
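As one small example of the audit-log practice, the sketch below wraps every tool invocation so that the call is recorded with credential-like arguments redacted. The logger name, the `SENSITIVE_KEYS` set, and the `audited_call` helper are illustrative assumptions. A production setup would ship these records to a tamper-resistant log store.

```python
import logging
from typing import Any, Callable, Dict

audit_log = logging.getLogger("agent.audit")

# Argument names treated as secrets in this sketch; extend for your own tools.
SENSITIVE_KEYS = {"access_token", "refresh_token", "api_key", "password", "authorization"}


def redact(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the arguments that is safe to write to a log."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in arguments.items()
    }


def audited_call(user_id: str, tool_name: str, fn: Callable[..., Any], /, **arguments: Any) -> Any:
    """Run a tool and record who called what, without logging secrets."""
    safe_args = redact(arguments)
    try:
        result = fn(**arguments)
    except Exception:
        audit_log.exception(
            "tool=%s user=%s args=%s outcome=error", tool_name, user_id, safe_args
        )
        raise
    audit_log.info("tool=%s user=%s args=%s outcome=ok", tool_name, user_id, safe_args)
    return result
```

Routing every tool through a single wrapper like this gives one place to answer "which user's agent called which API, and when", which is the question an audit trail needs to settle after an incident.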
Learn More about AI Agents and Auth for GenAI
This post offers a glimpse into the transformative potential of AI agents and highlights how secure and intelligent tool calling can empower AI agents to deliver a more connected and personalized experience. Stay tuned for a series of posts on tool-calling agents with Auth for GenAI, where we will build the conceptual personal assistant described earlier in Node.js and Python, showing how to integrate with different types of services securely using your favorite AI frameworks and Auth0.
Before you go, we have some great news to share: we are working on more content and sample apps in collaboration with amazing GenAI frameworks like LlamaIndex, LangChain, CrewAI, Vercel AI, and GenKit. Auth for GenAI is our upcoming product to help you protect your users' information in GenAI-powered applications. Make sure to join the Auth0 Lab Discord server to hear more and ask questions.
About the author
Deepu K Sasidharan
Staff Developer Advocate