Security Considerations in the Time of AI Engineering

Note:

This blog post is based on interviews conducted with startup founders, engineering managers, data scientists, and risk analysts in the Artificial Intelligence and Machine Learning space.

Evaluating tools that can help optimize your development processes? Maybe you’re entirely building alongside Devin, the first AI software engineer to get your app to market faster, generating code snippets for your React app with V0.dev, or using GitHub CoPilot to help patch your website’s security flaws.

While productivity and efficiency are hallmarks of development success, security and privacy are just as paramount to protect your product and maintain customer trust. According to Snyk’s 2023 AI Code Security report, 80% of developers are bypassing security policies to use AI.

But modern-day Generative AI tools don’t typically have the built-in capability to determine whether the data it is exposed to is sensitive. It’s completely up to the user to take precautions and set up the proper protocols when choosing to use one of these tools. Let’s explore the potential risks of using AI tools in the dev process and what questions you could ask yourself to help make more informed decisions – for you, your product, and your customers.

Productivity vs. Risk

According to Snyk’s blog, How to Embed Security in AI-assisted software development, “Code generation, using generative AI… is the area that stands to offer the greatest productivity benefits — and the highest potential for introducing risk.“

AI tools can help accelerate the process of taking an idea and building an MVP with limited resources. Although engineering productivity cannot be tracked by a single metric for all, it’s easy to see how these tools are already helping developers reduce time spent in generating code and improving the quality of code.

Like with all third-party tools, to strike the perfect balance between productivity and risk, it’s important to recognize ways in which these tools can hurt your product and customer security, now and in the future. In many situations, risk cannot be quantified, but companies can have an increased or decreased risk factor (risk tolerance) depending on:

Industry
Product type and use case
Company stage

For example, risk tolerance can be higher for startups in growth mode, especially in industries outside of security.

Risk Prevention Strategies

Let’s explore the most common risks associated with AI tools and mitigation strategies outlined by some engineers.

Nonprofit OWASP (The Open Worldwide Application Security Project) documented a list of potential security risks when deploying and managing Large Language Models (LLMs) – some examples include:

Sensitive Information Disclosure (LLM06) - LLMs can inadvertently disclose sensitive, proprietary, or confidential information leading to privacy breaches, intellectual property theft, and unauthorized access
Prompt Injection (LLM01) - Attackers can manipulate LLMs through crafted inputs, causing the LLMs to execute the attacker’s intentions
Overreliance (LLM09) - Systems or people overly dependent on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs

When adding these tools to your tech stack, do these questions come to mind?

How are you deciding which tools to use to increase your efficiency and productivity?
Do you have an internal vetting process for third-party vendors? How are you updating this for AI tools?
Will you be prompting the AI tools with customer data and PII or proprietary restricted information? Are you in line with your applicable regulatory compliance frameworks? (GDPR, etc.)
Does this tool abide by your corporate guidance?
Do you have customer permission to use their data in this way?
How could these tools impact you?
What risks will you, your product, and your customers face?
What guidance or guardrails does the tool have to protect your data as a user? Two popular types of AI developer tools are those for engineering productivity and those for general use, and the available configurations may vary.

We interviewed a few engineers (anonymously) about what they’re doing to minimize the risks associated with AI developer tools. Here’s a breakdown of some of the most common concerns and questions engineers are asking to reduce any harmful and inadvertent impact while coding with Gen AI tools.

PI and copyright concerns
- Once one of these tools generates code, who is the owner?
- What happens if the tool is suggesting the same code to another customer, maybe even a competitor?
- If you use AI tools in order to create a customer-facing chatbot, and a customer includes PII in their prompt – who is in charge of monitoring and removing this data and ensuring it isn’t being used to re-train models?
Overreliance
- Generating a few lines of code vs a few files of code – how familiar are you with the code in your product, and how much do you trust it?
- Are you aware of the potential risk of AI hallucinations and how are you reducing it?
  - AI hallucinations are outputs from models that are nonsensical or inaccurate. For example: Question: where can I find the help center? Context: you’re looking for a website like [domain].com/help or support.[domain].com AI answer: the help center is located on the ground floor of this building.
How are you updating your code and content review processes to deal with Gen AI?
Data Exposure
- Can you make sure your own data won't be used to answer prompts for competitors or malicious actors?
- What are practical and technical strategies you can take to prevent data exposure?
- Do you have strict policies and training for engineers when they are using these tools, or are you exploring third-party tools that tackle data leak prevention?

There’s no specific path to being completely secure from the risk of AI tools today, but there are many different strategies you can take to build strong guardrails to protect code and mitigate risk. Ensuring restrictions are very specific when engineers are using AI tools like Devin and code review is a must.

AI Developer Tools Now and Tomorrow

AI tools for developer productivity are becoming increasingly popular. Further down the line, a startup founder might be able to ideate, possibly write pseudocode, and use these tools to bring their idea to life. They might be able to start testing in a matter of minutes, not months.

If you’re an early-stage company, you might be eager to get started and may not have as much internal guidance as a large corporation. We hope this blog helps you start to outline the potential risks and considerations when evaluating AI tools for developer productivity.

Auth0 for Startups is empowering early-stage startups – tell us what we can do to best support you and your team!

These materials and any recommendations within are not legal, privacy, security, compliance, or business advice. These materials are intended for general informational purposes only and may not reflect the most current security, privacy, and legal developments or all relevant issues. You are responsible for obtaining legal, security, privacy, compliance, or business advice from your own lawyer or other professional advisor and should not rely on the recommendations herein. Okta is not liable to you for any loss or damages that may result from your implementation of any recommendations in these materials. Okta makes no representations, warranties, or other assurances regarding the content of these materials. Information regarding Okta's contractual assurances to its customers can be found at okta.com/agreements.

Security Considerations in the Time of AI Engineering

Productivity vs. Risk

Risk Prevention Strategies

AI Developer Tools Now and Tomorrow

What to Expect When Your Auth0 Startup Plan Expires?

Comparing Different Plans from Auth0 by Okta

Why Auth0 by Okta?