Learn how to plan for failures and changes in your identity project.
In the field of software testing, there is a concept known as the Happy Path. Happy Path testing means testing one or more software components for a use case which results in an expected outcome and does not encounter any error conditions. In other words, the Happy Path is when everything goes according to plan. Unfortunately, life doesn’t always go according to plan.
In Part 1 of this series, we learned how to create a realistic project plan for a successful Identity Management project deployment. Now, in Part 2, we will cover some things that can change over time or go wrong with the identity aspects of a project. Our goal is to raise awareness of common issues so you can establish a plan for them in advance. Some of them are fairly likely to occur. Others are less likely, but if something happens and you don’t have a plan for how to respond, you will be very, very sorry!
Preparing in advance for changes and problems helps you handle them quickly and appropriately. Without contingency plans, a crisis can turn chaotic because people don’t know what to do. Testing these plans helps ensure that the response during an actual crisis is quick and effective. You’ll want a well-coordinated response for any identity-related issues that arise. You can achieve that by defining processes ahead of time for each of the following scenarios and training staff on how to handle each one. Proper planning helps you deliver a good customer experience while minimizing risk and liability.
Plan to Accommodate User Changes
User Profile Updates and Deleted Accounts
The most common change that you’ll need to handle is simply allowing users to update their user profile. You should make sure you’ve provided a means for the user to update their profile, including their name, email address, and phone number. When a user changes an attribute that serves as the primary identifier for an account, you should reserve the old identifier so that it cannot be used again. You should also reserve the account identifier for any accounts which have been deleted. In general, you should reserve any account names used in the past so they cannot be used by new owners in the future.
Allowing a new account to reuse the name of a previously renamed or deleted account might enable someone to create a new account in the name of an account owned by someone else in the past. The new owner could then request a restore from a time when the original account existed. This would enable unauthorized access by the new owner to the data from the original account. If you establish a policy of reserving the identifiers associated with renamed or deleted accounts, you eliminate the possibility of this happening. The important thing is to understand the risk and have policies and procedures in place to prevent the exposure of data from previously renamed or deleted accounts to new, unauthorized owners.
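One way to picture the reservation policy is a registry that never releases an identifier once it has been used. The following is a minimal sketch under stated assumptions: `AccountRegistry` and its methods are hypothetical names, the storage is in memory, and identifiers are compared case-insensitively. A real system would enforce the same rule in its user store.

```python
class AccountRegistry:
    """Hypothetical in-memory registry that reserves retired identifiers."""

    def __init__(self):
        self._active = {}       # normalized identifier -> owner id
        self._reserved = set()  # identifiers retired by rename or deletion

    @staticmethod
    def _normalize(identifier):
        # Compare case-insensitively so "Alice" cannot shadow "alice".
        return identifier.strip().casefold()

    def is_available(self, identifier):
        key = self._normalize(identifier)
        return key not in self._active and key not in self._reserved

    def create(self, identifier, owner_id):
        if not self.is_available(identifier):
            raise ValueError(f"identifier {identifier!r} is in use or reserved")
        self._active[self._normalize(identifier)] = owner_id

    def rename(self, old, new):
        old_key = self._normalize(old)
        if old_key not in self._active:
            raise KeyError(old)
        if not self.is_available(new):
            raise ValueError(f"identifier {new!r} is in use or reserved")
        self._active[self._normalize(new)] = self._active.pop(old_key)
        self._reserved.add(old_key)  # the old name can never be claimed again

    def delete(self, identifier):
        key = self._normalize(identifier)
        if key not in self._active:
            raise KeyError(identifier)
        del self._active[key]
        self._reserved.add(key)  # deleted names stay reserved as well
```

With this rule in place, an attempt to create a new account under a renamed or deleted identifier fails at creation time, so no later restore request can be tied to the wrong owner.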
In a related scenario, you should establish a process for orphaned accounts. This scenario occurs when an account administrator has left the organization that owns an account and no one still associated with the organization has access to it. In this case, a representative from the organization typically files a helpdesk request asking for access. You need a process to vet such requests to ensure they come from people genuinely associated with the organization and not from an imposter.
You may need to get creative about how you validate ownership. You may be able to validate the requestor using some secret about the account, that only a legitimate owner would know. If the account has a domain associated with it, you can visit the website at that domain and see if the requestor is listed on the site. You can check for a “contact us” or support link for the organization and ask for help to validate that the request is legitimate through an independent channel. You should make sure your support team is prepared with standard validation mechanisms for this scenario. If the prescribed mechanisms are not possible for a particular scenario, ensure that they consult with a member of your security team to devise adequate validation before granting access to an orphaned account. It’s also a good idea to avoid the problem in the first place by advising customers to have at least one backup administrator registered.
Identities in Flux
Another scenario to consider is identities or accounts that will change from one type to another. Examples are worker identities that switch from temporary contractor to permanent employee or vice-versa. Another example might be a customer that changes their subscription from one level, perhaps paid by a credit card, to another, higher level, paid by corporate invoice. You should examine whether there is a chance of issues if the change does not happen instantaneously in all systems at once. Consider how long it will take for identity changes to ripple through a set of systems and whether client applications could receive erroneous data if transactions are submitted during the propagation of changes.
In the case of workers changing status from temporary to permanent, you should consider whether the user could have two accounts at the same time. This can happen if a user’s new account is created before the old one is removed. In this case, be sure to test what will happen. By investigating in advance whether such situations can occur, you will ensure applications are prepared to handle them appropriately.
Death and Digital Inheritance
If your site handles consumer-facing accounts, you may want to establish a process for handling a situation where a user has passed away. This may include providing users the ability to designate an authorized contact in the event of death and specifying what rights or privileges the contact would have. You should also obtain legal advice so that your digital inheritance procedures are aligned with legislation for the jurisdictions applicable to your application. Having a clear policy and process in place will provide a prompt and clear customer experience to a user’s heirs at a difficult time.
Plan for the Possibility of a Compromise
Unfortunately, passwords are sometimes compromised. You can help avoid an unauthorized account takeover if you regularly check your list of accounts against the various databases of compromised accounts available on the internet. If a user’s username/password combination exists in any of those databases, you should notify the user that their account uses compromised credentials and advise them to change their password. There’s no need to develop this yourself if you use Auth0 because you can opt to protect your users and their data through the additional service of anomaly detection.
Auth0 maintains a continuously updated collection of breached credentials, with hundreds of millions of entries. All password-based login attempts are checked against this database, and any matches are blocked in real time.
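If you build a check like this yourself, the core idea is a lookup of each submitted password against a corpus of known-breached values. The following is only a sketch, not how Auth0 implements it: it assumes the corpus is available locally as a set of uppercase SHA-1 hex digests, and the function names are hypothetical.

```python
import hashlib
from typing import AbstractSet


def sha1_hex(password: str) -> str:
    # Assumption: the breach corpus stores uppercase SHA-1 hex digests.
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def is_breached(password: str, breached_hashes: AbstractSet[str]) -> bool:
    """Return True if the submitted password appears in the breach corpus."""
    return sha1_hex(password) in breached_hashes
```

A check like this can only run at the moment the user submits a password, such as at login or password change, because a properly stored password hash cannot be compared with the corpus afterwards.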
If a user’s account is compromised and taken over by an imposter, the user may call your support desk asking for help reclaiming their account. You will need to establish a process to validate that the requester is the legitimate account owner, keeping in mind that the imposter who took over the account would have access to all the information in the account and may have changed addresses, phone numbers, and even security questions.
The process to validate an account takeover situation must avoid depending on anything the attacker can see in the account, and it can therefore be cumbersome. If there is transaction history that is not accessible to a user upon login, an account takeover victim could be asked to provide information on past transactions to corroborate their ownership claim. Another possible approach is to check for recent changes to an account profile and retrieve past values for addresses or phone numbers. Then you can ask a user to join a conference call and provide proof of ownership of a former address via government-issued photo ID, followed up by notarized validation of the ID and a code sent to the former address. Alternatively, the user could prove ownership of a former phone number via interaction with the device. You will want to work with your support and security teams in advance to design an appropriate process that takes into account the specifics of your application.
Brute Force Attacks
Sometimes hackers try to compromise a user’s password through repeated login attempts with various passwords. If your system comes under a brute force attack, you can help protect users by monitoring the number of failed login attempts and notifying users if there have been a large number of failed login attempts for their account. One best practice is to temporarily block an account for a short period of time if there have been many failed login attempts in a row. The idea is to slow down an attacker so that the attack becomes infeasible for them while allowing the legitimate user to log in a short while later. Because the block is temporary, a hacker cannot permanently lock legitimate users out simply by trying to log in and failing.
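The temporary block described above can be sketched in a few lines. This is an illustration under stated assumptions: the class name `LoginThrottle`, the default thresholds, and the in-memory storage are all hypothetical, and time is passed in explicitly so the behavior is easy to test.

```python
class LoginThrottle:
    """Hypothetical tracker that blocks an account briefly after repeated failures."""

    def __init__(self, max_failures=5, block_seconds=300.0):
        self.max_failures = max_failures
        self.block_seconds = block_seconds
        self._failures = {}       # account -> consecutive failed attempts
        self._blocked_until = {}  # account -> timestamp when the block ends

    def is_blocked(self, account, now):
        return now < self._blocked_until.get(account, 0.0)

    def record_failure(self, account, now):
        count = self._failures.get(account, 0) + 1
        if count >= self.max_failures:
            # Start a temporary block and reset the counter for the next window.
            self._blocked_until[account] = now + self.block_seconds
            count = 0
        self._failures[account] = count

    def record_success(self, account):
        # A successful login clears the failure history for the account.
        self._failures.pop(account, None)
        self._blocked_until.pop(account, None)
```

Your login handler would call `is_blocked` before checking credentials and then call `record_failure` or `record_success` with the result. The block expires on its own, so the legitimate user regains access without contacting support.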
Stolen, Lost or Damaged Phone
If your application relies at all on a user’s mobile device, you will need to implement a process to assist customers whose phones have been stolen, lost, or broken. Many sites with sensitive content have added multi-factor authentication (MFA) features to better protect users’ accounts. Some forms of MFA rely on codes generated by or messages sent to a mobile phone. If a user’s phone is lost, damaged, or stolen, the user will be temporarily shut out of their account. If the phone was unlocked and in use at the time of the theft, the attacker might gain access to the user’s accounts. In all such cases, you should have a process that validates the user’s account ownership through alternate means to ensure that only the legitimate user of the account can register a new device. You should also prepare clear instructions to help the user deactivate the old device and register a new one.
Resetting a user's multi-factor authentication is easy with Auth0. The tenant administrator can reset the user's MFA enrollment. The next time the user logs in, they will need to set up MFA just like a new user.
Identity protocols in use today often depend on secrets that contribute to their security. In the case of SAML, a service provider and identity provider exchange signed and optionally encrypted messages. The digital signing and decryption of messages depend on private keys. With the OAuth protocol, a client web application (a confidential client) may use a client secret to authenticate itself to an authorization server. You should have procedures in place to ensure private keys and secrets are well protected from compromise. It is also a good idea to plan in advance how to recover if any such secrets are compromised. A plan should address factors such as the following:
- What risks or exposure would arise from the compromise of such secrets?
- What must be done to shut off the immediate risk?
- What must happen to recover from the exposure and resume secure operations?
- What process is required to replace compromised secrets and will other parties need to be notified or involved? If so, do you have contact information and a plan?
- What data should be collected for audit or forensic purposes?
- Did the exposure put any user data at risk, requiring notification within a specific time frame such as the 72-hour timeframe required by GDPR?
Find a list of GDPR regulations and how Auth0 can help you comply with them.
Having a documented plan in place, and reviewing it periodically, will ensure a smooth, professional response to any compromise of secrets.
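A response plan can also make the notification deadline explicit so that nobody has to work it out during an incident. The sketch below simply adds the 72-hour window mentioned above to the time you became aware of the exposure. The function names are hypothetical, and deciding when the clock actually starts is a question for your legal team.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest time to notify, counted from when the exposure became known."""
    if became_aware_at.tzinfo is None:
        # Naive timestamps are ambiguous across regions, so reject them.
        raise ValueError("use a timezone-aware timestamp")
    return became_aware_at + NOTIFICATION_WINDOW


def time_remaining(became_aware_at: datetime, now: datetime) -> timedelta:
    """Time left before the deadline; negative once the deadline has passed."""
    return notification_deadline(became_aware_at) - now
```

Recording the moment of awareness in UTC as soon as the incident is opened makes the remaining time easy to display on an incident dashboard.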
Compromised User Data
One of the worst scenarios to contemplate is the theft or breach of large numbers of user passwords or user profiles. You must prepare for this possibility so that if the unthinkable happens, you can act quickly to contain and assess the damage as well as take appropriate action to protect and, if required, notify your users. There is so much to do when a data breach occurs that it is simply not feasible to wait until it happens to figure out what to do! Your plan should include:
- Clear ownership for the response effort
- Who should be involved
- What needs to occur
- How to prioritize response actions
- What information to capture during the response
- What must be done to meet requirements for notification to government agencies and users
- Process to follow for PR and messaging about the breach
Team members should be educated on and regularly review the response plan so they are prepared to move quickly if a breach occurs.
Plan for Common Configuration Changes and Outages
Server Time Synchronization Failure
From a configuration perspective, there are several other types of changes or failures to plan for. Not all failures require complex precautions. Preventing servers from getting out of time synchronization is actually easy. Identity protocols often rely on servers being closely synchronized in time because there are short time limits in which authentication transactions are expected to occur. If an application redirects a user to an Identity Provider for authentication and the servers are not in sync, the response from the Identity Provider might expire before the application server receives it. A simple remedy is to ensure NTP (Network Time Protocol) is running on every server. If your authentications suddenly stop working and there are errors indicating timeouts, you should check whether NTP is running on all involved servers.
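To see why clock drift breaks authentication, consider how a receiving server decides whether a response is still valid. The following is a simplified sketch: the parameter names echo the `NotBefore` and `NotOnOrAfter` conditions used in SAML assertions, but the function itself and the `allowed_skew` tolerance are illustrative assumptions, not any library's API.

```python
from datetime import datetime, timedelta, timezone


def is_within_validity_window(not_before, not_on_or_after, now,
                              allowed_skew=timedelta(0)):
    """Return True if `now` falls inside the window, widened by allowed_skew."""
    return (not_before - allowed_skew) <= now < (not_on_or_after + allowed_skew)


# The Identity Provider issues a response that is valid for one minute.
issued = datetime(2024, 1, 1, 9, 0, 0, tzinfo=timezone.utc)
expires = issued + timedelta(minutes=1)

# The application server's clock runs about two minutes fast, so a response
# that arrives ten seconds after issue looks as if it has already expired.
drifted_now = issued + timedelta(minutes=2, seconds=10)
rejected_due_to_drift = not is_within_validity_window(issued, expires, drifted_now)
```

Some deployments allow a small skew tolerance as a safety margin, but that only masks drift. Keeping NTP running on every server addresses the cause.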
Another good check is to make sure you know where your system relies on digital certificates and track their expiration dates. This will enable you to plan for smooth certificate rotation as the expiration date approaches. If you are using the SAML protocol, a public key in a certificate is used for validating signatures on requests and responses. It may optionally be used for end-to-end encryption of messages between the client application and identity provider. SAML authentication requests will fail if the certificates in the identity infrastructure expire. You should set up reminders for approximately two months in advance of expiration so you can plan for a smooth rotation. Prior to planned rotation, you’ll want to collaboratively plan with federated sites on the process and timing. You may want to warn your support team in advance of any certificate rotation so that if anything stops working they are aware of this possible cause.
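Tracking expiration dates can be as simple as a scheduled job that scans a certificate inventory. The sketch below assumes a hypothetical inventory that maps a certificate name to its expiration date and uses a 60-day lead time to approximate the two-month reminder suggested above.

```python
from datetime import date, timedelta


def certificates_due_for_rotation(inventory, today, lead_time=timedelta(days=60)):
    """Return (name, expiration) pairs that expire within lead_time, soonest first.

    Certificates that have already expired are included so they are not missed.
    """
    due = [
        (name, expires)
        for name, expires in inventory.items()
        if expires - today <= lead_time
    ]
    return sorted(due, key=lambda item: item[1])
```

You could run a job like this daily and send the result to the team that coordinates rotation with federated partners.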
Downtime of Your Identity Infrastructure
Don’t forget to make a plan for downtime. Nothing boots you and your users off the Happy Path faster than unplanned downtime. Identity infrastructure controls access to many other systems, making it especially important to plan for identity system outages. You will want to have automated monitoring in place to be notified if any component of your identity management infrastructure goes down. Your monitoring should check end-to-end function by sending synthetic user authentication transactions, in addition to basic monitoring of each individual server.
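A synthetic monitoring job can be structured as a set of named checks, each of which performs one end-to-end transaction with a dedicated test user. The following is a minimal sketch: `run_synthetic_checks` is a hypothetical helper, and each check is an assumed callable that you would implement against your own login flow.

```python
from typing import Callable, Dict, List


def run_synthetic_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each named check and return the names of those that failed.

    A check fails if it returns a falsy value or raises any exception,
    so an unreachable identity provider is reported rather than crashing the job.
    """
    failures = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures
```

A scheduler would call this every few minutes and raise an alert whenever the returned list is not empty.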
You should include your identity management infrastructure in your Business Continuity Planning. This valuable process will help you identify dependencies between components and how to minimize disruption caused by the failure or unavailability of systems or processes. Be sure to identify and plan for a variety of disasters, such as system and network outages, regional disruptions caused by environmental disasters or pandemic illness, corruption of data, or abrupt termination of services if a vendor suddenly goes out of business. Remember that your dependent applications may be inaccessible if identity is down.
As part of your disaster and failure planning efforts, consider whether any functions necessary during a crisis require alternative solutions. For example, if administrative access to systems is controlled by the identity platform, it may be appropriate to have alternative access paths enabled so administrators can access systems during an outage. Check to make sure that during an identity platform outage users will still have a way to report issues and that your helpdesk staff will still be able to receive those reports and respond to customers. You do not want a failure of your identity system to take out your application and then also take out the ability of your helpdesk staff to support and communicate with customers during an outage!
Plan for Collaborative Support
Speaking of support, if you utilize any external components as part of your identity services, you will need to plan for the reality that troubleshooting issues with customers and/or remote identity providers often needs to be a collaborative affair. Both parties should know how to contact each other, what to check before contacting the other party, and what information to have on hand for an effective troubleshooting session. Some common things to include in a support troubleshooting checklist are:
- Check for network outages between solution components
- Check all solution components are up
- Questions to ask to identify the scope of the problem:
  - Impact to many users or just a few?
  - Impact to all device types/browsers or just a few?
  - Is the problem repeatable or does it only occur intermittently?
  - How far through the authentication sequence does a user get?
- Capture an HTTP trace of the attempted activity
- Have dummy users to test with that can be shared with a partner
- Contact information for component owners in case of questions
Advance planning to ensure that your team is prepared to respond to failures is essential. The above list should give you a good head start on the types of changes and failures that may impact identity-related projects. Make a full list of the changes and potential failures that apply to your identity systems or dependent applications and craft a response plan. Remember to educate all parties involved in the response and conduct periodic tests to make sure everyone knows what to do and has the information and tools they need. With a bit of advance effort, you can ensure a good customer experience and significantly reduce the stress level when a failure occurs.
In the next part of this series, we are going to cover the security aspects of identity-related projects and how to do your due diligence to make sure your IDM project is secure. In case you missed Part 1 of this series, see how to create a realistic IDM project plan.