A NOTE from Matias Woloski, VP, R&D Okta, Co-Founder of Auth0
Early in my career, around 2005-06, I became involved in web services and interop projects. Do you remember SOAP/WSDL/WS-Trust, etc? It was very difficult to understand the specifications and the different implementations. Vittorio's blog and videos were an oasis in this complexity. He would explain things from the big picture down to the smallest details, always starting from first principles. He mastered the art of progressing from the very basic idea to a fully-fledged architecture. Since then, I have read every one of his blog posts and books. He was incredibly influential in my career. His knowledge and passion for identity led me and Eugenio to create Auth0 in 2013.
Vittorio began writing this article at the beginning of 2023. This piece is an authentic Vittorio Blog Post™. It's the same type of blog post I would read early in my career that would clarify everything and help me connect the dots. As with many other identity-related topics, he spent a lot of time and effort figuring out the best way to teach the world about Verifiable Credentials (VCs). What are their use cases? Why would they work this time? What are some common misconceptions around them? What are the components and advantages of VC architecture?
Sadly, Vittorio passed on Oct. 7, 2023. And we miss Vittorio.
We miss his excellent articulation and accuracy. His sense of humor, the endless discussions, his fun and educational conference talks, his humanity and passion.
Completing and publishing this post is a small homage from us, aiming to contribute to his effort to educate the world about digital identity. I hope you enjoy his writing as much as I still do.
TL;DR: we discuss verifiable credentials (VCs) from the point of view of the modern identity practitioner, creating a pragmatic scaffolding we can use to ground the many things being said about VCs nowadays. In a nutshell: We believe VCs will happen, and it’s wise to be prepared for them, but not necessarily for the reasons most commonly found in literature today.
You Have Been Lied To
If you learned about claims-based identity as a practitioner in the last decade or two, chances are that whoever introduced you to the topic used some metaphor along the lines of:
“It’s just like a person going to the wine shop to buy wine. The merchant asks to see a document proving that the person is of age, the person exhibits their driving license, and voilà – it’s party time.”
“You see, the wine seller is the relying party (RP). The government is the issuer. The driving license is the token. The user is the subject. When the subject attempts access to the RP, they get redirected to the issuer, where they authenticate and get back a token, which they subsequently present to the RP and gain access. Easy, just like in meatspace!”
Except, that’s not really true, is it? No metaphor is ever perfect, but there are very fundamental differences separating the "real life scenario" in the metaphor from actual production identity systems. To mention the most obvious differences:
In offline life, we can use our documents without the issuer of those documents knowing when or with whom we choose to transact. In protocols like OpenID Connect, not only do we have to go back to the IdP every time we start a session with an RP: if the RP isn’t registered, the request will fail.
Corollary: With relatively few exceptions, the documents we use in offline life aren’t tied to one particular intended recipient, but can be used with any entity that will choose to accept them – including entities that didn’t yet exist at the document issuance time and entities that the issuer has no knowledge of. Compare that to tokens, where we try to express audiences as narrow as the use case allows (for least privilege and for containing the blast radius in case of leaks).
Why this discrepancy? And why does no one even point it out in identity classes? Is it part of a big conspiracy to keep you from some truth?
But It Was for a Good Cause
Sorry, no conspiracy. The fact is – the metaphor is very useful to introduce the dramatis personae of identity and the rough roles everyone/thing plays, so we keep using it. And the discrepancies? They are there because if we want to tackle many (most?) of the scenarios people pay big $$ in identity products for… we need solutions to behave that way.
The places we visit online are not the same as our squares, malls, and beaches, where you can wander aimlessly and simply be. They are purpose-built artifacts designed to support specific kinds of transactions, among specific people and entities, following specific rules… and the business of identity often boils down to the ability to express and enforce those relationships and rules.
When two companies enter a federation relationship, they want to control the exact content of the messages exchanged between the parties… and to ensure that the rules defined for this federation aren’t applied to entities belonging to different federation agreements.
Administrators need to be able to control what apps employees consent to (to avoid, for example, violating embargo rules against certain countries) and what information is shared (so that an employee who is working on a secret merger deal doesn’t leak the company plan when sharing that they belong to the directory group “CompanyXAcquisitionDueDiligence").
More generally, any scenario where some centralized logic needs to run, from authorization rules conditional on the intended recipient to dynamically generated data.
In fact, there are times in which it would have been nice to have more of this in offline life too. Years ago a US citizen was diagnosed with drug-resistant tuberculosis while abroad and was told not to travel, but he flew to Canada and then drove to the US (article here). He was able to do that because the government had no way to get every airport authority of every country to enforce a ban and because the man was able to use his travel docs without the issuer (the US Department of State) being in the loop.
All good, then? Should we consider the way we do identity today as perfect and exhaustive for every scenario? Are we already all set? No, of course not. For the scenarios and use cases hinted at here, what we have works pretty well. But there are indeed scenarios (especially NEW scenarios not often tackled today) where the way traditional identity systems work is too restrictive, and we could use something behaving closer to what we have in offline life. Let’s postpone the task of establishing what those scenarios are, and focus on what we need to change in our traditional approaches if we are to better approximate offline life.
What Do We Need to Change in Modern Identity to Better Approximate Offline Life?
Let’s use OpenID Connect as a starting point. What do we need to do to make a classic OpenID Connect flow behave more like the driver's-license-for-buying-wine scenario in offline life?
The two main discrepancies we identified were:
- Ability to use the token with multiple RPs
- Ability to transact with an RP without IdP knowing anything about time and parties involved in the transaction
The first one is pretty easy to achieve. The only thing restricting tokens to be used with one particular audience is the IdP running authorization logic based on the declared intended recipient, and the presence of the audience claim in the resulting token. Change the IdP not to run that logic, tell the client there’s no need to declare the intended recipient, omit the audience in the resulting token, and instruct RPs not to look for an audience in incoming tokens. Done!
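To make the change concrete, here is a minimal sketch of the RP side, before and after. The `Token` type, the function names, and the example URLs are all hypothetical, and signature validation is deliberately left out so that only the audience logic is visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    issuer: str
    subject: str
    audience: Optional[str]  # None models a token issued without an audience


def classic_rp_accepts(token: Token, my_id: str, trusted_issuer: str) -> bool:
    """Classic RP: the token must come from the trusted issuer AND name this RP."""
    return token.issuer == trusted_issuer and token.audience == my_id


def relaxed_rp_accepts(token: Token, trusted_issuer: str) -> bool:
    """Relaxed RP: only the issuer matters; the audience is not inspected."""
    return token.issuer == trusted_issuer


IDP = "https://idp.example"
scoped = Token(IDP, "jane", "https://wineshop.example")
general = Token(IDP, "jane", None)

# The scoped token works at one RP only; the general one works anywhere
# that trusts the issuer, which is exactly the offline-document behavior.
print(classic_rp_accepts(scoped, "https://bookshop.example", IDP))
print(relaxed_rp_accepts(general, IDP))
```

The relaxed check is what makes the token general purpose, and it is also what makes a leaked token usable everywhere.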
In doing so, we also created the basis for satisfying the second requirement. We are no longer telling the IdP for whom the token we are requesting is meant. That takes care of hiding the RP's identity from the IdP. All that’s left to hide is the time, but in fact by removing the identity of the RP from the request, we also eliminated the need to ask for a token contextually to the RP transaction: issuance of the token and presentation of that token to the RP are now two separate transactions that can happen at completely different times, as long as they occur in that sequence, and as long as the token expiration is long enough to allow for the system to credibly work.
This does add some complication. Now we need a place for the user to save their general purpose tokens, and (assuming that there will indeed be more than one token) an experience to make a selection at runtime. Let’s call that place a “wallet” for the time being. In turn, the context that allowed the IdP to decide which kind of info to issue in the token during a transaction with a particular RP is gone, hence whatever requirements and restrictions on the tokens an RP is willing to accept can now only be fielded by the wallet, given that the IdP is out of the loop at presentation time.
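As a rough illustration of the wallet's new job, the sketch below filters stored tokens against what an RP says it will accept, leaving the final pick to the user. Every name here (`StoredCredential`, `VerifierRequest`, `matching_credentials`, the example issuers) is a hypothetical illustration and not taken from any specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class StoredCredential:
    credential_type: str
    issuer: str
    claims: Dict[str, object] = field(default_factory=dict)


@dataclass
class VerifierRequest:
    credential_type: str
    accepted_issuers: Set[str]
    required_claims: Set[str]


def matching_credentials(
    wallet: List[StoredCredential], request: VerifierRequest
) -> List[StoredCredential]:
    """Return the stored credentials that satisfy the RP's stated requirements."""
    return [
        c
        for c in wallet
        if c.credential_type == request.credential_type
        and c.issuer in request.accepted_issuers
        and request.required_claims.issubset(c.claims)
    ]


wallet = [
    StoredCredential("DriverLicense", "https://dmv.example", {"name": "Jane", "over_21": True}),
    StoredCredential("LibraryCard", "https://library.example", {"name": "Jane"}),
]
request = VerifierRequest("DriverLicense", {"https://dmv.example"}, {"over_21"})

# The wallet would show these candidates to the user for selection.
candidates = matching_credentials(wallet, request)
print([c.credential_type for c in candidates])
```

The logic that used to run at the IdP at request time now runs on the user's device, at presentation time.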
I am glossing over a stupendous amount of important details, but I hope you are nonetheless following the gist of how this new world can be conceived in terms of the old, and hopefully it is helping to better ground implications and new requirements.
That said, there is at least one detail I cannot ignore here. What we have done goes against some of the most ironclad security rules we have been trained to apply. A long-lived bearer token without an audience? That sounds like a recipe for disaster.
From our requirements to mimic offline life, “no audience” and “long-lived” are by design. By exclusion, the only thing we can work on to improve security is the “bearer” part. One way we can prevent disasters in case of token leakage is to upgrade our subject confirmation method from bearer to proof of possession (that is "PoP", not DPoP specifically), a well-known sender-constraint mechanism that ties the presentation of a given token to a proof that the presenter is indeed the legitimate entity to which the token was issued. Clear as mud? We have an entire Identity, Unlocked episode on the topic, but in a nutshell: imagine that the wallet has a private-public key pair, and that at issuance request time, it presents the public key to the IdP. The IdP issues its signed token and includes in it the public key presented by the wallet. When the user wants to use the token with a given RP, the wallet attaches the token to the request (as usual) and signs some part of the request with its private key. Now the RP has to validate two signatures: the usual one from the IdP, and the one from the wallet (which needs to be verifiable using the public key indicated by the signed token). Voilà, now no adversary in the middle can intercept the transaction and reuse the token (other than a verbatim replay of the entire request, that is).
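The two-signature check can be sketched end to end. So that the example runs with nothing but the Python standard library, it uses a toy hash-based one-time signature (Lamport) as a stand-in for the signature algorithm; a real deployment would use something like ECDSA or EdDSA. The claim name `holder_key_thumbprint`, the token layout, and the URLs are assumptions made for illustration, and the token carries a hash of the wallet's public key instead of the key itself.

```python
import hashlib
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes):
    digest = _h(message)
    return [(digest[i // 8] >> (7 - (i % 8))) & 1 for i in range(256)]


def keygen():
    """Toy Lamport one-time key pair: 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes):
    # Reveal one secret per bit of the message hash (each key signs ONE message).
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    return len(signature) == 256 and all(
        _h(sig) == public[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, _bits(message)))
    )


def thumbprint(public) -> str:
    return _h(b"".join(a + b for a, b in public)).hex()


# Issuance: the wallet shows its public key; the IdP binds it into the signed token.
idp_private, idp_public = keygen()
wallet_private, wallet_public = keygen()
token = json.dumps(
    {"sub": "jane", "over_21": True, "holder_key_thumbprint": thumbprint(wallet_public)},
    sort_keys=True,
).encode()
token_signature = sign(idp_private, token)

# Presentation: the wallet signs part of the request with its own private key.
request = b"GET https://wineshop.example/checkout nonce=8c1f"
request_signature = sign(wallet_private, request)


def rp_accepts(token, token_signature, presenter_public, request, request_signature) -> bool:
    if not verify(idp_public, token, token_signature):  # signature 1: the IdP's
        return False
    bound = json.loads(token)["holder_key_thumbprint"]
    if bound != thumbprint(presenter_public):  # is this the key named in the token?
        return False
    return verify(presenter_public, request, request_signature)  # signature 2: the wallet's


print(rp_accepts(token, token_signature, wallet_public, request, request_signature))
```

An attacker who steals the token but not the wallet's private key cannot produce the second signature for a new request. A nonce or timestamp inside the signed part of the request is what would address the verbatim replay case.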
Phew! It would appear we made it. We turned our modern identity primitives into something that not only better mimics, but improves over the offline life scenario.
Somebody Already Invented VCs
I have bad news for you: someone already invented this. Several someones, in fact :D
Various entities tackled this problem, and came up with many different specs regulating different aspects of this.
To W3C we owe the term Verifiable Credentials, the data structure and the generally accepted terminology for roles. My naïve mangling of the traditional OIDC should be mapped as shown in the following diagram. The user becomes the holder, the identity/attribute provider the issuer, and the verifier performs the function of the relying party. The token "becomes" a Verifiable Credential, and what is sent to a verifier when performing an operation is a Verifiable Presentation.
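For readers who have not seen the data structures, here is a rough sketch of a credential and a presentation written as Python dictionaries, using property names from version 1.1 of the W3C Verifiable Credentials Data Model. The identifiers and claim values are made up, and the proofs (signatures) that a real credential and presentation would carry are omitted.

```python
# Hypothetical example values; cryptographic proofs are omitted for brevity.
credential = {
    "@context": [
        "https://www.w3.org/2018/credentials/v1",
        "https://www.w3.org/2018/credentials/examples/v1",
    ],
    "type": ["VerifiableCredential", "UniversityDegreeCredential"],
    "issuer": "https://university.example/issuers/14",  # the old IdP
    "issuanceDate": "2023-01-01T00:00:00Z",
    "credentialSubject": {
        "id": "did:example:holder123",  # the old "user"
        "degree": {"type": "MasterDegree", "name": "Master of Science in Computer Science"},
    },
}

# What the holder's wallet sends to a verifier (the old RP).
presentation = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiablePresentation"],
    "holder": "did:example:holder123",
    "verifiableCredential": [credential],
}


def describe_roles(vp: dict) -> dict:
    """Pull the issuer, holder, and subject out of a presentation."""
    vc = vp["verifiableCredential"][0]
    return {
        "issuer": vc["issuer"],
        "holder": vp["holder"],
        "subject": vc["credentialSubject"]["id"],
    }


print(describe_roles(presentation))
```

In this example the holder and the subject are the same entity, which is the common case, but the data model does not require it.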
Some of the other important standards bodies that have already published documents on the topic are the OpenID Foundation, Decentralized Identity Foundation and ISO. We might say that there are too many cooks in the kitchen. In fact, multiple alternatives exist for credential data formats (e.g. ISO mDL and W3C), protocols (e.g. OpenID for Verifiable Credentials, DIDComm and ISO mDL), wallets, and verifiable data registries. I dare say this space is overspecified.
At the same time (I am biased because I am on the foundation's board), I think the OpenID Foundation is doing really good work here. By consolidating some of the good parts from other specifications they are trying to lead to an interoperable outcome, which is fundamental for Verifiable Credentials to eventually succeed.
Still, I expect a lot of work will need to happen in the marketplace: production implementations with working use cases will have to feel the pain points of these specs, and run into a few walls, before VCs can fully come to life.
Now that we have some pragmatic understanding of what Verifiable Credentials mean, let's explore some misconceptions in this space.
Misconceptions about Verifiable Credentials
Over the past few years I've heard a number of claims about verifiable credentials that I think misrepresent their value proposition and can mislead decision makers. I have picked three of the most puzzling statements in the space to shed some light on them.
Centralized DBs will disappear
… and in turn this would prevent some of the massive data leaks that we have seen in recent history.
It’s unclear how that would work. Specific IdPs need to know what they know because of the function they perform, regardless of sign in. Let's look at this through an example: applying for a job requires a master’s degree in computer science.
The job applicant, Jane, could obtain a credential from their university and store it in their wallet, so they can present it to the verifier, the employer.
Can the university, now that Jane has the credential in their wallet, delete the database record(s) that state that she got the master's degree? They better not! Jane put in a lot of time, money and effort to get her degree, so they better always be able to remember that Jane got that master’s degree. For example, what if Jane has her phone stolen and needs to get a new digital diploma?
Can the employer delete Jane's master’s degree information? Not likely. In 5 years' time, they might need to go through an audit and prove that they did due diligence on hires with certain requirements. They can't just have a record in the database that says "due_dilligence = true". Verifiers need an audit trail for legal and compliance reasons.
Moreover, there are flows that cannot go through the user. If a health provider has Jane's blood type and Jane has an accident, Jane cannot consent to release her blood type data from her wallet, and she really needs a transfusion fast! The health provider needs to keep that data available in their database, so other parties can obtain it without Jane being involved in case it is necessary. In this case, it is in Jane's best interest to not have full control of her data.
Users are in control of their identity!
This one is such good PR – it sounds good when you say it. And it is tricky because it depends on what "control" means in this context.
Remember the user key we introduced earlier? For many people, it’s super important that the key (and all the machinery to access it, such as an identifier) remains under the control of the end user. Imagine using that key to authenticate to RPs even without signing anything from an issuer, instead of, say, using Google for signing in to an RP. As long as you keep your key in a place you control (we can discuss whether it’s easier to use PGP or the blockchain for that), no one can take that away from you – whereas if you violate Google's terms of service, Google might suspend your ability to log in with your (blocked) account using their authentication provider. Great in principle, but it might not make as much of a difference in practice. Besides the fact that fewer and fewer people use social providers to sign in, as RPs increasingly seek a direct relationship (similar to "username + password" logins) with the end user, there are usually recovery mechanisms that can get around Google’s reluctance.
What about users carrying VCs in their wallet with assertions from an issuer? Well, the fact that you retain control of your key won’t help you much if last week you got a DUI and your driver’s license has been revoked. The verifier would still check for revoked credentials. Thus, the user having their (revoked) license in their wallet and control of their key won't do much for them.
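The point can be shown with a small sketch of a verifier that checks both the holder's key and the credential's status. The design here, an issuer-published list with one status byte per credential, is a simplification chosen for illustration; `License`, `Issuer`, `status_index` and `verifier_accepts` are hypothetical names, and the key check is reduced to a boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    holder_key_id: str
    status_index: int  # position of this credential in the issuer's status list


class Issuer:
    def __init__(self, size: int) -> None:
        self.status = bytearray(size)  # 0 = valid, 1 = revoked

    def revoke(self, credential: License) -> None:
        self.status[credential.status_index] = 1

    def published_status_list(self) -> bytes:
        # The verifier downloads the whole list, so the issuer does not
        # learn which individual credential is being checked.
        return bytes(self.status)


def verifier_accepts(credential: License, holder_proved_key: bool, status_list: bytes) -> bool:
    """Control of the key is necessary, but a revoked credential still fails."""
    return holder_proved_key and status_list[credential.status_index] == 0


dmv = Issuer(size=8)
license_ = License(holder_key_id="jane-key-1", status_index=3)

print(verifier_accepts(license_, True, dmv.published_status_list()))  # before the DUI
dmv.revoke(license_)
print(verifier_accepts(license_, True, dmv.published_status_list()))  # after revocation
```

Nothing the holder controls changes between the two checks; only the issuer's published status does.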
Moreover, the moment when a verifier sees your data, they can decide to store it. RPs today already have the same incentives and disincentives to keep user data around. They ask for and store more data than they need to provide their features, and keep it because it is beneficial for them to do so. Specifying that data from a Verifiable Presentation should not be persisted (for example using termsOfUse) will not solve the problem. The solution is not technology; it is law. You'd need to make it illegal for the verifier to store data they don't need, or to respect mechanisms like
A frequent claim is that because VCs support "selective disclosure," privacy will improve. The reality is that VCs did not invent the ability to minimize the data verifiers get access to. We have been able to do that for years as part of OAuth consent. The verifier/RP asks for as much data as they can, and users consent to it, because otherwise they cannot use the services the RP provides…
It is true that if you compare digital credentials to real life ones, selective disclosure has benefits for privacy. But in the digital world, much like it is today, convenience will be the key. Unless users decide to not present more data than necessary for particular operations, it is possible that they will end up disclosing more/all credential data just for usability's sake.
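For context, one common way to implement selective disclosure is with salted hashes, in the spirit of SD-JWT: the issuer signs only digests of the claims, and the holder reveals the salt and value for the claims they choose to share. The sketch below is a simplification under stated assumptions: the issuer's signature over the digest list is assumed and not shown, and all function names and claim values are hypothetical.

```python
import hashlib
import json
import secrets


def digest(salt: str, name: str, value) -> str:
    return hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()


def issue(claims: dict):
    """Issuer: salt every claim; only the digests go into the signed credential."""
    disclosures = {name: (secrets.token_hex(16), value) for name, value in claims.items()}
    signed_digests = sorted(digest(salt, name, value) for name, (salt, value) in disclosures.items())
    return signed_digests, disclosures  # disclosures stay in the holder's wallet


def present(disclosures: dict, names: list) -> dict:
    """Holder: reveal salt and value only for the chosen claims."""
    return {name: disclosures[name] for name in names}


def verify_disclosed(signed_digests: list, presented: dict) -> bool:
    """Verifier: each revealed claim must hash to a digest the issuer signed."""
    return all(
        digest(salt, name, value) in signed_digests
        for name, (salt, value) in presented.items()
    )


signed_digests, disclosures = issue(
    {"name": "Jane", "birthdate": "1990-04-01", "over_21": True}
)
shown = present(disclosures, ["over_21"])

print(sorted(shown))  # the only claim names the verifier learns
print(verify_disclosed(signed_digests, shown))
```

The mechanism works, but note that nothing in it stops a verifier from simply asking for every claim, which is the usability concern raised above.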
Specific IdPs already know things about you because they organically do. In general, authentication providers don’t get users’ data and the apps they use during login – they have other routes.
The main advantage for privacy is that, assuming proper privacy preserving revocation check mechanisms, the IdP will not know directly where the user goes, how often, etc.
The Reason We Believe VCs Will Happen
Regardless of all this, we NEED VCs. As our online presence moves from a collection of relatively independent, purpose-built experiences to a more seamless, open-ended world that more closely mirrors our offline lives and promotes a less-constrained agency, we need users to have the means to exercise their identity in a more agile fashion. And there are things that only verifiable credentials can do that can help enable this, as I explained in my KuppingerCole presentation.
The main one is being able to disclose our identity/claims without issuers knowing. It is a civil liberty; it is a right. As more of our life moves online, we should be able to express our identity like we do it offline.
Then, users having VCs in their wallet will be very powerful for verifiers. They don't have to deal with the issuance, revocation, and recovery of those credentials, and they get high assurance claims about subjects with relatively low friction for many use cases.
And finally, systems using VCs can more easily achieve the massive scale that is required as more interactions move online. From an architecture perspective, all users' wallets work as a distributed cache for the issuer. Unlike OpenID Connect authorization servers, the issuer will not typically need to support the high-scale scenarios that correlate with online events like ticket sales or streaming episode premieres. Users will typically already have credentials in their wallets and verifiers will simply need to verify them, in a (mostly) stateless fashion.
How will VCs take off?
However, we do have a multi-party cold-start problem. To have viable VCs we need effective, dependable and ubiquitous wallets. To have good wallets, we need relying parties implementing flows that require them, and creating real, concrete requirements for actual use. To incentivize RPs to implement and explore new flows, we need high value, ubiquitous credentials that make good business sense to leverage. But to get natural IdPs to create the infrastructure to issue such credentials, you need all of the above… plus a business case. Luckily, there’s a way out: governments.
It is in governments' purview to do the absolute best they can to empower their citizens to live safe, effective and fulfilling lives, including online. VCs appear to be a great tool for it, and in fact governments are hard at work to bring the first truly impactful implementations into the world (Europe, the US, APAC). That might kickstart the flywheel that will enable all other players to come on board. Once that happens, it will likely happen all of a sudden… which is why it is really a good idea to stay up-to-date and experiment with VCs TODAY: You can experiment with issuing and verifying VCs using the Auth0Lab experimental environment, and sign up for the mDL online verification early access.
If you found this post useful you might also enjoy this Verifiable Credentials talk from Vittorio, recorded a few months before his illness.