---
title: "Protocol Types in Python 3.8"
description: "A quick introduction to the new Protocol class in Python 3.8 and how it enables structural typing"
authors:
  - name: "Vincenzo Chianese"
    url: "https://auth0.com/blog/authors/vincenzo-chianese/"
date: "Jun 23, 2021"
category: "Developers,Deep Dive,Python"
tags: ["protocol", "type", "python"]
url: "https://auth0.com/blog/protocol-types-in-python/"
---

# Protocol Types in Python 3.8

## Introduction

In the [previous installment](https://auth0.com/blog/typing-in-python/), we saw how type annotations enable gradual typing in Python, along with some of the common typing patterns that Python developers use when writing code.

In this article, we're going to take a look at the new Protocol classes introduced in Python 3.8 and how they enable structural typing and other idiomatic Python patterns, even in typed contexts.

## What Is A Protocol

Protocol is a very generic word, both in everyday language and in computer science. Most of us are probably familiar with it from hearing about the **TCP**, **UDP**, or **HTTP** protocols. Dictionaries also have a dedicated definition for it:

> A set of rules governing the exchange or the transmission of data between devices

That indeed makes sense. All the examples listed above are communication _protocols_ between two remote devices, with a set of rules governing the transmission. In the case of TCP, for instance, the protocol mandates the shape of the message, the possible operations, the error policies, and the rules for possible retransmission of a message.

On the other hand, it might seem odd to talk about protocols in a programming language. After all, there is no communication between devices involved in a program, so what does the word mean there?

We can answer the question by taking a look at another more generic definition that is not coupled with computer science:

> The accepted or established code of procedure or behavior in any group, organization, or situation. (New Oxford Dictionary)

Notice the differences from the previous definition:

* No **devices** involved.
* No data **transmission** involved.
* No **data** involved, per se.

This is, in fact, the nature of the protocols we're going to talk about today. By removing some of the constraints of the previous definition, it is possible to reason about protocols in the context of a programming language.

Protocols are indeed not a new idea and have been around for a long time. For instance, Clojure supports protocols explicitly.

```clojure
(defprotocol Shape
  (area [_])
  (perimeter [_]))
  
(defrecord Square [l])

(defrecord Circle [r]
  Shape
  (area [_] (* Math/PI (Math/pow r 2)))
  (perimeter [_] (* 2 r Math/PI)))

(defrecord Rectangle [w h]
  Shape
  (area [_] (* w h))
  (perimeter [_] (* 2 (+ w h))))
```

This code creates a set of functions grouped in a protocol called `Shape`; every type that wants to adhere to the protocol has to implement those methods.

Next, we create three new types: `Square`, `Circle`, and `Rectangle`. The last two implement the `area` and `perimeter` methods defined in the protocol. Because of this, we can call any of the protocol methods on an instance of any type implementing it:

```clojure
(def c (->Circle 10))
(area c)
; 314.1592653589793

(def r (->Rectangle 10 12))
(area r)
; 120

(def s (->Square 10))
(area s)
; Execution error (IllegalArgumentException) at user/eval147$fn$G (REPL:1).
; No implementation of method: :area of protocol: #'user/Shape found for class: user.Square
```

We can see that the methods are called correctly in the first two cases, but the call fails on the `Square` object because it does not implement the `Shape` protocol. Also, note that no **class** or **inheritance** is involved in this.

It is also possible to implement a protocol on an existing type, even on types we do not own:

```clojure
(extend-protocol Shape
  Square
  (area [s] (Math/pow (:l s) 2))
  (perimeter [s] (* 4 (:l s))))
  
(extend-protocol Shape
  java.lang.String
  (area [_] 1)
  (perimeter [_] 10))
  
(def s (->Square 10))
  
(area s)
; 100.0

(area "hello world")
; 1
  
```

In this example, we implemented the same protocol on a previously created type (`Square`) and then even on the built-in `String` type. This gives us what is called "polymorphism à la carte", where classes and inheritance are not required. The original type is not even aware that a protocol implementation is being attached to it.

The rationale behind protocols is well explained in [Simple Made Easy](https://www.youtube.com/watch?v=kGlVcSMgtV4) or in the [documentation page](https://clojure.org/reference/protocols).

## Protocols In Python

Protocols in Python work a little bit differently. While the concept is the same as in Clojure, Python does not need an explicit protocol declaration. If a type has the methods specified in the protocol, then it implements the protocol.

For instance, the [len function](https://docs.python.org/3/library/functions.html#len) requires a `Sized` type as its argument, that is, any type that implements the `__len__` method:

```python
class Team:
    def __init__(self, members):
        self.__members = members

justice_league_fav = Team(["batman", "wonder woman", "flash"])

print(len(justice_league_fav)) # TypeError: object of type 'Team' has no len()
```

The `Team` class does not implement the `__len__` method, so the runtime throws an error. We can change the class to implement that method:

```python
class SizedTeam:
    def __init__(self, members):
        self.__members = members

    def __len__(self):
        return len(self.__members)

justice_league_fav = SizedTeam(["batman", "wonder woman", "flash"])

print(len(justice_league_fav)) # 3
```

As we have seen in Clojure, no inheritance was involved, and the runtime was able to execute the code.

Python 3 defines a number of protocols that our own types can implement. The most common ones are:
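The same structural check can be observed at runtime through the abstract base classes in `collections.abc`. The following is a minimal sketch, not part of the original example set: it reuses the two team classes from above (with a single-underscore attribute for brevity) and asks `isinstance` whether each one counts as `Sized`.

```python
from collections.abc import Sized


class SizedTeam:
    def __init__(self, members):
        self._members = members

    def __len__(self):
        return len(self._members)


class Team:
    def __init__(self, members):
        self._members = members


sized_team = SizedTeam(["batman", "wonder woman", "flash"])
plain_team = Team(["batman", "wonder woman", "flash"])

# Neither class inherits from Sized, yet only the one with __len__ matches.
print(isinstance(sized_team, Sized))  # True
print(isinstance(plain_team, Sized))  # False
```

The match is decided purely by the presence of `__len__`, which is the runtime counterpart of the implicit protocol described above.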

* `Sized`: any type implementing the `__len__` method
* `Iterable`: any type implementing the `__iter__` method
* `Iterator`: any type implementing the `__iter__` and `__next__` methods

The complete list is available on the [documentation page](https://mypy.readthedocs.io/en/stable/protocols.html).
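To make the `Iterator` entry concrete, here is a small sketch. The `Countdown` class is a hypothetical name invented for this illustration; the only thing it relies on is that `list()` (like a `for` loop) drives any object that provides `__iter__` and `__next__`.

```python
class Countdown:
    """Counts down from `start` to 1 (hypothetical example class)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal exhaustion once we have counted past 1.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


print(list(Countdown(3)))  # [3, 2, 1]
```

Again, no base class is involved: implementing the two methods is enough for the object to be usable wherever an iterator is expected.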

## Protocol Types In Python

The problem comes when we try to apply the same Protocol concept in a typed context. Suppose, for instance, we have a custom class defining a protocol that describes `open` and `close` methods for IO operations:

```python
import io

class IOResource:
    def __init__(self, uri: str):
        pass

    def open(self) -> int:
        pass

    def close(self) -> None:
        pass


class FileResource:
    def __init__(self, uri: str):
        self.uri = uri

    def open(self):
        self.file = io.FileIO(self.uri)
        return self.file.fileno()

    def close(self):
        self.file.close()


def write_resource_to_disk(r: IOResource):
    pass


write_resource_to_disk(FileResource("file.txt"))
```

Paste this code into a new file called `IOResource.py` and run [`mypy`](http://mypy.readthedocs.io) on it:

```bash
mypy IOResource.py

IOResource.py:11: error: Argument 1 to "write_resource_to_disk" has incompatible type "FileResource"; expected "IOResource"
```

You can see we received a type error claiming (rightfully) that `FileResource` is not an `IOResource` subclass, even though our class implements all the required methods.

The same also happens with the built-in collection types if we're using Python 3.7 and running an older version of mypy (0.521, to be specific) on the following code:

```python
from typing import Sized

def multiply_len(val: Sized) -> int:
    return 2 * len(val)

class SizedTeam:
    def __init__(self, members):
        self.__members = members

    def __len__(self):
        return len(self.__members)

multiply_len(SizedTeam(["batman", "wonder woman", "flash"]))

# team.py:13: error: Argument 1 to "multiply_len" has incompatible type "SizedTeam"; expected "Sized"
```

Basically, the type system is **not** smart enough to recognize the implicit protocol through structural typing, leaving a bunch of idiomatic Python constructs out of the game when working in typed contexts.

Fortunately, the issue has been fixed in Python 3.8 (or with a mypy release newer than 0.521) with the introduction of [Protocol classes](https://www.python.org/dev/peps/pep-0544/). If we take the collection example from above and try it on Python 3.8, it will indeed work.

We can now also fix the `IOResource` example to make it work in Python 3.8:

```python
import io
from typing import Protocol

class IOResource(Protocol):
    def __init__(self, uri: str):
        pass

    def open(self) -> int:
        pass

    def close(self) -> None:
        pass


class FileResource:
    def __init__(self, uri: str):
        self.uri = uri

    def open(self):
        self.file = io.FileIO(self.uri)
        return self.file.fileno()

    def close(self):
        self.file.close()


def write_resource_to_disk(r: IOResource):
    pass


write_resource_to_disk(FileResource("file.txt"))
```

If we try to run mypy on this file again, we should receive no error.

The only change required was to make the base class inherit from a new special class called `Protocol`. This special class tells the type checker to compare types structurally rather than nominally.
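Here is a smaller, self-contained sketch of the same idea. `SupportsClose`, `Connection`, and `shutdown` are hypothetical names chosen for this illustration. The `@runtime_checkable` decorator from `typing` is an addition that the example above does not use: it is only there so that the structural match can also be observed with `isinstance` when running the script, while mypy performs the static check regardless.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None:
        ...


class Connection:
    """Does not inherit from SupportsClose, but has a matching close()."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(resource: SupportsClose) -> None:
    resource.close()


conn = Connection()
shutdown(conn)  # accepted by mypy: Connection matches structurally
print(conn.closed)  # True
print(isinstance(conn, SupportsClose))  # True, thanks to @runtime_checkable
print(isinstance(object(), SupportsClose))  # False: no close() method
```

Note that the runtime check only looks at whether the members exist; it does not validate their signatures, which remains the job of the type checker.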

## Rundown of Protocol Features

Let's now consider this protocol class:

```python
from typing import Protocol
import io

class IOResource(Protocol):
    uri: str

    def __init__(self, uri: str):
        pass

    def open(self) -> int:
        pass

    def close(self) -> None:
        pass
```

And let's use it to check what we can do when using Protocols.

Protocols are defined by including the special `typing.Protocol` class in the base class list. A protocol class does not lose any of the semantics of a regular abstract base class; it is just handled specially by the type checker.

All the functions and variables of a Protocol class are protocol members, no matter which decorators are applied to them. This means they have to be present in the target classes for those classes to be considered compliant with the protocol:


```python
import io


class FileResource:
    data: str
    file: io.FileIO

    def __init__(self, uri: str):
        self.data = uri

    def open(self):
        self.file = io.FileIO(self.data)
        return self.file.fileno()

    def close(self):
        self.file.close()



write_resource_to_disk(FileResource("file.txt")) # Argument 1 to "write_resource_to_disk" has incompatible type "FileResource"; expected "IOResource"

```

When running this code through mypy, the type checker will detect that the class is missing the `uri` string attribute defined in the protocol class.

A class can also inherit directly from a Protocol class, making the relationship explicit. This is not required for the class to satisfy the protocol; the main practical difference is that the subclass inherits whatever method bodies the protocol defines:
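The attribute requirement can be isolated in a smaller sketch. The names `HasUri`, `WithUri`, `WithData`, and `describe` are hypothetical, and `@runtime_checkable` is again an addition used only to make the difference visible when the script runs; the mypy error described above does not depend on it.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasUri(Protocol):
    uri: str


class WithUri:
    def __init__(self, uri: str) -> None:
        self.uri = uri


class WithData:
    def __init__(self, uri: str) -> None:
        self.data = uri  # stored under a different name


def describe(resource: HasUri) -> str:
    return f"resource at {resource.uri}"


print(describe(WithUri("file.txt")))  # resource at file.txt
print(isinstance(WithUri("file.txt"), HasUri))   # True
print(isinstance(WithData("file.txt"), HasUri))  # False: no `uri` attribute
# describe(WithData("file.txt"))  # mypy would reject this call
```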

```diff
- class FileResource:
+ class FileResource(IOResource):
```
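As far as matching against the protocol goes, both styles are accepted by the type checker. At runtime, though, an explicit subclass behaves like any other subclass and inherits the method bodies written in the protocol. The sketch below illustrates this with hypothetical `Greeter` classes that are not part of the `IOResource` example.

```python
from typing import Protocol


class Greeter(Protocol):
    name: str

    def greet(self) -> str:
        # A protocol method may carry a default body.
        return f"Hello, {self.name}!"


class ExplicitGreeter(Greeter):
    """Explicit subclass: inherits greet() from the protocol."""

    def __init__(self, name: str) -> None:
        self.name = name


class ImplicitGreeter:
    """Structural match only: must provide greet() itself."""

    def __init__(self, name: str) -> None:
        self.name = name

    def greet(self) -> str:
        return f"Hi, {self.name}!"


def welcome(greeter: Greeter) -> str:
    return greeter.greet()


print(welcome(ExplicitGreeter("Ada")))  # Hello, Ada!
print(welcome(ImplicitGreeter("Ada")))  # Hi, Ada!
```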

A Protocol class can also aggregate other protocols if necessary:

```diff
- class IOResource(Protocol):
+ class IOResource(Sized, Protocol):
```

In this case, `IOResource` is a protocol with the methods we defined above **and** it is also a `Sized`, meaning the `__len__` method must be implemented:

```diff
class FileResource():
    uri: str
    file: io.FileIO

+    def __len__(self):
+        return len(self.file.readall())
```

It is also possible to combine existing protocols into a new "aggregate" protocol class, leaving the original ones untouched:

```python
from typing import Protocol, Sized
import io

class IOResource(Protocol):
    uri: str

    def __init__(self, uri: str):
        pass

    def open(self) -> int:
        pass

    def close(self) -> None:
        pass

class SizedIOResource(IOResource, Sized, Protocol):
    pass
```

Any class that wants to adhere to the `SizedIOResource` protocol has to implement all the members of `IOResource` and the `__len__` method as well.

Protocol classes cannot be instantiated. Both a typing error and a runtime error will be raised, since they are abstract classes internally:
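The same aggregation pattern works with any protocols, not only `Sized`. Below is a self-contained sketch using two hypothetical single-method protocols, `SupportsOpen` and `SupportsClose`, combined into a `Resource` protocol. `@runtime_checkable` is an addition used here only to show the result when the script runs.

```python
from typing import Protocol, runtime_checkable


class SupportsOpen(Protocol):
    def open(self) -> int:
        ...


class SupportsClose(Protocol):
    def close(self) -> None:
        ...


# Protocol must be listed again, otherwise the result is a regular class.
@runtime_checkable
class Resource(SupportsOpen, SupportsClose, Protocol):
    pass


class FakeFile:
    def open(self) -> int:
        return 3  # pretend file descriptor

    def close(self) -> None:
        pass


class OnlyCloses:
    def close(self) -> None:
        pass


print(isinstance(FakeFile(), Resource))    # True: has open() and close()
print(isinstance(OnlyCloses(), Resource))  # False: open() is missing
```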

```python
q = IOResource("/dev/file.txt") # Cannot instantiate abstract class 'IOResource' with abstract attributes '__len__' and 'uri'
```
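The runtime side of this can be checked with a short script. This sketch uses a hypothetical `SupportsClose` protocol that deliberately defines no `__init__`, and it prints whatever message the interpreter provides rather than assuming its exact wording.

```python
from typing import Protocol


class SupportsClose(Protocol):
    def close(self) -> None:
        ...


try:
    SupportsClose()  # mypy also flags this line
    instantiated = True
except TypeError as error:
    instantiated = False
    print(f"TypeError: {error}")

print(instantiated)  # False
```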

## Conclusions

Protocol classes are very useful for keeping some idiomatic Python patterns available, even in typed contexts. While the type system is not yet complete, this is an important milestone toward closing the loop.