How to gradually start adding type hints to python projects

How to gradually add types to an existing untyped python codebase.

How to gradually start adding type hints to python projects
Photo by Amador Loureiro / Unsplash

Wouldn’t it be cool to start adding types to an existing code base? Yes, but usually that requires a change in the underlining data types we used. For example, we could have a code that heavily relies on non-typed dictionary manipulations:

def v2_modulus(v):
    return sqrt(v["x"] ** 2 + v["y"] ** 2)

This is a simple case, but we could also have some .get, del, .clear or any other dict calls, so my first approach wouldn’t be to create a dataclass or class. We can use TypedDict:

from typing import TypedDict

class Vec2(TypedDict):
    x: float
    y: float

def v2_modulus(v: Vec2):
    return sqrt(v["j"] ** 2 + v["n"] ** 2)

Oh, sorry, I made a mistake while retyping this, I wrote v["j"] and v["n"] which clearly are not valid keys for our new Vec2. Luckily I was able to notice that with mypy:

error: TypedDict "Vec2" has no key 'j'
error: TypedDict "Vec2" has no key 'n'

What if instead of using dict we were one of the functional-minded cool kids who liked immutability with namedtuples? Well, we are in a better situation because mypy we can check whether we access a defined attribute. But it won’t catch typing errors:

from collections import namedtuple

Vec2 = namedtuple("Vec2", "x y")
v = Vec2(1, 2)
v.x + v.z
v.x + {}

It can’t catch v.x + {} because it namedtuple lacks typing information. It only defines fields:

error: "Vec2" has no attribute "z"

But you may have guessed it, for every untyped data there is a typed counterpart. Introducing typing.NamedTuple:

from typing import NamedTuple

class Vec2(NamedTuple):
    x: float
    y: float

v = Vec2(1, 2)
v.x + v.z
v.x + {}

This will catch all the problems with the previous code:

error: "Vec2" has no attribute "z"
error: Unsupported operand types for + ("float" and "Dict[, ]")

What if we only used raw tuples? Well, it’s easier:

from typing import Tuple

Vec2 = Tuple[float, float]

v: Vec2 = (1, 2)
v[0] + v[1]
v.x + v.z

The change here is that instead of creating an object of a given type, we need to add a type hint to the newly created variable to indicate its type.

error: "Tuple[float, float]" has no attribute "x"
error: "Tuple[float, float]" has no attribute "z"

Now imagine that we want to fetch an Account based on its id:

class Account: ...
def get_acc(acc_id: int) -> Account: ...

Having account_id defined as an int or string or whatever “real” type it could be, is a really bad idea. And int is an int and shouldn’t be used as an account identifier, at least from a typing perspective. For this kind of situation, we can use NewType.

AccountId = NewType("AccountId", int)
class Account: ...
def get_acc(acc_id: AccountId) -> Account: ...

That would produce an error:

error: Argument 1 to "get_acc" has incompatible type "int"; expected "AccountId"

Structural Duck-Typing

Believe it or not, duck-typing is nothing more than the implicit definition of an interface, that when is not honored an exception is thrown. What would happen if we have a lot of duck-typed code, like our Vector2 example:

from dataclasses import dataclass
from typing import NamedTuple
from math import sqrt

class Vector2:
    x: float
    y: float

class Vector3(NamedTuple):
    x: float
    y: float
    z: float

class Vector4:
    def __init__(self, x, y, z, w):
        self.x = x
        self.y = y
        self.z = z
        self.w = 2

def modulus(v2):
    return sqrt(v2.x + v2.y)


We really don’t want to start touching those Vector types at all, but what we could do is add some extra typing to modulus to indicate that in reality it only semantically works for the Vector2 type.

from typing import Protocol

class V2(Protocol):
    def x(self) -> float: ...
    def y(self) -> float: ...

def modulus(v2: V2):
    return sqrt(v2.x + v2.y)

V2 is a protocol with two read-only (@property) attributes: x and y. The read-only thing is important in this example because of the use NamedTuple in Vector4.

At the time of writing structural typing is not supported for modules by mypy: mypy#5018. Worst case scenario you can wrap Module in classes, and call it the day.