How to gradually start adding type hints to Python projects
How to gradually add types to an existing untyped Python codebase.

Wouldn’t it be cool to start adding types to an existing code base? Yes, but usually that requires a change in the underlying data types we use. For example, we could have code that relies heavily on untyped dictionary manipulation:
from math import sqrt

def v2_modulus(v):
    return sqrt(v["x"] ** 2 + v["y"] ** 2)
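This is a simple case, but we could also have some .get, del, .clear or any other dict calls, so my first approach wouldn’t be to create a dataclass or class, since that would mean rewriting every one of those calls. We can use TypedDict, which only adds type information to a plain dict: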
from typing import TypedDict

class Vec2(TypedDict):
    x: float
    y: float

def v2_modulus(v: Vec2):
    return sqrt(v["j"] ** 2 + v["n"] ** 2)
Oh, sorry, I made a mistake while retyping this: I wrote v["j"] and v["n"], which clearly are not valid keys for our new Vec2. Luckily I was able to notice that with mypy:
error: TypedDict "Vec2" has no key 'j'
error: TypedDict "Vec2" has no key 'n'
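To make the "no runtime change" point concrete, here is a minimal sketch with the keys corrected. It assumes Python 3.8+ (where TypedDict lives in typing); the variable names are only illustrative.

```python
from math import sqrt
from typing import TypedDict


class Vec2(TypedDict):
    x: float
    y: float


def v2_modulus(v: Vec2) -> float:
    return sqrt(v["x"] ** 2 + v["y"] ** 2)


# A TypedDict value is an ordinary dict at runtime, so existing
# dict-style code (.get, del, .clear, ...) keeps working unchanged.
v: Vec2 = {"x": 3.0, "y": 4.0}
length = v2_modulus(v)
x_or_default = v.get("x", 0.0)   # plain dict method
is_plain_dict = type(v) is dict  # no new runtime class is involved
```

The annotation only matters to the type checker; nothing about how the dictionary is built or consumed has to change.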
What if instead of using dict we were one of the functional-minded cool kids who like immutability with namedtuples? Well, we are in a better situation because mypy can check whether we access a defined attribute. But it won’t catch typing errors:
from collections import namedtuple
Vec2 = namedtuple("Vec2", "x y")
v = Vec2(1, 2)
v.x + v.z
v.x + {}
It can’t catch v.x + {} because namedtuple lacks typing information. It only defines fields:
error: "Vec2" has no attribute "z"
But you may have guessed it: for every untyped data structure there is a typed counterpart. Introducing typing.NamedTuple:
from typing import NamedTuple

class Vec2(NamedTuple):
    x: float
    y: float

v = Vec2(1, 2)
v.x + v.z
v.x + {}
This will catch all the problems with the previous code:
error: "Vec2" has no attribute "z"
error: Unsupported operand types for + ("float" and "Dict[<nothing>, <nothing>]")
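As a quick sanity check that the switch is a drop-in replacement, the sketch below exercises the typed Vec2 the same way code written against collections.namedtuple would. The variable names are made up for illustration.

```python
from typing import NamedTuple


class Vec2(NamedTuple):
    x: float
    y: float


v = Vec2(1.0, 2.0)

# Still a tuple: indexing and unpacking keep working.
x, y = v
total = v[0] + v[1]

# Fields are read-only; _replace returns a modified copy instead.
moved = v._replace(x=10.0)

try:
    v.x = 5.0  # type: ignore[misc]  # rejected by mypy, and fails at runtime too
except AttributeError:
    assignment_failed = True
else:
    assignment_failed = False
```

So the immutability the cool kids wanted is preserved, and the only visible difference is that the fields now carry types.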
What if we only used raw tuples? Well, it’s easier:
from typing import Tuple
Vec2 = Tuple[float, float]
v: Vec2 = (1, 2)
v[0] + v[1]
v.x + v.z
The change here is that instead of creating an object of a given type, we need to add a type hint to the newly created variable to indicate its type.
error: "Tuple[float, float]" has no attribute "x"
error: "Tuple[float, float]" has no attribute "z"
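The alias is arguably most useful in function signatures, where callers keep passing plain tuples. A small sketch, reusing the modulus function from the first example (the squared form is assumed here):

```python
from math import sqrt
from typing import Tuple

Vec2 = Tuple[float, float]


def v2_modulus(v: Vec2) -> float:
    x, y = v  # unpacking is checked against the two-element shape
    return sqrt(x ** 2 + y ** 2)


# Callers do not need to construct anything special: a literal tuple will do.
length = v2_modulus((3.0, 4.0))

# The alias is just another name for the same type, not a new class.
is_same_type = Vec2 == Tuple[float, float]
```

Passing a three-element tuple, or a tuple of strings, is the kind of mistake a checker like mypy should now flag at the call site.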
Now imagine that we want to fetch an Account based on its id:
class Account: ...
def get_acc(acc_id: int) -> Account: ...
get_acc(1+2)
Having acc_id defined as an int, a str, or whatever “real” type it could be is a really bad idea. An int is an int and shouldn’t be used as an account identifier, at least from a typing perspective: nothing stops us from passing the result of arbitrary arithmetic, like 1+2 above, as an id. For this kind of situation, we can use NewType.
from typing import NewType

AccountId = NewType("AccountId", int)
class Account: ...
def get_acc(acc_id: AccountId) -> Account: ...
get_acc(AccountId(1)+2)
Adding 2 to an AccountId gives back a plain int, so that would produce an error:
error: Argument 1 to "get_acc" has incompatible type "int"; expected "AccountId"
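For completeness, here is a sketch of the version that type-checks, plus a look at what NewType does at runtime. The Account constructor and the body of get_acc are placeholders I made up; the original only shows their signatures.

```python
from typing import NewType

AccountId = NewType("AccountId", int)


class Account:
    def __init__(self, acc_id: AccountId) -> None:
        self.acc_id = acc_id


def get_acc(acc_id: AccountId) -> Account:
    # Hypothetical lookup; a real project would query some storage here.
    return Account(acc_id)


# Arithmetic on an AccountId yields a plain int, so the result has to be
# wrapped again explicitly before it is accepted as an identifier.
acc = get_acc(AccountId(1 + 2))

# At runtime NewType is an identity function: no wrapper object is created.
is_plain_int = type(acc.acc_id) is int
```

The explicit AccountId(...) call is the whole point: turning a number into an identifier becomes a visible, greppable decision, at essentially no runtime cost.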
Structural Duck-Typing
Believe it or not, duck typing is nothing more than the implicit definition of an interface: when it is not honored, an exception is thrown. What would happen if we have a lot of duck-typed code, like in our Vector2 example:
from dataclasses import dataclass
from typing import NamedTuple
from math import sqrt

@dataclass
class Vector2:
    x: float
    y: float

class Vector3(NamedTuple):
    x: float
    y: float
    z: float

class Vector4:
    def __init__(self, x, y, z, w):
        self.x = x
        self.y = y
        self.z = z
        self.w = w

# Duck-typed: accepts anything that exposes x and y
def modulus(v2):
    return sqrt(v2.x ** 2 + v2.y ** 2)

modulus(Vector2(1,1))
modulus(Vector3(1,1,1))
modulus(Vector4(1,1,1,1))
We really don’t want to start touching those Vector types at all, but what we can do is add some extra typing to modulus to indicate that, semantically, it only works on Vector2-like values, that is, anything exposing x and y.
from typing import Protocol

class V2(Protocol):
    @property
    def x(self) -> float: ...
    @property
    def y(self) -> float: ...

def modulus(v2: V2):
    return sqrt(v2.x ** 2 + v2.y ** 2)
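V2 is a protocol with two read-only (@property) attributes: x and y. The read-only part is important in this example because of the use of NamedTuple in Vector3: NamedTuple fields cannot be assigned to, so a protocol declaring x and y as plain mutable attributes would not accept it.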
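Putting the pieces together, the following sketch runs end to end. It assumes Python 3.8+ for typing.Protocol and uses the squared form of the modulus from the first example. Point1D is a hypothetical type I added only to show what gets rejected.

```python
from dataclasses import dataclass
from math import sqrt
from typing import NamedTuple, Protocol


class V2(Protocol):
    @property
    def x(self) -> float: ...
    @property
    def y(self) -> float: ...


@dataclass
class Vector2:
    x: float
    y: float


class Vector3(NamedTuple):
    x: float
    y: float
    z: float


class Point1D:
    # Hypothetical type with no y attribute, so it does not satisfy V2.
    def __init__(self, x: float) -> None:
        self.x = x


def modulus(v2: V2) -> float:
    return sqrt(v2.x ** 2 + v2.y ** 2)


# Neither class inherits from V2; having x and y is enough.
a = modulus(Vector2(3, 4))
b = modulus(Vector3(3, 4, 12))  # z is simply ignored

# modulus(Point1D(3))  # a checker such as mypy is expected to reject this
```

Note that none of the vector classes had to be touched: the protocol lives next to the function that needs it.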
At the time of writing, structural typing is not supported for modules by mypy: mypy#5018. Worst case, you can wrap modules in classes and call it a day.
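One possible shape for that workaround is sketched below, using the standard math module as the module being wrapped. HasSqrt, MathModule and hypotenuse are all names invented for this illustration.

```python
import math
from typing import Protocol


class HasSqrt(Protocol):
    def sqrt(self, x: float) -> float: ...


class MathModule:
    """Thin class wrapper that forwards to the math module."""

    def sqrt(self, x: float) -> float:
        return math.sqrt(x)


def hypotenuse(m: HasSqrt, a: float, b: float) -> float:
    # Depends only on the protocol, not on the concrete module.
    return m.sqrt(a * a + b * b)


result = hypotenuse(MathModule(), 3.0, 4.0)
```

The wrapper is pure boilerplate, but it gives the checker a class to match against the protocol, and it makes swapping in a fake implementation in tests straightforward.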