Notes on
Powerful Python
by Aaron Maxwell
I’d include the entire “Who This Book Is For” section. It’s great. Here’s a snippet:
The difference between elite engineers and “normal” coders lies in the distinctions they make, the mental models they leverage, and their ability to perceive what others cannot.
1. Scaling with Generators
Before we get into the chapter itself, let me motivate it.
The problem we’re solving is that we have a list of things (numbers, lines in a file, DB rows, etc.), and we want to process them one by one without copying them, loading everything into memory, or caring how they are stored internally.
So Python defines a protocol, split into two concepts:
- Iterables, which are things you can iterate over, and
- Iterators, which are the things that actually yield the items, one by one, maintaining state
In technical terms, you can view an iterable as a factory for iterators, which are stateful cursors over an iterable / data stream.
When Python wants to loop, it does it = iter(obj). If the object defines an __iter__ method, great, then we call it and get an iterator back. If not, we check if the object has __getitem__ to create an iterator using it. And if those aren’t present, we get a TypeError stating that the object is not iterable.
To actually advance through the collection of things, Python uses next. It calls it.__next__(), which returns the value of the next item, or raises StopIteration if there are no more values.
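A quick REPL-style demo of the protocol, using a plain list:

it = iter([10, 20])
it is iter(it)  # True: an iterator's __iter__ returns itself
next(it)        # 10
next(it)        # 20
next(it)        # raises StopIteration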
The chapter opens by contrasting list-building with generators. Returning a list is wasteful when the consumer will process items one-by-one.
from typing import Generator

def fetch_squares(max_root: int) -> list[int]:
    squares = []
    for n in range(max_root):
        squares.append(n**2)
    return squares

MAX = 5
for square in fetch_squares(MAX):
    print(square)  # 0, 1, 4, 9, 16
# Imagine MAX = 10**6 → a huge list built only to be streamed out.
Generators fix that by yielding values on demand, so you start working immediately and keep memory flat.
def gen_nums():
    n = 0
    while n < 4:
        yield n  # exit + re-entry point
        n += 1
    yield 42  # multiple yields are fine; StopIteration is implicit at return

for num in gen_nums():
    print(num)  # 0, 1, 2, 3, 42
gen_nums() returns a generator object. A generator object is an iterator, as it has __iter__ returning itself, and __next__ implementing the suspend/resume logic.
You could write the generator as a class instead, but that would mean more boilerplate code in some cases.
- A call to a generator function returns a generator object (an iterator).
- for desugars to iter() + next(), with StopIteration terminating the loop:

it = iter(something)
while True:
    try:
        v = next(it)
    except StopIteration:
        break
    # ...do something with v...

- An iterable exposes __iter__; an iterator exposes __next__ and usually __iter__ returning itself.
I like the paired classes for contrast:
class A:
    def __init__(self):
        self.n = 2

    def __iter__(self):  # generator method → calling it returns a generator object
        while self.n < 100:
            self.n *= 2
            yield self.n

class B:
    def __init__(self):
        self.n = 2

    def __iter__(self):
        return self  # the instance *is* the iterator

    def __next__(self):
        if self.n > 100:
            raise StopIteration
        self.n *= 2
        return self.n
A instances are iterables whose iterator is the generator object produced by __iter__; B instances are themselves iterators. Note that both keep their state in self.n, so re-iterating either one continues where it left off instead of restarting.
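A small demo of that reuse caveat:

b = B()
print(list(b))  # [4, 8, 16, 32, 64, 128]
print(list(b))  # [] -- self.n is already past 100, so iteration stays exhausted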
Rewriting the squares example as a generator keeps the API but streams results:
def gen_squares(max_root: int) -> Generator[int, None, None]:
    for num in range(max_root):
        yield num**2
Using it mirrors the list version but without the memory balloon:
for square in gen_squares(MAX):
    print(square)
Streaming files line-by-line
Reinforcing the “don’t materialize everything” point: iterate file handles directly instead of calling readlines(). The walrus operator works too, but a plain for line in handle is clearer.
def matching_lines_from_file(path, pattern):
    with open(path) as handle:
        # Stream line by line instead of slurping the whole file.
        for line in handle:
            if pattern in line:
                yield line.rstrip("\n")

        # Alternative with an assignment expression (inside the same `with`);
        # readable, but still more verbose:
        # while (line := handle.readline()) != "":
        #     ...
Composing small generator steps
Treat each step as a reusable pipe: source → filter → sink. The types help document intent.
from typing import Any, Generator

def lines_from_file(path: str) -> Generator[str, None, None]:
    with open(path) as handle:
        for line in handle:
            yield line.rstrip("\n")

def matching_lines(
    lines: Generator[str, Any, Any], pattern: str
) -> Generator[str, None, None]:
    for line in lines:
        if pattern in line:
            yield line

def matching_lines_from_file_2(pattern: str, path: str) -> Generator[str, None, None]:
    lines = lines_from_file(path)
    matching = matching_lines(lines, pattern)
    for line in matching:
        yield line
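Usage mirrors the single-function version; app.log here is a hypothetical path:

for line in matching_lines_from_file_2("ERROR", "app.log"):
    print(line)  # each matching line is printed as soon as it is read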
Fan out: stream characters from lines
Generators compose both breadth- and depth-wise. Here we explode a text stream into a character stream, then count occurrences without ever holding the whole string.
from typing import Generator

text = """...big multi-line string..."""

def gentext():
    for line in text.split("\n"):
        yield line + "\n"

def chars_in_text(lines: Generator[str, None, None]):
    for line in lines:
        for char in line:
            yield char

def count_char_occurrences(
    char_stream: Generator[str, None, None], filter_char: str, normalize: bool = True
) -> int:
    if len(filter_char) != 1:
        raise ValueError("filter_char must be a single character")
    if normalize:
        filter_char = filter_char.lower()  # normalize both sides of the comparison
    n = 0
    for c in char_stream:
        if (c.lower() if normalize else c) == filter_char:
            n += 1
    return n

count_char_occurrences(chars_in_text(gentext()), "p")
Fan in: regroup streamed lines into records
The inverse pattern: accumulate lines into structured records, emitting each record as soon as it’s complete, with a blank line acting as boundary. This keeps memory constant and preserves streaming.
from typing import Generator

text = """
address: 1337 42nd Ave
square_feet: 9999
price_usd: 10000

...property listings...
"""

def gentext():
    for line in text.split("\n"):
        yield line

def house_records(lines: Generator[str, None, None]):
    record = {}
    for line in lines:
        if line == "":
            if not record:  # skip runs of blank lines
                continue
            yield record
            record = {}
            continue
        k, v = line.split(": ", 1)
        record[k] = v
    if record:  # emit a final record that isn't followed by a blank line
        yield record

def house_records_from_text():
    lines = gentext()
    yield from house_records(lines)

for house in house_records_from_text():
    print(house)
Iterators and iterables
Python distinguishes between iterators and iterables.
An object in Python is an iterator if it follows the iterator protocol:
- Defines __next__(), called with no arguments.
- Each time __next__() is called, it produces the next item in the sequence.
- After all items have been produced, raises StopIteration on subsequent calls.
- Defines a boilerplate __iter__(), called with no arguments, that returns the same iterator. The body is just return self.
Here’s roughly how the built-in next() function works:
_NO_DEFAULT = object()

def next2(it, default=_NO_DEFAULT):
    try:
        return it.__next__()
    except StopIteration:
        if default is _NO_DEFAULT:
            raise
        return default
We define a unique sentinel value, _NO_DEFAULT, so the “no default given” case can’t collide with real data (even None is a legitimate default).
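A quick check of both paths:

it = iter([1])
next2(it)             # 1
next2(it, default=0)  # 0; the stream is exhausted, so the default is returned
next2(it)             # raises StopIteration, just like the built-in next()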
An object in Python is an iterable if it either:
- Defines a method called __iter__() that creates and returns an iterator over the elements in the container, or
- Defines __getitem__(), the same method used for square-bracket indexing.
class C:
    def __init__(self, length):
        self.n = length

    def __getitem__(self, i):
        if i > self.n:
            raise IndexError
        return f"a{i}"

w = C(10)
for x in w:
    print(x)  # prints "a0" through "a10"
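Under the hood, iter() wraps a __getitem__-only object in an iterator that calls obj[0], obj[1], and so on until IndexError:

it = iter(w)  # legacy sequence iterator built from __getitem__
next(it)      # 'a0'
next(it)      # 'a1'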
2. Creating Collections With Comprehensions
Nesting loops
colors = ["orange", "purple", "pink"]
toys = ["bike", "basketball", "skateboard", "doll"]

# This
[color + " " + toy for color in colors for toy in toys]

# Is like
arr = []
for color in colors:
    for toy in toys:
        arr.append(color + " " + toy)
Multiple filters
You can keep writing ifs, like below; they chain as if joined by and.
numbers = [9, -1, -4, 20, 17, -3]

odd_positives = [
    num for num in numbers
    if num > 0
    if num % 2 == 1
]

# Same as
odd_positives = [
    num for num in numbers
    if num > 0 and num % 2 == 1
]
Generator expressions (generator comprehensions)
If you want comprehension syntax without blowing up your memory footprint, use a generator expression.
NUM_SQUARES = 10 * 1000 * 1000

# This
generated_squares = (n * n for n in range(NUM_SQUARES))
type(generated_squares)  # <class 'generator'>

# Is equivalent to
def gen_many_squares(limit: int):
    for n in range(limit):
        yield n * n

many_squares = gen_many_squares(NUM_SQUARES)
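To make the footprint difference concrete, compare object sizes with sys.getsizeof (exact numbers vary by Python version, and for the list it measures only the pointer array, not the int objects themselves):

import sys

gen = (n * n for n in range(NUM_SQUARES))
print(sys.getsizeof(gen))  # ~200 bytes, independent of NUM_SQUARES
lst = [n * n for n in range(NUM_SQUARES)]
print(sys.getsizeof(lst))  # tens of megabytes for 10 million entries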
Dictionary comprehension
Similar to list comprehensions but for dictionaries.
lists = [("A", "B"), ("C", "D")]
{k: v for k, v in lists}
Set comprehension
You can create a set with similar notation to dictionary comprehension:
dupes = ["A", "A", "A", "A", "A", "A"]
dupes_removed = {v for v in dupes}
3. Advanced Functions
Variable arguments
# Function showcasing variable arguments
def read_paths(*paths):
    for path in paths:
        with open(path) as handle:
            print(handle.read())

read_paths("../main.py", "../uv.lock")
Argument unpacking
# Function showcasing argument unpacking
def normal_function(a, b, c):
    print(f"a: {a} b: {b} c: {c}")

numbers = (7, 5, 3)
normal_function(*numbers)
Keyword unpacking
It’s called keyword argument unpacking.
numbers = {"a": 7, "b": 5, "c": 3}

def normal_function(a, b, c):
    print(f"a: {a} b: {b} c: {c}")

normal_function(**numbers)
Combining positional and keyword arguments
You can combine positional and keyword arguments, and you can also combine argument unpacking with keyword argument unpacking.
def general_function(*args, **kwargs):
    for arg in args:
        print(arg)
    for k, v in kwargs.items():
        print(f"{k}: {v}")

pos_ins = ["foo", "bar"]
kw_ins = {"x": 7, "y": 33}
general_function(*pos_ins, **kw_ins)

# You could also just write out the inputs fully, like
# general_function("foo", "bar", x=7, y=33)
4. Decorators
You don’t need the types to write a decorator. I’ve added them for reference.
from __future__ import annotations

from functools import wraps
from typing import Callable, ParamSpec, TypeVar

P = ParamSpec("P")
R = TypeVar("R")

def my_decorator(func: Callable[P, R]) -> Callable[P, R]:
    @wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        func_name = getattr(func, "__name__", "<callable>")
        print(f"Before the function '{func_name}' is called.")
        result = func(*args, **kwargs)
        print(f"After the function '{func_name}' is called.")
        return result
    return wrapper

@my_decorator
def say_hello() -> None:
    print("Hello!")

if __name__ == "__main__":
    say_hello()
Sometimes wrapper needs self as an explicit argument, for instance when the decorator is only meant for methods and the wrapped behavior depends on the current object. That forces func to take at least one positional parameter, so it isn’t always appropriate; consider whether you really need the instance. A sketch follows.
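A minimal sketch of a methods-only decorator, where wrapper names self explicitly (log_method_call and Greeter are my examples, not the book’s):

from functools import wraps

def log_method_call(func):
    @wraps(func)
    def wrapper(self, *args, **kwargs):  # assumes func is always a method
        print(f"{type(self).__name__}.{func.__name__} called")
        return func(self, *args, **kwargs)
    return wrapper

class Greeter:
    @log_method_call
    def hello(self) -> None:
        print("Hello!")

Greeter().hello()  # prints "Greeter.hello called", then "Hello!"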
If we decide to keep state for each callable we decorate, how might we access it? The key here is that functions are objects, so we can assign attributes to the function objects.
Like before, I’ve added types for reference.
from __future__ import annotations

from functools import wraps
from typing import (
    Callable,
    Generic,
    ParamSpec,
    Protocol,
    Self,
    TypedDict,
    TypeVar,
    cast,
)

P = ParamSpec("P")
R = TypeVar("R")

class Addable(Protocol):
    def __add__(self, other: Self) -> Self: ...

TAddable = TypeVar("TAddable", bound=Addable)

class Stats(TypedDict, Generic[TAddable]):
    total: TAddable | None
    count: int

class StatsCallable(Protocol[P, TAddable]):
    data: Stats[TAddable]

    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> TAddable: ...

def collect_stats(func: Callable[P, TAddable]) -> StatsCallable[P, TAddable]:
    data: Stats[TAddable] = {"total": None, "count": 0}

    @wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> TAddable:
        val = func(*args, **kwargs)
        if data["total"] is None:
            data["total"] = val
        else:
            data["total"] = data["total"] + val
        data["count"] += 1
        return val

    wrapped = cast(StatsCallable[P, TAddable], wrapper)
    wrapped.data = data
    return wrapped

@collect_stats
def add(a: int, b: int) -> int:
    return a + b

if __name__ == "__main__":
    add(1, 2)
    add(3, 4)
    add(5, 6)
    print(add.data)  # {'total': 21, 'count': 3}
What if you want to count the calls of a callable with a simple integer count?
class CountCallsCallable(Protocol[P, R]):
    call_count: int

    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R: ...

def count_calls(func: Callable[P, R]) -> CountCallsCallable[P, R]:
    call_count = 0

    @wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        # `nonlocal` means "use the nearest enclosing `call_count`".
        # It lets us reassign it inside this nested function.
        nonlocal call_count
        call_count += 1
        setattr(wrapper, "call_count", call_count)
        return func(*args, **kwargs)

    wrapped = cast(CountCallsCallable[P, R], wrapper)
    setattr(wrapped, "call_count", 0)
    return wrapped
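A quick check of the counter (greet is my example):

@count_calls
def greet(name: str) -> str:
    return f"Hello, {name}!"

greet("Ada")
greet("Grace")
print(greet.call_count)  # 2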
nonlocal wasn’t needed in collect_stats because we never reassigned the name data; we only mutated the dict it refers to.
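To see why it’s needed here: rebinding a name inside a nested function makes that name local, so without nonlocal the increment fails. A minimal sketch, using my own names:

def make_counter():
    count = 0
    def bump():
        count += 1  # UnboundLocalError: the assignment makes `count` local to bump
        return count
    return bump

make_counter()()  # raises UnboundLocalError; adding `nonlocal count` fixes it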
One way to sidestep nonlocal entirely is to use a class that holds the state instead:
from dataclasses import dataclass

@dataclass(slots=True)  # generates __slots__ from the fields; our explicit __init__ is kept
class CountCalls(Generic[P, R]):
    call_count: int
    _func: Callable[P, R]

    def __init__(self, func: Callable[P, R]) -> None:
        self._func = func
        self.call_count = 0

    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R:
        self.call_count += 1
        return self._func(*args, **kwargs)

def count_calls_class(func: Callable[P, R]) -> CountCalls[P, R]:
    return CountCalls(func)

# You can decorate with either:
@count_calls_class
def foo():
    pass

@CountCalls
def bar():
    pass
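And a quick check of the class-based counter:

foo()
foo()
print(foo.call_count)  # 2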