Notes on

The Pragmatic Programmer: From Journeyman to Master

by Andrew Hunt and David Thomas


On what a pragmatic programmer is

You shouldn’t be wedded to any particular technology, but have a broad enough background and experience base to allow you to choose good solutions in particular situations. Your background stems from an understanding of the basic principles of computer science, and your experience comes from a wide range of practical projects. Theory and practice combine to make you strong. You adjust your approach to suit the current circumstances and environment. You judge the relative importance of all the factors affecting a project and use your experience to produce appropriate solutions. And you do this continuously as the work progresses. Pragmatic Programmers get the job done, and do it well.

Think! About Your Work

We should avoid programming by coincidence—relying on luck and accidental successes—in favor of programming deliberately.

Think about what you're doing, while you're doing it. Never run on autopilot. Constantly be thinking, critiquing your work in real time.

Don't assume it, prove it. Diagnosing a bug? Think you've found a pattern that indicates the cause? Don't assume it, prove it.

How to program deliberately

  • Know exactly what you're doing and why
  • If you can't explain what you're doing, you don't understand it
  • Understand the technology you use
  • Work from a plan
  • Avoid depending on assumptions regarding reliability ("there's always a network connection", "I can always write to this folder", "this and that env is true"). Better to have proof. If you don't know something is reliable, assume the worst.
  • Document your assumptions.
  • Test both your code & assumptions
  • Work on the most important tasks first. You need to know which ones those are.
  • Have the fundamentals & infrastructure in order.
  • Don't let existing code dictate future code. Code is replaceable when it isn't appropriate anymore. Be ready to refactor.

Think beyond the immediate problem

Place it in its larger context & seek out the bigger picture.

You have agency

This goes for any aspect of life. If you want change, go get it!

You have more power than you think.

  • Work sucks? Boring job? Try to fix it, or get another one.
  • Feeling behind? Study in your own time, and look at what interests you.
  • Want to work remotely? Ask! If you get rejected, find someone who'll say yes.

Take ownership

Of yourself. Of your actions in terms of career advancement, learning and education, your project, and your day-to-day work.

Provide options, not lame excuses

When you need to give bad news (something can’t be done, is late, etc.), make sure you give options. Don’t just say “this doesn’t work”. That’s useless.

If you find yourself saying "I don't know," follow it up with "—but I'll find out."

It's okay to not know things, but now you're also taking responsibility for it.

Don't live with broken windows

This comes from an analogy. A window in a building that stays broken for long enough starts a chain of “rot”. People start breaking other windows and leaving litter.

That’s to say, if there’s mold, it’ll spread. Fix it when you see it.

If there’s already rot in your codebase, don’t make more mess trying to clean it up.

Keep an eye on the big picture

Constantly review what's happening around you, not just what you personally are doing.

Good-enough software

Write software that is good enough. You'll be more productive & your users will be happier.

"Good enough" doesn't mean sloppy or poorly produced. It means after you've filled the basic needs, let users help you decide when what you've built is good enough for their needs.

Great software today is often preferable to the fantasy of perfect software tomorrow. If you give your users something to play with early, their feedback will often lead you to a better eventual solution.

Learn when to stop cleaning up, adding features, optimizing, and so on. "Move on, and let your code stand in its own right for a while. It may not be perfect. Don’t worry: it could never be perfect."

Your Knowledge Portfolio

A successful career requires continuous learning. Learn often. Learn broadly. Learn high-value skills.

It's pretty similar to investing:

  • Diversify: learn broadly
  • Invest regularly: learn regularly
    • Take classes
    • Stay current: read news and posts online on technology different from that of your current project. See what experiences others are having, what jargon they use, etc.
    • Always keep learning
  • Manage risk: learn both the fundamentals and the bleeding edge
  • Buy low, sell high: learn to spot emerging tech
    • Look at what the nerds are playing around with today. That’ll be mainstream in the future. Discern with a critical eye, though.

Some practical suggestions:

  • Learn at least one new language every year
  • Read a technical book each month
    • There are good short-form essays, but for deep understanding, you need long-form books.
    • "After you’ve mastered the technologies you’re currently using, branch out and study some that don’t relate to your project."
  • Read nontechnical books, too
  • Participate in local user groups & meetups
  • Experiment with different environments
    • Windows user? Try Linux.
    • VSCode user? Try Neovim.

Thinking clearer

The following are some tips in the form of questions you can use to think clearer & deeper.

Critically analyze what you read and hear. Following the money is usually a good start. Who benefits?

What's the context? Lots of people make claims, give advice, and say what’s best. But that comes from their own experience, and is usually specific to their situation. “Next.js is best.” It’s good, sure. But best? In what context? For whom? For which goals?

When or where would this work?

Why is this a problem? Is there an underlying model? How does it work?

Communication

Having the best ideas, the finest code, or the most pragmatic thinking is ultimately sterile unless you can communicate with other people.

English is just another programming language. Treating it as such also helps you write better (DRY, ETC, automation, and so on).

Know your audience: don't speak technical language to nontechnical people.

On writing:

Plan what you want to say. Write an outline. Then ask yourself, “Does this communicate what I want to express to my audience in a way that works for them?” Refine it until it does.

Timing: Think about timing. Is what you’re going to say appropriate right now? And not just that: is this the best time to discuss it? Does the recipient have more important things to consider? Is it important to you to say or for them to hear right now?

Build Documentation In, Don’t Bolt It On

  • You can create documentation from comments in code – there's no excuse!
  • Add comments to modules and exported functions, so other developers can navigate and use them
  • Not every function needs a comment:
    • Add explanatory comments to API
    • Add only why, technical reasoning, etc. to internal code

Good design

The Easier To Change (ETC) Principle: Good Design Is Easier to Change Than Bad Design. They claim that most design principles are just special cases of their Easier To Change principle. I don't think that's wrong. But some of them also exist to lessen the mental burden of reading or writing the code. Perhaps that’s part of “easier”.

If you aren't sure which form change will take, try to fall back on the ultimate "easy to change" path: try to make what you write replaceable.

Don't Repeat Yourself (DRY)

We feel that the only way to develop software reliably, and to make our developments easier to understand and maintain, is to follow what we call the DRY principle: Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

I could agree with this definition.

Don’t repeat yourself is often used in pursuit of NEVER EVER writing the same pattern twice, i.e., no 4 (or any x) lines can repeat. I don’t agree with that.

But using knowledge as the delineator allows for certain freedoms. It depends on what a piece of knowledge is.

  • Don’t want two AppConfig structs that load configuration from environment variables.
  • Could want similar procedures that describe different processes that just have similar steps - and that may change distinctly in the future

The alternative is to have the same thing expressed in two or more places. If you change one, you have to remember to change the others, or, like the alien computers, your program will be brought to its knees by a contradiction. It isn’t a question of whether you’ll remember: it’s a question of when you’ll forget.

And it’s just a pain to change the same thing in all the different places.

DRY doesn't mean "don't copy-and-paste lines of code." That's part of DRY, but it's a small and trivial part. It's about duplication of knowledge, of intent. Expressing the same thing in multiple, different places, potentially in different ways. Do you have to change code similarly in multiple places to change the same thing in your system? You should stop repeating yourself.

Not All Code Duplication Is Knowledge Duplication.

Performance reasons can be good reasons to violate DRY. Just try to localize impact.

When possible, use accessor functions to read and write the attributes of objects. It'll make it easier to add functionality in the future. Any time a module exposes a data structure, all code that uses that structure is coupled to the specific implementation of that module.

So: when you expose a data structure in a module, it’s a good idea to use methods to facilitate access to its data. This means you can vary the underlying representation without compromising the API.

Grokking Simplicity had a nice part about this. I can’t remember the exact example, but imagine you start by representing a set of keys and values as an array of tuples. That’s fair, but later you realize that a dictionary would be better. If you had exposed the array property in the data structure as the “official” API to the data, you can’t change it without making breaking changes. However, if you had exposed it through a function, you could change the internal representation to a dictionary and just compute the array for users. Best of both. (But the cost of that computation may make it a bad trade; you need to consider whether making the breaking change is the better option.)
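
A minimal sketch of that idea, assuming a hypothetical `Settings` class (the name and its methods are mine, not from either book). Callers only see `get`, `set`, and `entries`, so the internal `Map` could replace an earlier array of tuples without breaking anyone:

```typescript
// Hypothetical example: a small key-value collection whose internal
// representation is hidden behind methods.
class Settings {
	// Internal representation. Callers never touch this directly, so it could
	// start life as an array of tuples and later become a Map without any
	// change to the public methods.
	private readonly data = new Map<string, string>();

	set(key: string, value: string): void {
		this.data.set(key, value);
	}

	get(key: string): string | undefined {
		return this.data.get(key);
	}

	// Computes the "array of tuples" view on demand for callers that want it.
	entries(): [string, string][] {
		return Array.from(this.data.entries());
	}
}

const settings = new Settings();
settings.set("theme", "dark");
settings.set("language", "en");
settings.set("theme", "light"); // overwrites the earlier value

console.log(settings.get("theme")); // prints "light"
console.log(settings.entries().length); // prints 2
```

The trade-off mentioned above still applies: `entries()` builds a new array on every call.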

Make it easy to reuse. Easier than rewriting it elsewhere. "Make it easy" is always the advice to enforce good habits — just see Atomic Habits.

Orthogonality

Orthogonality in code: independence between systems. Changes in one don't affect the other.
Other words for orthogonality: modularity, component-based, layered.

Eliminate Effects Between Unrelated Things.

Create systems that are independent of each other. Changing one shouldn’t require changing the other.

Keep your code decoupled: don’t expose module internals.

Avoid global data. Use dependency injection or something. Pass structs around with configurations. And so on.

Avoid similar functions. Similar functions can use the strategy pattern. I also like to return functions, or take callbacks.
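
A small sketch of the callback flavor of the strategy pattern. The pricing rules, names, and numbers are all hypothetical; the point is one `checkout` function taking a strategy instead of several near-identical functions:

```typescript
// A strategy is just a function that turns a subtotal into a final price.
type PricingStrategy = (subtotal: number) => number;

// Hypothetical strategies; the rates and thresholds are made up.
const regularPrice: PricingStrategy = (subtotal) => subtotal;
const memberPrice: PricingStrategy = (subtotal) => subtotal * 0.9;

// Returning a function lets us configure a strategy without global data.
function bulkDiscount(threshold: number, amountOff: number): PricingStrategy {
	return (subtotal) => (subtotal >= threshold ? subtotal - amountOff : subtotal);
}

// One checkout function instead of three similar ones.
function checkout(prices: number[], strategy: PricingStrategy): number {
	const subtotal = prices.reduce((sum, price) => sum + price, 0);
	return strategy(subtotal);
}

const cart = [40, 60];
console.log(checkout(cart, regularPrice)); // prints 100
console.log(checkout(cart, memberPrice)); // prints 90
console.log(checkout(cart, bulkDiscount(100, 15))); // prints 85
```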

Reversibility

Decisions that are easily reversible are easy to make. You can always change your mind if it doesn’t work out.

And our ideas don’t always work out. Our first guess or choice can easily be wrong, or the requirements change later, so we need to pick something else to do the job.

This is why making decisions easily reversible is important.

For example, if you abstract away exactly which, or what type, of database you’re using, it’s easier to reverse and pick something else.
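
As a sketch of what that can look like, assuming a hypothetical `UserStore` interface (all names here are illustrative): the application code depends only on the interface, so swapping the in-memory backend for a real database later means writing one new class.

```typescript
interface UserRecord {
	id: number;
	displayName: string;
}

// The rest of the app depends only on this interface, not on a database.
interface UserStore {
	save(user: UserRecord): void;
	findById(id: number): UserRecord | undefined;
}

// One possible backend: an in-memory Map. A SQL- or file-backed class
// could implement the same interface later.
class InMemoryUserStore implements UserStore {
	private readonly users = new Map<number, UserRecord>();

	save(user: UserRecord): void {
		this.users.set(user.id, user);
	}

	findById(id: number): UserRecord | undefined {
		return this.users.get(id);
	}
}

// Application code receives the store instead of constructing one.
function greet(store: UserStore, id: number): string {
	const user = store.findById(id);
	return user ? `Hello, ${user.displayName}!` : "Hello, stranger!";
}

const userStore = new InMemoryUserStore();
userStore.save({ id: 1, displayName: "Ada" });
console.log(greet(userStore, 1)); // prints "Hello, Ada!"
console.log(greet(userStore, 2)); // prints "Hello, stranger!"
```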

Tracer bullets

Use Tracer Bullets to Find the Target.
It’s like building the full-stack solution to some requirement, but as a feature MVP. Just to prove it works and how you’d do it.

Get feedback fast on the important, but somewhat uncertain requirements.

It isn’t prototyping. It's more like building the first version of some part of your system. It’s the beginning of some incremental process.

Estimating

Being able to do back-of-the-envelope estimations is a great skill to have. You'll have an intuitive feel for the magnitudes of things. It'll be easier to determine feasibility.

Estimate to avoid surprises.

Guide to estimating:

  1. Understand the Query
    • The first part of any estimation exercise is to understand what’s being asked.
  2. Build a Mental Model
    • Create a simple representation of the system or scenario.
    • This aids in understanding and can reveal hidden patterns or alternatives.
  3. Decompose into Components
    • Identify and define the parts of your model.
    • Determine how each component affects the outcome.
  4. Assign Values to Parameters
    • Estimate numerical values for each component.
    • Focus on critical parameters that significantly impact results, like multipliers.
  5. Perform Calculations
    • Calculate outcomes using your values.
    • Use tools like spreadsheets to handle complex systems and vary parameters to see effects.
  6. Present Estimates with Context
    • Provide estimates with qualifiers that reflect uncertainty.
    • Example: “Response time is about 0.75 seconds with SSDs and 32GB of memory, and 1 second with 16GB memory.”
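
Steps 3 to 6 can be sketched in a few lines. The scenario (log storage per day) and every number below are made up purely for illustration; the idea is to carry a low/high range through the calculation and present the result with its assumptions:

```typescript
// A parameter with an explicit low/high range instead of a single guess.
interface Estimate {
	low: number;
	high: number;
}

// Multiplying two ranges: the lows give the best case, the highs the worst.
function multiply(a: Estimate, b: Estimate): Estimate {
	return { low: a.low * b.low, high: a.high * b.high };
}

// Hypothetical components of the model.
const requestsPerDay: Estimate = { low: 1000000, high: 3000000 };
const bytesPerLogLine: Estimate = { low: 200, high: 500 };

const bytesPerDay = multiply(requestsPerDay, bytesPerLogLine);
const toGB = (bytes: number): number => bytes / 1e9;

// Present the estimate with its qualifiers.
console.log(
	`Roughly ${toGB(bytesPerDay.low)} to ${toGB(bytesPerDay.high)} GB of logs per day, ` +
		"assuming one log line per request.",
);
```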

When asked for an estimate, ask for time. Say "I'll get back to you."

Tools

  • Plain text is great. Use it!
  • Learn to use the shell. Prefer it over GUIs – it's faster.
  • Get good at using your editor.
    • "over the course of a year, you might actually gain an additional week if you make your editing just 4% more efficient and you edit for 20 hours a week."
  • Always use version control.
  • Learn a text manipulation language.
  • Use engineering daybooks! I do this with Obsidian.
    • "We use daybooks to take notes in meetings, to jot down what we’re working on, to note variable values when debugging, to leave reminders where we put things, to record wild ideas, and sometimes just to doodle."
    • Some benefits:
      • Better than trying to remember every small thing that may be useful
      • Combat the Zeigarnik Effect
      • Rubber ducking
      • Documenting work you’ve done, contributions you’ve made

Debugging

Debugging is just problem-solving. Attack it as such.

Fix the problem, not the blame. It doesn't matter if the bug is your fault or not. It's still your problem.

Don’t panic. Step back and think about what could be causing the symptoms you believe indicate a bug.

Don’t waste a single neuron on the train of thought that begins “but that can’t happen” because quite clearly it can, and has.

Solve root causes, don’t just address symptoms. The actual cause may very well be several steps removed from what you are observing, involve other related things, and so on. Always try to find the root cause.

  1. Check if your code builds without warnings.
  2. Gather information. All the relevant data.
  3. Reproduce: make it easy to verify and check. Can you make it readily reproducible?
  4. "Failing Test Before Fixing Code"
  5. Read the error message. So many people just don’t do this, and it actively hinders them.
  6. Use the debugger & use your failing test to trigger the problem.
  7. Use tracing statements – print debugging.
  8. Rubber ducking works. I’ve found writing very useful. Note all my assumptions and ideas. Works fast usually.
  9. Assume it’s your code that’s broken before assuming it might be the library that's broken. It's probably your code.
  10. Process of elimination works.
  11. "Don't Assume It–Prove It"

And generally also: fix bug, determine cause, prevent future occurrences, communicate with team.

Pragmatic paranoia

You can't write perfect software. Neither can others.

Coding defensively because others may be wrong:

  • Validate information
  • Use assertions
  • Distrust data from potential attackers
  • Check for consistency
  • Put constraints on database columns
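
For the first and third bullets, a minimal validation sketch. The `SignupRequest` shape and its rules are hypothetical; the point is that incoming data is typed as `unknown` and every field is checked before the rest of the code trusts it:

```typescript
interface SignupRequest {
	email: string;
	age: number;
}

type Validation =
	| { ok: true; value: SignupRequest }
	| { ok: false; problems: string[] };

// Treats the input as unknown and checks every field before trusting it.
function parseSignup(input: unknown): Validation {
	if (typeof input !== "object" || input === null) {
		return { ok: false, problems: ["body must be an object"] };
	}

	const { email, age } = input as Record<string, unknown>;
	const problems: string[] = [];

	if (typeof email !== "string" || !email.includes("@")) {
		problems.push("email must be a string containing '@'");
	}
	if (typeof age !== "number" || !Number.isInteger(age) || age < 0 || age > 150) {
		problems.push("age must be an integer between 0 and 150");
	}

	if (problems.length > 0) {
		return { ok: false, problems };
	}
	return { ok: true, value: { email: email as string, age: age as number } };
}

console.log(parseSignup({ email: "ada@example.com", age: 36 }).ok); // prints true
console.log(parseSignup({ email: "not-an-email", age: -1 }).ok); // prints false
```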

But you should code defensively against yourself as well.

Exceptions

If exceptions occur, crash asap. Don’t want to continue in a corrupted state.

I think it's even better if there’s some fault tolerance in the overall system and there’s a supervisor node that can restart the program to reset it.

Assertive programming

Use Assertions to Prevent the Impossible.

  • "Logging can't fail"
  • "This won't be used abroad"
  • "Count can't be negative"

Don't fool yourself. Be very careful thinking that something can’t happen.

I like to recall Richard P. Feynman's quote here:

The first principle is that you must not fool yourself - and you are the easiest person to fool.

If you're thinking "that could never happen," add code to check it. For example with assertions. Asserts are good for checking if the “impossible” has happened. They aren’t replacements for actual error handling. Only use them for the “impossible.”
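
A sketch of such a check, using a hypothetical `invariant` helper (TypeScript has no built-in assert statement, so the helper and its name are my own):

```typescript
// Hypothetical helper: throws when an "impossible" condition turns out
// to be possible after all.
function invariant(condition: boolean, message: string): asserts condition {
	if (!condition) {
		throw new Error(`Invariant violated: ${message}`);
	}
}

function removeFromStock(count: number, quantity: number): number {
	const remaining = count - quantity;
	// "Count can't be negative", so check it instead of assuming it.
	invariant(remaining >= 0, `stock count went negative (${remaining})`);
	return remaining;
}

console.log(removeFromStock(5, 2)); // prints 3
```

Calling `removeFromStock(1, 2)` would throw instead of silently continuing with a count of -1. Expected failures, such as a user requesting more than is in stock, still deserve real error handling before this point.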

Leave assertions turned on.
As long as you’ve tested, assertions aren’t necessarily needed, right? Sure, if you have covered everything that could go wrong. You likely haven’t.

So keep assertions in. While they do have overhead, you can’t possibly check all permutations of events, errors, and inputs.

Can turn off assertions that hurt performance when it’s needed.

Rules for balancing resources

Deallocate what you allocate – "finish what you start."

Whatever allocates should also deallocate.
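
One way to sketch this is a wrapper that both opens and closes the resource, so callers can't forget the second half. `FakeConnection` is a stand-in I made up for a file handle, database connection, or lock:

```typescript
// Hypothetical resource that counts how many instances are currently open.
class FakeConnection {
	static openCount = 0;

	constructor() {
		FakeConnection.openCount += 1;
	}

	query(sql: string): string {
		return `result of ${sql}`;
	}

	close(): void {
		FakeConnection.openCount -= 1;
	}
}

// The function that allocates is also the one that deallocates.
function withConnection<T>(use: (conn: FakeConnection) => T): T {
	const conn = new FakeConnection();
	try {
		return use(conn);
	} finally {
		conn.close(); // runs whether `use` returns or throws
	}
}

const answer = withConnection((conn) => conn.query("SELECT 1"));
console.log(answer); // prints "result of SELECT 1"
console.log(FakeConnection.openCount); // prints 0
```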

Don't outrun your headlights

Don't get ahead of yourself.

It’s tough to make predictions, especially about the future.
Lawrence "Yogi" Berra, after a Danish Proverb

The general theme here is our inability to predict the future.

Take small steps. One thing at a time. Get feedback quickly. A task is too big when you start trying to predict the future.

Avoid premature optimization.

Take small, deliberate steps, and check for feedback & adjust before proceeding. Consider the rate of feedback as your speed limit. You can get feedback by, e.g.:

  • Writing quick scripts / using a Read-Eval-Print-Loop (REPL)
  • Writing unit tests
  • Demo for users & talk to them

When are you trying to predict the future? When you are:

  • Estimating months into the future
  • Planning a design for future maintenance/extensibility
  • Guessing users' future needs
  • Guessing future tech availability

The present looks like the past, until it doesn't.

If you can't program it..

If you can’t describe what you are doing as a process, you don’t know what you’re doing.
W. Edwards Deming, (attr)

Corollary: if you can’t program it, you don’t understand it.

Writing flexible code

Decoupled code is easier to change.

Coupling can occur anytime two pieces of code share something.

Tips

  • Don't couple externals to the internals of an object. Make a good API and avoid leaky abstractions.
  • Avoid chaining method calls. Try not to have more than one . when you access something. This also covers where you're using intermediate variables.
    • Doesn't apply if the things you're chaining are really, really unlikely to change.
    • Pipelines are fine. They do introduce some coupling, since the data passed to one step usually depends on the format produced by the previous step, but they're usually cleaner than method chaining.
  • Globally accessible state is usually pretty bad. Avoid global data.
    • Singletons are also bad.
    • This includes external resources. If your app uses a database, service API, etc., make sure you wrap the resource behind code you control.
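
To make the first two tips concrete, here is a sketch with hypothetical `Order` and `Customer` classes. Instead of reaching through `customer.orders` into an order's totals, the caller asks each object to do the work. The final call still has two dots, but both go through public methods rather than internals; whether to hide that behind a single `Customer` method is a judgment call.

```typescript
class Order {
	readonly id: number;
	private readonly subtotal: number;
	private discount = 0;

	constructor(id: number, subtotal: number) {
		this.id = id;
		this.subtotal = subtotal;
	}

	// The order owns the discount rule, so callers never edit totals directly.
	applyDiscount(amount: number): void {
		this.discount = Math.min(amount, this.subtotal);
	}

	grandTotal(): number {
		return this.subtotal - this.discount;
	}
}

class Customer {
	private readonly orders = new Map<number, Order>();

	addOrder(order: Order): void {
		this.orders.set(order.id, order);
	}

	// Callers ask for an order rather than walking the internal collection.
	findOrder(id: number): Order {
		const order = this.orders.get(id);
		if (!order) {
			throw new Error(`No order with id ${id}`);
		}
		return order;
	}
}

const customer = new Customer();
customer.addOrder(new Order(7, 120));
customer.findOrder(7).applyDiscount(20);
console.log(customer.findOrder(7).grandTotal()); // prints 100
```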

Event driven systems can help with flexibility.

Event Driven Systems

Strategies for coding event driven systems

  1. Finite State Machines
  2. The Observer Pattern
  3. Publish/Subscribe
  4. Reactive Programming and Streams

Finite State Machines

You can express FSMs as data, e.g., a table, by representing the states as rows, and events as columns. To find out which transition to make when an event occurs, look up the row for the current state & scan along for the column representing the event. The contents of the cell will be the new state.

But that kind of FSM is just an event stream parser. It just outputs the final state. We can make it more powerful by adding actions that are triggered on certain transitions.
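
The plain table lookup, before any actions are added, can be sketched like this. The traffic-light states and the two event names are my own illustration:

```typescript
type LightState = "Red" | "Green" | "Yellow";
type LightEvent = "timer" | "emergency";

// Rows are states, columns are events, cells are the next state.
const TRANSITION_TABLE: Record<LightState, Record<LightEvent, LightState>> = {
	Red: { timer: "Green", emergency: "Red" },
	Green: { timer: "Yellow", emergency: "Red" },
	Yellow: { timer: "Red", emergency: "Red" },
};

// Look up the row for the current state, then the column for the event.
function nextState(current: LightState, event: LightEvent): LightState {
	return TRANSITION_TABLE[current][event];
}

// Feeding a stream of events through the table yields only the final state.
function runEvents(initial: LightState, events: LightEvent[]): LightState {
	let state = initial;
	for (const event of events) {
		state = nextState(state, event);
	}
	return state;
}

console.log(runEvents("Red", ["timer", "timer"])); // prints "Yellow"
console.log(runEvents("Red", ["timer", "emergency"])); // prints "Red"
```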

Here's a TypeScript implementation of an FSM with actions on transitions:

import { EventEmitter } from "node:events";

interface State<TEventData> {
	onEnter?: (eventData?: TEventData) => void | Promise<void>;
	onExit?: (eventData?: TEventData) => void | Promise<void>;
}

interface Transition<TEventData> {
	from: string;
	to: string;
	condition?: (eventData: TEventData) => boolean;
	action?: (eventData: TEventData) => void | Promise<void>;
}

class FSM<TEventData> extends EventEmitter {
	private states: Map<string, State<TEventData>>;
	private transitions: Transition<TEventData>[];
	private currentState: string;

	constructor(initialState: string) {
		super();
		this.states = new Map();
		this.transitions = [];
		this.currentState = initialState;
	}

	public addState(name: string, state: State<TEventData>): void {
		this.states.set(name, state);
	}

	public addTransition(transition: Transition<TEventData>): void {
		this.transitions.push(transition);
	}

	public async handleEvent(eventData: TEventData): Promise<void> {
		const possibleTransitions = this.transitions.filter(
			(t) => t.from === this.currentState,
		);
		const transition = possibleTransitions.find(
			(t) => !t.condition || t.condition(eventData),
		);

		if (transition) {
			await this.transitionTo(transition.to, eventData, transition.action);
		} else {
			console.error(
				`No valid transition found from state ${this.currentState} with event data: ${JSON.stringify(eventData)}`,
			);
		}
	}

	private async transitionTo(
		stateName: string,
		eventData?: TEventData,
		action?: (eventData: TEventData) => void | Promise<void>,
	): Promise<void> {
		const currentStateObj = this.states.get(this.currentState);
		const nextStateObj = this.states.get(stateName);

		if (!nextStateObj) {
			throw new Error(`State ${stateName} does not exist`);
		}

		const data = eventData as TEventData;

		try {
			await currentStateObj?.onExit?.(data);
			await action?.(data);
			this.currentState = stateName;
			await nextStateObj.onEnter?.(data);
			this.emit("stateChanged", this.currentState);
		} catch (error) {
			console.error(`Error during transition: ${error}`);
		}
	}
}

// --- EXAMPLE ---

interface TimerEvent {
	time: number;
}

type StateName = "Red" | "Green" | "Yellow";

const STATE_DURATIONS: Record<StateName, number> = {
	Red: 5,
	Green: 3,
	Yellow: 2,
};

const TRANSITIONS: Transition<TimerEvent>[] = [
	{ from: "Red", to: "Green" },
	{ from: "Green", to: "Yellow" },
	{ from: "Yellow", to: "Red" },
];

const initialState: StateName = "Red";
const fsm = new FSM<TimerEvent>(initialState);

function addTimedState(name: StateName, duration: number) {
	fsm.addState(name, {
		onEnter: () => {
			logStateChange(name, "enter");
			setTimeout(() => fsm.handleEvent({ time: duration }), duration * 1000);
		},
		onExit: () => logStateChange(name, "exit"),
	});
}

function logStateChange(state: StateName, action: "enter" | "exit") {
	console.log(`${action === "enter" ? "->" : "<-"} ${state}`);
}

function setupFSM() {
	type Entry = [StateName, number];
	for (const [state, duration] of Object.entries(STATE_DURATIONS) as Entry[]) {
		addTimedState(state, duration);
	}

	for (const transition of TRANSITIONS) {
		fsm.addTransition(transition);
	}
}

// Initialize and start the FSM
setupFSM();
fsm.handleEvent({ time: 0 });

Observer Pattern

The observer pattern is rather simple. We have a source of events, called the observable, and a list of clients, called observers, who are interested in the events. Observers register their interest with the observable, typically by passing a reference to the function to be called. When the event occurs, the observable iterates through its list of observers and calls the function each has passed.

This pattern does have a notable problem in that each observer has to register with the observable, thereby introducing coupling. And because the callbacks are typically handled inline by the observable, synchronously, it can introduce performance bottlenecks. This is solved by the publish/subscribe strategy.

Here's an example implementation in TypeScript:

// Observer interface declares the update method, used by subjects.
interface Observer<T> {
	// Receive update from subject.
	update(subject: Subject<T>): void;
}

// Subject interface declares a set of methods for managing subscribers.
interface Subject<T> {
	// Attach an observer to the subject.
	attach(observer: Observer<T>): void;

	// Detach an observer from the subject.
	detach(observer: Observer<T>): void;

	// Notify all observers about an event.
	notify(): void;

	// Get the current state, used by observers.
	getState(): T;
}

// AbstractSubject provides default implementations for managing subscribers.
abstract class AbstractSubject<T> implements Subject<T> {
	private observers: Observer<T>[] = [];

	attach(observer: Observer<T>): void {
		const isExistingObserver = this.observers.includes(observer);
		if (isExistingObserver) {
			console.log("Subject: Observer has been attached already.");
			return;
		}

		this.observers.push(observer);
		console.log("Subject: Attached an observer.");
	}

	detach(observer: Observer<T>): void {
		const observerIndex = this.observers.indexOf(observer);
		if (observerIndex === -1) {
			console.log("Subject: Nonexistent observer.");
			return;
		}

		this.observers.splice(observerIndex, 1);
		console.log("Subject: Detached an observer.");
	}

	notify(): void {
		console.log("Subject: Notifying observers...");
		for (const observer of this.observers) {
			observer.update(this);
		}
	}

	abstract getState(): T;
}

// ConcreteSubject stores state of interest to ConcreteObserver objects.
class ConcreteSubject<T> extends AbstractSubject<T> {
	private state: T;

	constructor(initialState: T) {
		super();
		this.state = initialState;
	}

	getState(): T {
		return this.state;
	}

	public someBusinessLogic(state: T): void {
		console.log("Subject: I'm doing something important.");
		this.state = state;

		// Notify all observers about the new state.
		this.notify();
	}
}

// ConcreteObserver reacts to the updates issued by the ConcreteSubject it had been attached to.
class ConcreteObserver<T> implements Observer<T> {
	private id: number;

	constructor(id: number) {
		this.id = id;
	}

	update(subject: Subject<T>): void {
		console.log(
			`Observer ${this.id}: Reacted to the event. New state: ${JSON.stringify(
				subject.getState(),
			)}`,
		);
	}
}

// -- EXAMPLE --
const subject = new ConcreteSubject<string>("State0");

const observer1 = new ConcreteObserver<string>(1);
subject.attach(observer1);

const observer2 = new ConcreteObserver<string>(2);
subject.attach(observer2);

subject.someBusinessLogic("State 1");

subject.detach(observer2);

subject.someBusinessLogic("State 2");

Publish/Subscribe (pubsub)

Pubsub generalizes the observer pattern.

There are publishers and subscribers. These are connected by channels, which are implemented in a separate body of code (e.g., a library, process, or distributed infrastructure). Each channel has a name. Subscribers register interest in one or more channels, and publishers write events to them. The communication between them is handled outside the code, potentially asynchronously.

This strategy solves some coupling and performance issues present in other strategies. It uses a mediator, so the components don’t have to know about each other. And it’s all asynchronous, so there’s less of a synchronous performance hit (publishers don’t wait for subscribers).

E.g. Amazon SNS, Google Cloud Pub/Sub, Microsoft Azure Event Hubs.

Generally, you’ll:

  • create a topic (Channel)
  • publish messages to the channel via publishers
  • subscribe to the channel with subscribers, which then receive the published messages
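
A toy in-process version of those steps, just to show the shape. The `Broker` class is hypothetical and has nothing to do with the APIs of the hosted services mentioned above. Publishing only enqueues a message, and delivery is triggered separately (explicitly here, to keep the sketch deterministic), so publishers never wait on subscribers:

```typescript
type Handler = (message: string) => void;

// Hypothetical in-memory broker; real systems put this in a library,
// a separate process, or distributed infrastructure.
class Broker {
	private readonly subscribers = new Map<string, Handler[]>();
	private readonly queue: { channel: string; message: string }[] = [];

	subscribe(channel: string, handler: Handler): void {
		const handlers = this.subscribers.get(channel) ?? [];
		handlers.push(handler);
		this.subscribers.set(channel, handlers);
	}

	// Publishing only enqueues, so publishers never wait for subscribers.
	publish(channel: string, message: string): void {
		this.queue.push({ channel, message });
	}

	// Delivers everything queued so far to the matching subscribers.
	deliverPending(): void {
		let next = this.queue.shift();
		while (next !== undefined) {
			for (const handler of this.subscribers.get(next.channel) ?? []) {
				handler(next.message);
			}
			next = this.queue.shift();
		}
	}
}

const broker = new Broker();
const received: string[] = [];

broker.subscribe("orders", (message) => {
	received.push(`billing: ${message}`);
});
broker.subscribe("orders", (message) => {
	received.push(`shipping: ${message}`);
});

broker.publish("orders", "order-42 created");
broker.publish("payments", "nobody is subscribed to this channel");

console.log(received.length); // prints 0, nothing has been delivered yet
broker.deliverPending();
console.log(received.length); // prints 2, one delivery per "orders" subscriber
```

Note that the publisher and the two subscribers only know the channel name, not each other.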

Transforming Programming

Think of programs as transforming inputs to outputs.

When we think about design, we think about classes, modules, data structures, algorithms, languages, and frameworks. But this misses the point. We need to think about creating transformations.

It's like an industrial assembly line. Raw data in, finished product (information) out.

Programming is about code, but programs are about data.

Designing your programs as a sequence of transformations is also good for ML pipelines.

How do you handle errors when writing programs as a series of transformations? You could use a Result type, for example. Errors can then be handled either inside or outside the pipeline. Here's an example of a Result type in TypeScript:

type Result<T, E> = { kind: "ok"; value: T } | { kind: "err"; error: E };

function divide(dividend: number, divisor: number): Result<number, string> {
	if (divisor === 0) {
		return { kind: "err", error: "Division by zero" };
	}

	return { kind: "ok", value: dividend / divisor };
}
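
Building on the same `Result` shape, here is a sketch of handling errors inside the pipeline. The `andThen` helper and the step functions are hypothetical names of mine: each step runs only if the previous one succeeded, and the first error passes through to the end untouched.

```typescript
type Result<T, E> = { kind: "ok"; value: T } | { kind: "err"; error: E };

// Runs the next step only if the previous one succeeded.
function andThen<T, U, E>(
	result: Result<T, E>,
	step: (value: T) => Result<U, E>,
): Result<U, E> {
	return result.kind === "ok" ? step(result.value) : result;
}

function parseNumber(text: string): Result<number, string> {
	const value = Number(text);
	return text.trim() === "" || Number.isNaN(value)
		? { kind: "err", error: `Not a number: "${text}"` }
		: { kind: "ok", value };
}

function reciprocal(value: number): Result<number, string> {
	return value === 0
		? { kind: "err", error: "Division by zero" }
		: { kind: "ok", value: 1 / value };
}

function summarize(result: Result<number, string>): string {
	return result.kind === "ok" ? `ok: ${result.value}` : `error: ${result.error}`;
}

console.log(summarize(andThen(parseNumber("4"), reciprocal))); // prints "ok: 0.25"
console.log(summarize(andThen(parseNumber("0"), reciprocal))); // prints "error: Division by zero"
console.log(summarize(andThen(parseNumber("abc"), reciprocal))); // prints 'error: Not a number: "abc"'
```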

Inheritance

Avoid using inheritance. It's coupling. "Not only is the child class coupled to the parent, the parent’s parent, and so on, but the code that uses the child is also coupled to all the ancestors."

Instead, you could use:

  • Interfaces and protocols
  • Delegation
  • Mixins and traits

Interfaces & protocols are great. You can use them as types, and any class that implements the appropriate interface will be compatible with that type.

Prefer interfaces to express polymorphism. As long as an object fulfills the contract you can use it with, e.g., a polymorphic function that accepts objects with that interface.

Delegation: use members to delegate to instead of exploding the class with methods. Has-A trumps Is-A.

Mixins extend classes with new functionality without inheritance.
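
A short sketch combining the first two alternatives, with hypothetical names: `Logger` is the interface, and `Account` has a logger it delegates to rather than inheriting from some logging base class.

```typescript
// The contract: anything with a log method can be used as a Logger.
interface Logger {
	log(message: string): void;
}

class MemoryLogger implements Logger {
	readonly lines: string[] = [];

	log(message: string): void {
		this.lines.push(message);
	}
}

// Has-A instead of Is-A: Account holds a Logger and delegates to it.
class Account {
	private balance = 0;
	private readonly logger: Logger;

	constructor(logger: Logger) {
		this.logger = logger;
	}

	deposit(amount: number): void {
		this.balance += amount;
		this.logger.log(`deposited ${amount}, balance is now ${this.balance}`);
	}
}

const auditLog = new MemoryLogger();
const account = new Account(auditLog);
account.deposit(50);
console.log(auditLog.lines[0]); // prints "deposited 50, balance is now 50"

// Any object that fulfills the contract works, no class hierarchy needed.
const consoleAccount = new Account({ log: (message) => console.log(message) });
consoleAccount.deposit(10); // prints "deposited 10, balance is now 10"
```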

Configuration

Use configuration variables (env vars, for example) for values that may change after the application has gone live.

The authors gave a good list of common things you may include in your configuration data:

  • Credentials for external services (database, third party APIs, and so on)
  • Logging levels and destinations
  • Port, IP address, machine, and cluster names the app uses
  • Environment-specific validation parameters
  • Externally set parameters, such as tax rates
  • Site-specific formatting details
  • License keys

Try to not just load env vars into global state. Wrap them in a thin API.
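
What a thin API could look like, as a sketch. The variable names `PORT` and `LOG_LEVEL`, their defaults, and the `ServiceConfig` shape are assumptions of mine. The raw variables are read and validated once, and the rest of the app only sees the typed object:

```typescript
interface ServiceConfig {
	port: number;
	logLevel: "debug" | "info" | "error";
}

// Reads raw variables once and returns a typed, validated object.
// Taking `env` as a parameter keeps the function free of global state.
function loadConfig(env: Record<string, string | undefined>): ServiceConfig {
	const port = Number(env["PORT"] ?? "8080");
	if (!Number.isInteger(port) || port < 1 || port > 65535) {
		throw new Error(`Invalid PORT: ${env["PORT"]}`);
	}

	const logLevel = env["LOG_LEVEL"] ?? "info";
	if (logLevel !== "debug" && logLevel !== "info" && logLevel !== "error") {
		throw new Error(`Invalid LOG_LEVEL: ${logLevel}`);
	}

	return { port, logLevel };
}

// In a Node.js app you would likely pass process.env here.
const config = loadConfig({ PORT: "3000", LOG_LEVEL: "debug" });
console.log(config.port, config.logLevel); // prints 3000 debug
```

Failing at startup on a bad value is a deliberate choice here: better to crash early than to run with a half-valid configuration.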

Configuration as a service is also possible. Instead of keeping configuration data in a flat file or database, store it behind a service API. This way, multiple apps can share configuration information. Any changes you make can be applied globally. The data can be maintained with a specialized UI. And it becomes dynamic (e.g., update your site's config live).

I think it’s fine. Immediate problem is: what if the service goes down? But it’s just a trade-off vs. not getting these pros. Also don’t have to restart main service to switch config. But we’ve become great at that. Almost zero downtime at this point.

Concurrency

Concurrency is when the execution of two or more pieces of code acts as if they run at the same time. Parallelism is when they do run at the same time.

For concurrency, the code needs to be in an environment that can switch execution between different parts of the code when it's running. This is often implemented with fibers, threads, and processes.

For parallelism, you need hardware that can do things at once: multiple cores in a CPU, multiple GPUs on a single computer, or multiple computers connected together.

There is coupling in code: where dependencies between components make the system hard to change. But there’s also temporal coupling: where the system depends on steps being taken in a particular order. The code imposes a necessary sequence.

Breaking Temporal Coupling

When we sit down and design systems, we usually think in terms of linear execution sequences.

But if we allow for concurrency and temporal decoupling, we get flexibility and can reduce temporal dependencies. Our systems can therefore respond faster and become more reliable.

Remember the distinction: concurrency is a software mechanism, and parallelism is a hardware concern. If we have multiple processors, either locally or remotely, then if we can split work out among them we can reduce the overall time things take.

Good saying. Concurrency is a software mechanism. Parallelism is a hardware concern.

Generally, you can implement asynchronous tasks for pieces of work that are relatively independent. If you have a large piece of work, split it into independent chunks, process them in parallel, and combine the results.
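
A rough sketch of the split, process, combine idea using the standard library's `concurrent.futures`. The function names and the sum-of-squares workload are placeholders. Note the assumption: in CPython a thread pool gives concurrency rather than true parallelism for CPU-bound work, so for that case you would likely swap in `ProcessPoolExecutor`, which offers the same `map` interface.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The independent unit of work: it needs nothing but its own chunk."""
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(numbers, chunk_size=1000, workers=4):
    """Split the input, process the chunks concurrently, combine the results."""
    chunks = chunked(numbers, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), chunk_size=3))
```

This only works because the chunks share no state: each worker reads its own slice and returns a value, and the combining happens in one place at the end.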

Shared State is Incorrect State

Avoid data races. Shared state is hard to get right.

If two processes read and write the same memory, that's fine in itself. The problem is when neither can guarantee that its view of that memory is consistent.

For example:
Process 1 checks the count, process 2 checks the count, and both see '1 left'.
Then both processes try to grab the item... and one is left without it, even though both were told there was an item left.

So we need to make this kind of operation atomic.

We use semaphores to 'lock' access to resources. To control access.

You should probably not make those that consume a resource responsible for protecting it. They shouldn’t be the ones claiming and releasing semaphores. So centralize it behind some API.
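
As a sketch of what centralizing could mean, here is a hypothetical `Inventory` class that owns both the count and the lock protecting it. It uses `threading.Lock` (a mutex) rather than a general semaphore, which is enough for this case; the class and function names are invented for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Inventory:
    """Owns the count and the lock, so callers never touch either directly."""

    def __init__(self, count):
        self._count = count
        self._lock = threading.Lock()

    def take_one(self):
        """Atomically check and decrement. Returns True if an item was taken."""
        with self._lock:
            if self._count > 0:
                self._count -= 1
                return True
            return False

    def remaining(self):
        """Return the current count, read under the same lock."""
        with self._lock:
            return self._count


def run_customers(inventory, customers):
    """Let `customers` threads each try to take one item; return the outcomes."""
    with ThreadPoolExecutor(max_workers=customers) as pool:
        return list(pool.map(lambda _: inventory.take_one(), range(customers)))


if __name__ == "__main__":
    outcomes = run_customers(Inventory(1), customers=8)
    print(outcomes.count(True), "customer got the last item")
```

The check ("is there one left?") and the grab happen inside one locked region, so no caller can see '1 left' and then lose the item to someone else.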

Concurrency in a shared state environment is very difficult. Try to avoid it.

Actors and Processes

Actors and processes let you implement concurrency without the burden of synchronizing access to shared memory. The actor pattern is a model for handling concurrency in programming. Here's a simplified explanation:

An actor is an independent virtual processor with its own private state and a mailbox for receiving messages. When a message arrives and the actor is idle, it processes the message, potentially

  • creating new actors,
  • sending messages to known actors, or
  • updating its state for future messages.

Once a message is processed, the actor either processes the next message in its mailbox or goes back to sleep if the mailbox is empty. A process, in this context, refers to a general-purpose virtual processor, often managed by the operating system, that can behave like an actor to facilitate concurrency.

Some key characteristics of actors:

  • Concurrency: There is no central control or scheduler.
  • State Isolation: State is confined to messages and the local state of each actor. Messages are private to the recipient.
  • Asynchronous Messaging: Messages are one-way; to get a response, include your mailbox address in the message.
  • Message Processing: An actor completes processing one message before moving to the next and only processes one message at a time.

Here is a Python implementation:

import queue
import threading


class ActorShutdown(Exception):
    """Custom exception for shutting down the actor cleanly."""

    pass


class Actor:
    def __init__(self):
        self._mailbox = queue.Queue()
        self._shutdown_flag = False

    def send(self, message):
        """Send a message to the actor's mailbox."""
        self._mailbox.put(message)

    def shutdown(self):
        """Signal the actor to shut down."""
        self._shutdown_flag = True
        # Put a dummy message in the queue to unblock it if waiting
        self._mailbox.put(None)

    def start(self):
        """Start the actor's run loop in a separate thread."""
        self._thread = threading.Thread(target=self._bootstrap, daemon=True)
        self._thread.start()

    def _bootstrap(self):
        try:
            while not self._shutdown_flag or not self._mailbox.empty():
                try:
                    # Set a timeout for the queue to ensure we can periodically check the shutdown flag
                    msg = self._mailbox.get(timeout=1)
                    if msg is not None:
                        self.run(msg)
                except queue.Empty:
                    continue
        finally:
            self._cleanup()

    def run(self, message):
        """
        The actor's behavior with each message.
        Override this method in subclasses.
        """
        print(f"Received message: {message}")

    def _cleanup(self):
        """Cleanup resources before shutting down the actor."""
        print("Cleaning up actor resources.")

    def join(self):
        """Wait for the actor's thread to finish."""
        self._thread.join()


# Example Usage
class PrintActor(Actor):
    def run(self, message):
        print(f"PrintActor received: {message}")


if __name__ == "__main__":
    actor = PrintActor()
    actor.start()
    actor.send("Hello")
    actor.send("World")
    actor.shutdown()
    actor.join()

Listen to your lizard brain

Learn to listen to your lizard brain (like, the one that instinctively knows things before “you” do: feeling afraid of something before the threat appears, etc.).

  1. Stop what you’re doing for a while to let thoughts bubble up to your consciousness
  2. If that doesn’t do it, try drawing/writing about what you’re working on - externalize it
  3. If still nothing, make your brain think what you’re doing doesn’t “matter” - it’s play. So do a prototype or something. Write a quick sentence on what you’ll do and just write out something that does that - the quickest thing that works. Tell yourself it’s just to see what works and that failing is fine. Just need to learn. You’ll throw away the code prototype after.

Listen to your gut when coding and designing.

If something feels off, I explore why. I think it’s good to know why you’re feeling uneasy about certain things. E.g. you procrastinate on something. For me, that’s usually from some subconscious “what if I don’t do as well as I think I should?” So it’s important to realize that, so you can debunk it and just get going again.

Refactoring

Martin Fowler defines refactoring in his book “Refactoring” as a:

disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.

It isn’t something you just do once. It isn’t like spring-cleaning, either.

It takes discipline. External behavior doesn’t change. (Automated testing can help check that!)

When should you refactor? If you find a better way, use it! (Probably be careful not to prematurely optimize, waste time on unimportant cleaning, etc., but this still holds - I don’t want to specify every exception!)

Time spent cleaning now means much less time doing so in the future. Code mess and tech debt compound.

Good advice for refactoring by Fowler:

  1. Don’t try to refactor and add functionality at the same time.
  2. Make sure you have good tests before you begin refactoring. Run the tests as often as possible. That way you will know quickly if your changes have broken anything.
  3. Take short, deliberate steps: move a field from one class to another, split a method, rename a variable. Refactoring often involves making many localized changes that result in a larger-scale change. If you keep your steps small, and test after each step, you will avoid prolonged debugging.
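
To make the third point concrete, here is a sketch of one small step (extracting a function) with a check that the external behavior is unchanged. The invoice example and all names are invented; amounts are integer cents so the comparison is exact.

```python
def invoice_total_before(prices_cents, tax_percent):
    """Original version: the tax calculation is buried inside the loop."""
    total = 0
    for price in prices_cents:
        total += price + price * tax_percent // 100
    return total


def price_with_tax(price_cents, tax_percent):
    """Extracted in one small step; the arithmetic itself is unchanged."""
    return price_cents + price_cents * tax_percent // 100


def invoice_total_after(prices_cents, tax_percent):
    """Refactored version: same external behavior, clearer structure."""
    return sum(price_with_tax(p, tax_percent) for p in prices_cents)


def check_same_behavior():
    """Run after every step: both versions must agree on every case."""
    cases = [([], 25), ([1000], 25), ([1999, 500, 1], 20)]
    for prices, tax in cases:
        assert invoice_total_before(prices, tax) == invoice_total_after(prices, tax)


if __name__ == "__main__":
    check_same_behavior()
    print("behavior unchanged")
```

No new functionality was added in this step; if the check fails, the only suspect is the extraction itself.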

Generally, the tests cover a specific need: getting quick and definitive feedback that you haven’t broken or changed anything. I think that, if you can find another mechanism to achieve that, that’s ok too.

Move in small steps, get feedback, and you’ll move fast.

Test to code

Testing isn't about finding bugs. The value comes from reasoning about your code, verifying it, and thinking about edge cases, and it helps you write more modular, “clean” code.

The authors believe that the major benefits of testing happen when you think about and write the tests, not when you run them.

Because you have to use the code you’re testing in your tests, you have to, e.g., pass in dependencies. So it has to be modular, flexible, & easy to change.
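
A small illustration of that pressure, using a made-up `greeting` function. To test it without waiting for the right time of day, the clock has to be something the caller can pass in:

```python
from datetime import datetime


def greeting(now=datetime.now):
    """Return a greeting based on the hour reported by `now()`."""
    hour = now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


def test_greeting():
    """The test supplies its own clock, so the result is deterministic."""
    assert greeting(now=lambda: datetime(2024, 1, 1, 9, 0)) == "Good morning"
    assert greeting(now=lambda: datetime(2024, 1, 1, 15, 0)) == "Good afternoon"


if __name__ == "__main__":
    test_greeting()
    print(greeting())
```

Production code calls `greeting()` and gets the real clock by default; the dependency only became a parameter because a test needed to control it.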

The only way to build software is incrementally. You need to iterate a lot when building. Build → learn, repeat.

It is easy to become seduced by the green "tests passed" message, writing lots of code that doesn’t actually get you closer to a solution.

Optimizing code structure, getting the abstractions just right, and such things. They’re important to some extent, sure. But you have to make sure you’re solving the problem first! That’s the most important thing.

People become slaves to ideological beliefs in software (and other areas) and can’t think beyond them. They practice TDD, Agile, whatever, to extreme extents. But it’s the midwit way. You have to think about what you’re doing and why, and whether it’s actually benefitting you. You should learn a lot, and then pick and choose the most useful constellation of tools instead of buying a toolbox wholesale and never questioning it.

Tests can definitely help drive development. But, as with every drive, unless you have a destination in mind, you can end up going in circles.

It's nice to see the authors be critical of TDD. It is often praised with no regard for downsides. The authors here seem to give a nuanced perspective. Maybe leaning towards negative/critical.

Write test cases to test that software upholds its contract.

Property-based testing

In property-based testing, rather than checking for specific outputs given a certain input, the tests verify that the defined properties (comprising both contracts and invariants) hold true across various inputs. This approach can lead to more robust and comprehensive testing by ensuring that the code adheres to the agreed-upon contracts and maintains the invariants under all circumstances.

Here, invariants refer to conditions or properties that remain consistent or unchanged through the execution of a piece of code. E.g., when sorting a list, the length remains the same before and after the operation, so the length is an invariant of the sorting process. Code operates under certain contracts, essentially making agreements regarding the behavior expected when given specific inputs and what outputs will be produced.

If a property-test fails, fix the code that failed and create a unit test for that specific input configuration.

Property-based tests improve design by highlighting invariants, contracts.
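
A hand-rolled sketch of the idea for the sorting example, using only the standard library's `random` module. This is an assumption made for the sake of a self-contained example: a dedicated property-testing library would normally handle input generation and shrink failing cases for you. The function name is invented.

```python
import random
from collections import Counter


def check_sort_properties(sort_fn, runs=200, seed=0):
    """Check sorting properties on `runs` randomly generated integer lists."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(runs):
        data = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        result = sort_fn(list(data))
        # Invariant: sorting does not change the length.
        assert len(result) == len(data), data
        # Invariant: sorting does not add or drop elements.
        assert Counter(result) == Counter(data), data
        # Contract: every element is <= the one that follows it.
        assert all(a <= b for a, b in zip(result, result[1:])), data
    return runs


if __name__ == "__main__":
    check_sort_properties(sorted)
    print("all properties held")
```

Each assertion carries the failing input as its message, which is the input you would then pin down in a regular unit test.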

Staying safe

Watch out for malicious actors when coding.

The authors provide some basic security principles:

  1. Minimize Attack Surface Area – the sum of all access points where an attacker can enter data, extract data, or invoke execution of a service.
  2. Principle of Least Privilege
  3. Secure Defaults
  4. Encrypt Sensitive Data
  5. Maintain Security Updates

Principle of Least Privilege: use the least amount of privilege for the shortest time you can get away with. Don't just grab the highest permission level (e.g., Root or Administrator). If you need that level, take it & do the minimum amount of work, then relinquish your permissions quickly to reduce risk.

Before projects

No one knows exactly what they want.

That's where programmers come in. Help people understand what they want.

People ask for the solutions they think solve the problem. We need to figure out what the actual problem is and how to solve that.
Kind of like the XY problem: “The XY problem is asking about your attempted solution rather than your actual problem. This leads to enormous amounts of wasted time and energy, both on the part of people asking for help, and on the part of those providing help.”

Question requirements. What exactly do they mean? What assumptions are built in? Which tacit knowledge is assumed? What’s necessary, and what isn’t? Follow the algorithm.

Solving impossible puzzles

Your conscious brain is aware of the problem, but your conscious brain is really pretty dumb (no offense). So it’s time to give your real brain, that amazing associative neural net that lurks below your consciousness, some space. You’ll be amazed how often the answer will just pop into your head when you deliberately distract yourself.

If you’re struggling on a hard problem, let diffuse mode work.

If you aren't willing to drop the problem for a while, the next best thing is to explain it to someone. This often leads to enlightenment. Have them ask:

  • Why are you solving this problem?
  • What’s the benefit of solving it?
  • Are the problems you’re having related to edge cases? Can you eliminate them?
  • Is there a simpler, related problem you can solve?

I usually write about the particularly difficult problems I'm trying to solve.

Agile

Agile is not a noun; agile is how you do things.

Agile isn’t a process. It isn’t some “agile-in-a-box” turnkey solution. It’s something you are.

Here’s how you can embody the spirit of agile in your actions:

  1. Work out where you are.
  2. Make the smallest meaningful step towards where you want to be.
  3. Evaluate where you end up, and fix anything you broke.

Repeat until done.

A team that doesn’t continuously experiment with their process is not an agile team.

Pragmatic Projects

Under 10-12 members seems to be the optimal team size.

Making something good is a continuous process that depends on effort by the whole team. Not by lone rangers alone. Not by dead weight either. Everyone contributes. Although I do think a Lone Ranger can carry a really heavy burden, experience tells me they often break their back in the process.

Do what works

Do what works, not what's fashionable.

Don’t just do what X big tech companies do (Google, Amazon, etc.).
They’re solving problems they have, something that comes from their specific situation. They have good ideas, no doubt, and you can adopt some. But think before you do. Take what works for you.

How do you know "what works"? You try it.

You want to take the best pieces from any particular methodology and adapt them for use.
And once you realize that other people are just as stupid and ignorant as you, you start realizing that you need to innovate on methods yourself.
The current way isn’t necessarily the best way.
