
Notes on

The Software Engineer's Guidebook

by Gergely Orosz

• 17 min read


Get stuff done

Be the person who ships. Deliver what you commit to, over-deliver when you can, and make sure people actually know about it.
If you do impactful work in silence, it might as well not exist.

That means telling your manager and team when you finish things.
If the work was complex or the impact was measurable, share it. Most people won’t notice the challenges you overcame unless you surface them.

Start by understanding your team’s priorities and the business’s. It sounds obvious, but a lot of engineers never actually figure out what the business cares about.
They work on whatever lands in their lap without asking whether it’s the most impactful thing they could be doing. So find out.
Ask your manager, ask your PM, look at the company’s goals. Then align your work to that. Then you’re not just “getting stuff done”—you’re getting the right stuff done.
Direction is often more important than speed.

Keep a work log

Record key work each week: code changes, code reviews, design docs, discussions, helping others, postmortems.
This is something I’ve done for a while and it is actually useful. It makes performance reviews trivial, gives you ammunition for negotiations, and helps you prioritize when too many things compete for your time.

When someone drops a new request on you, a visible list makes it easy to say no or negotiate what gets cut.
Beyond the professional benefits, it works as a rubber duck—writing down what you did makes you reflect on it. It functions as a second brain if you use it actively.

I read this book before LLMs became as useful as they are now, and documenting basically everything you work on (how, when, why, what problems you encounter and why, etc.) has become a few orders of magnitude more useful since.
I often find myself directing my agent to a specific markdown note in my Obsidian vault for whatever I'm working on, and keeping track at a higher level in my daily note.

First, you get traceability. You can literally always point to exactly what you were doing, when you were doing it, why, and what you found out. This becomes invaluable over time.

Second, it’s useful for the agent as well. There are often a lot of domain-specific nuances that are hard to capture; they’re just not in the training data. And the log keeps track of large tasks that span longer than current agent capabilities can reach.
E.g. you’re debugging a tough problem in production, so you keep a log. It’s useful for you, so you don’t keep it all in your head, and for your agent, so it always has the full context and understanding.

There are many more reasons. You’ll just have to discover them for yourself.

Feedback

Ask for feedback, be grateful when you get it, and give it back.
Don’t ask “do you have feedback for me?” Ask about something specific you did or worked on. General requests get generic responses. Specific ones get useful ones. And it’s especially useful when you’re doing something for the first time or figuring things out in a new environment.

When someone gives you constructive feedback, remember that saying nothing would’ve been easier for them. Most people don’t give feedback unless it’s sought. So when someone actually takes the time, that’s a gift, even when it stings. Don’t react defensively—sit with it.

Giving positive feedback isn’t just “good job.” Point out what specifically was good. Not “great PR,” but “I liked how you handled all four edge cases and wrote tests for each of them.” Be honest—insincere praise erodes trust over time. If someone did an okay job, don’t tell them it was excellent. Find the part that was excellent and call that out. Give constructive feedback on the rest (How to Win Friends and Influence People).

Constructive feedback is harder. The book has good advice on how to do it without it blowing up in your face.

Ask questions before jumping in, especially if you’re unfamiliar with the person’s work or they didn’t ask for feedback. You don’t always know the full picture, and asking first makes the exchange smoother.
Describe the situation and its impact, then ask the person what they think and how they’d prevent it next time.
Don’t say “you should do this.” Unless you’re their manager, avoid sounding like you’re giving instructions. Help them arrive at the solution themselves—you can suggest what you would have done differently.

Do it in person or over video, not over text. It’s way too easy for someone to misread your intent in a message.
Make it clear you’re on their side—that you’re speaking up because you want them to get better, and that staying silent would’ve been easier.
If it’s a peer, you can even state the power dynamics explicitly: “I’m not your manager, just a peer. This is just my observation and you’re free to ignore it.”

End the discussion positively. Giving feedback that leaves the relationship worse than before defeats the purpose.

Don’t outsource all your growth to others either. Reflect on your own work. Feedback helps with that, but it shouldn’t be your only input.

Produce, organize, publish

Shreyas Doshi breaks engineering work into three buckets: produce, organize, and publish.

Produce is the day-to-day output. Code, design docs, postmortems, code reviews. This is where most engineers spend nearly all their time, especially early in their career.

Organize is the coordination layer. Setting up meetings, kicking off initiatives, introducing better workflows, running bug bashes. It’s the work that helps other people get things done, not just you.

Publish is making sure the work is seen. Presenting at team or company-wide meetings, running knowledge-sharing sessions, writing things up for internal or external audiences, bringing it up in 1:1s.

The shift as you get more senior is that produce shrinks as a share of your time while organize and publish grow.
A junior engineer might be 90% produce. A staff engineer might be 40% produce, 30% organize, 30% publish.
If you’re only producing and never organizing or publishing, you’re capping your impact at what you can personally build.

Eyes on the ball

Some developers just work on whatever the task defines, even when they realize halfway through that new work is needed or that some tasks no longer make sense.
Don’t be this person.
The goal is to build software that solves customer problems, not to close tasks.

Tasks are tools for organizing work, not the work itself.
If you come across new work, run into blockers, or discover new constraints, your original plan probably needs revising.
Be ruthless about adding, removing, and overhauling tasks as you learn more.
The plan should serve the outcome, not the other way around.

When it’s done, it’s properly done

There’s no shortage of engineers who can seemingly complete work fast, but then it turns out the code has bugs, uncovered edge cases, or UX that was cobbled together in a hurry. Speed means nothing if the work has holes.

The engineers seen as productive aren’t the fastest. They’re fast enough, and their work actually holds up. The end result works as it should.

The book provides some useful advice to help you build solutions that work:

Work with stakeholders upfront to draft specs that cover both functionality and edge cases. This prevents misunderstandings and saves time later.

Outline your testing strategy before you start building, covering both manual and automated tests, and get QA and stakeholders to give feedback on the plan early.

Include testing and monitoring in your time estimates so they don’t get squeezed out at the end.

Collaborate with QA from the start rather than building something, doing zero testing, and then throwing it over the wall.

Ship every day

How long does it usually take from idea to working prototype? From picking up a bug to having a fix in production? If the answer is weeks, you’re likely not seen as productive.
In the best case, you’re perceived as “slow and steady.” In the worst case, delete “…and steady.”

Ship something every day. Break work into small chunks. Smaller pieces ship faster, they’re faster for others to review, and they keep your feedback loop tight.
When you ship daily, you catch problems early instead of discovering them two weeks into a monolithic branch.

There are exceptions where longer stretches make sense. But as a default, daily shipping forces the kind of iteration that produces good software.

Be product-minded

The most productive engineers aren’t the fastest coders or the ones who best understand computer systems. They’re engineers who are good enough at engineering, but excellent at understanding the product, customers, and the business. This helps them find smart tradeoffs, build less complex solutions faster, and offer solutions that solve the customer’s actual problem rather than the problem as defined in a task.

Productivity is rarely measured in milliseconds optimized or lines of code written. It’s measured in value created for customers. Sometimes those overlap, but that’s rare.

The book frames this as “product-minded engineering.”
Understand how and why your company is successful. Build a relationship with your product manager and seek frequent feedback from them. Participate in user research and customer support—actually talk to the people using your software.
When you work on projects, bring product suggestions backed by evidence and offer engineering/product tradeoffs. Not just the engineering-optimal solution, but the one that makes the most sense for the customer given the constraints.

As software engineers, we’re not paid to write code. We’re paid to solve problems for the business, frequently by writing code.

Dealing with roadblocks

When you hit a roadblock, don’t go quiet and try to solve it alone. That can take way too long (if you even solve it), and in the meantime your team has no visibility into what’s happening. You’ll be seen as unreliable.

Share the roadblock with your team, manager, and project lead. But don’t just dump the problem on them. Offer tradeoffs on what you could do instead of simply delaying. Cut scope so you don’t need to solve the blocker right now. Put a short-term hack in place and plan the proper fix later—basically advocating for taking on tech debt to move faster now. See if a platform team or third-party vendor can make a change that unblocks you.

Get creative with alternatives. By forcing yourself to consider them, you might find a smarter path.
The worst thing you can do is sit silently on a problem, hoping you’ll crack it before anyone notices you’re stuck.

Know your observability stack

When you join a company, make it a priority to learn where the production logs are stored and where to find systems’ health dashboards.
They might live in Datadog, Sentry, Splunk, New Relic, or Sumo Logic. Or in-house systems built on Prometheus, ClickHouse, Grafana, or something custom. Or a mix of everything.

Figure out where they are, get access, and learn how to query them. Do this for systems your team owns, and also for related systems you interact with. When something breaks at 2am, you don’t want to be figuring out how to log in for the first time.

Percentiles

Averages lie. When monitoring load times or response times, looking only at the average can mask worst-case scenarios that affect a lot of customers. A service can have a great average latency while 5% of users wait ten times longer.

The three percentiles to care about:

  • p50 (median): represents the typical experience. 50% of data points are below this, 50% above.
  • p95: the worst-performing 5%. This is where power users often land, and it’s where you spot problems that don’t show up in the median.
  • p99: the worst 1%. Sometimes acceptable as an outlier, sometimes a sign that something is broken for a specific subset of users.
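A minimal sketch of how the three diverge, using made-up latency numbers where most requests are fast and a small tail is slow. The distribution and thresholds are invented for illustration:

```python
import random
from statistics import mean, quantiles

# Hypothetical latencies in ms: 95% of requests are fast, 5% hit a slow path.
latencies = [random.gauss(100, 10) for _ in range(950)] + \
            [random.gauss(1000, 100) for _ in range(50)]

cuts = quantiles(latencies, n=100)   # 99 cut points: cuts[49]=p50, cuts[94]=p95, cuts[98]=p99
print(f"avg: {mean(latencies):6.0f} ms")   # ~145 ms: looks healthy
print(f"p50: {cuts[49]:6.0f} ms")          # ~100 ms: the typical experience
print(f"p95: {cuts[94]:6.0f} ms")          # the slow tail the average hides
print(f"p99: {cuts[98]:6.0f} ms")          # the worst 1%
```

The average sits comfortably between the two groups and never reveals that one in twenty users waits roughly ten times longer than the median.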

KPIs, OKRs, and questioning measurements

KPIs (Key Performance Indicators) are quantifiable measures of progress.

Common ones include

  • GMV (total spending flowing through your platform),
  • revenue (what the company keeps—at Uber, 10-20% of GMV; at Stripe, 1-3%),
  • DAU/MAU,
  • churn percentage,
  • uptime,
  • support ticket volume, and
  • NPS.

A good KPI is measurable, unambiguous, and hard to game.

OKRs (Objectives and Key Results) were introduced at Google in 1999, suggested by investor John Doerr (who later wrote “Measure What Matters”).
An OKR has one qualitative objective and multiple measurable key results.
For example: objective “improve the reliability of our service,” with key results like increasing uptime from 99.8% to 99.9%, reducing p95 API latency by 20%, and reducing unhandled exceptions by 30%.

If your company uses OKRs, understand which objectives leadership cares about, starting from the top. Figure out how your team contributes.
Help translate the corporate jargon into something engineers can act on. But don’t obsess over them—they’re a focusing tool, not the mission itself.
The risk is becoming so fixated on hitting a number that you stop building the right thing for customers (Goodhart’s Law).

The more interesting section is about questioning measurements. The difference between standout engineering teams and average ones is that on standout teams, engineers question the KPIs and OKRs that product folks bring to the table.
For staff-and-above engineers, this should be a given.

Three angles to interrogate any metric: first, are we measuring the right thing?
If a KPI tracks endpoint latency, will reducing latency actually make a noticeable difference for customers, or would reducing the error rate matter more?
Second, how can this measurement be gamed?
And third, what countermeasures should we track alongside it?

The Uber examples on gaming are worth telling.
When the org mandated 99.9% reliability for all endpoints, several teams with reliability well below that target didn’t improve their code. They changed how they measured reliability to hit the number. No engineering changes at all.
Another time the goal was to reduce 500 response codes, and a team simply changed their 500s to 200s, moving the error message into the response body.

This is what happens when you incentivize numbers without thinking through the countermeasures.
If the goal is to increase median CPU utilization from 15% to 25% (as a proxy for better resource usage), you’d also want performance benchmarks to make sure code isn’t just becoming less efficient, and latency tracking at p50/p95 to make sure users aren’t degraded. Every metric needs a sanity check alongside it.

And don’t lose the bigger picture. Measuring system characteristics is much easier than measuring customer satisfaction, frustration, or why people convert or churn. Those things matter more but are harder to put a number on.

Monitoring and alerting

System-level monitoring covers the fundamentals:

  • uptime,
  • CPU/memory/disk,
  • response times at p50/p95/p99, and
  • error rates (exceptions, 4XX/5XX responses, other error states).

For web apps, add page load time and Core Web Vitals (LCP, FID, CLS—Google’s quality signals from 2020).
For mobile, track startup time (the longer it takes, the higher the churn), crash rate, and app bundle size (larger = fewer installs).

But all of these are infrastructure metrics. They catch fundamental problems, but they can all look green while the product is broken.

To get the full picture you need business metrics specific to your product. At Uber, the core business metrics for Rides were lifecycle events: how many people request a ride, how long requests stay pending, how many get accepted or rejected.
A plummeting acceptance rate could indicate an outage that no system dashboard would catch.

Common business metrics across most products:

  • customer onboarding funnels (how many enter, where they get stuck, how long signup takes),
  • success/error rates for core business actions (like adding a payment method on Uber’s Payments team),
  • DAU/WAU/MAU,
  • revenue on daily/weekly/hourly basis,
  • usage depth (how long users interact, how many actions—p50/p75/p90 to separate median, frequent, and power users),
  • support ticket volume broken out by category (spikes may indicate bugs), and
  • retention and churn rates.

Monitoring without alerting is just a dashboard nobody checks. Alerts need to fire when metrics look wrong, and an oncall engineer needs to receive, investigate, and mitigate.

To figure out which alerts to set, ask three questions:

  • What does “healthy” look like (and alert when it doesn’t)?
  • What outages have happened before (and what metrics would have caught them)?
  • What do customers notice when things break (and how do you detect that, perhaps by looking at p95 to catch outlier latency)?

Verbalizing what healthy and unhealthy states look like often makes the right alerts obvious.

In practice, the best approach is a mix: anomaly detection for most metrics (catches unexpected deviations automatically), plus static thresholds for expected patterns like traffic increases/drops and to catch key metrics dropping to zero.
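A rough sketch of both approaches in plain Python. The thresholds, traffic numbers, and the three-sigma rule are illustrative assumptions; in practice this logic lives in your monitoring tool rather than application code:

```python
from statistics import mean, stdev

def static_alert(value: float, low: float | None = None, high: float | None = None) -> bool:
    """Static threshold: alert when a metric leaves its expected range (e.g. traffic at zero)."""
    return (low is not None and value < low) or (high is not None and value > high)

def anomaly_alert(history: list[float], value: float, sigmas: float = 3.0) -> bool:
    """Naive anomaly detection: alert when a value deviates sharply from recent history."""
    if len(history) < 10:
        return False                      # not enough data to know what "normal" looks like
    mu, sd = mean(history), stdev(history)
    return sd > 0 and abs(value - mu) > sigmas * sd

# Hypothetical usage: requests per minute for one endpoint.
recent = [980, 1010, 995, 1005, 990, 1000, 1015, 985, 1002, 998]
print(static_alert(0, low=1))            # True: traffic dropped to zero
print(anomaly_alert(recent, 1400))       # True: sharp, unexpected spike
```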

Logging

The book includes a logging guide from 2008 by Anton Chuvakin (then chief logging evangelist at LogLogic) that still holds up almost two decades later.

Good logs:

  • Tell you exactly what happened: when, where, how
  • Are suitable for manual, semi-automated, and automated analysis
  • Can be read without the application that produced them being available
  • Don’t slow the system down
  • Can be proven reliable if used as evidence

That last one is surprisingly specific. Most people think of logs as a debugging tool, but if you ever need them in an incident review, audit, or legal context, reliability matters.

Events worth logging:

  • Authentication and authorization decisions (including logoff)
  • System and data access
  • System and application changes (especially privilege changes)
  • Data mutations (add/edit/delete)
  • Invalid input (could indicate threats)
  • Resource usage (RAM, disk, CPU, bandwidth, any hard or soft limits)
  • Health signals (startups, shutdowns, faults, errors, delays, backup success/failure)

Every logged event should carry:

  • Timestamp with timezone
  • System, application, or component that produced it
  • Relevant IPs and DNS lookups
  • User identity
  • Action taken
  • Result (success/failure)
  • Priority/severity
  • Reason
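As a rough illustration, a structured JSON event carrying those fields might look like the sketch below. The field names and the payments scenario are made up, not from the book or the guide:

```python
import json
import logging
import socket
from datetime import datetime, timezone

log = logging.getLogger("payments")   # hypothetical component name
logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_event(action: str, user_id: str, result: str, severity: str, reason: str,
              client_ip: str) -> None:
    """Emit one structured event carrying the fields listed above."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # timestamp with timezone
        "system": socket.gethostname(),                       # where it was produced
        "component": log.name,
        "client_ip": client_ip,
        "user": user_id,
        "action": action,
        "result": result,          # success/failure
        "severity": severity,
        "reason": reason,
    }
    log.info(json.dumps(event))

# Hypothetical usage: a failed action on a payment flow.
log_event(action="add_payment_method", user_id="u_4211", result="failure",
          severity="warning", reason="card_declined", client_ip="203.0.113.7")
```

Structured events like this are what make logs “suitable for manual, semi-automated, and automated analysis”: the same line works for a human grepping during an incident and for a query in whatever log store the company uses.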

Canarying

The term comes from “canary in the coal mine.”
Miners brought caged canary birds underground because the birds had lower tolerance for toxic gas. If the bird stopped chirping or fainted, the miners evacuated.

In software, canary testing means rolling out changes to a small percentage of users first and watching the health signals for signs that something’s wrong before rolling out wider.
It’s typically implemented by routing traffic to the new version via a load balancer, or by deploying to a single node first.
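A simplified sketch of the idea: hash users into buckets so a small, stable slice hits the new version, and only widen the rollout if the canary’s health signals hold up against the stable fleet. The percentage, the error-rate check, and the tolerance are illustrative assumptions:

```python
import hashlib

CANARY_PERCENT = 5   # hypothetical: start by sending 5% of traffic to the new version

def route(user_id: str) -> str:
    """Deterministically send a small, stable slice of users to the canary."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < CANARY_PERCENT else "stable"

def canary_healthy(canary_error_rate: float, stable_error_rate: float,
                   tolerance: float = 0.005) -> bool:
    """Widen the rollout only if the canary isn't noticeably worse than stable."""
    return canary_error_rate <= stable_error_rate + tolerance

print(route("user-1234"))             # "canary" or "stable", stable per user
print(canary_healthy(0.012, 0.010))   # True: within tolerance, safe to widen
```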

Advanced deployment capabilities

Some deployment capabilities are still uncommon because they’re hard to build, but worth investing in if you can:

  • Monitoring and alerting setups where code changes can easily be paired with health metric tracking
  • Automated staged rollouts with automated rollbacks
  • Dynamic testing environment generation
  • Robust integration, end-to-end, and load testing
  • Testing in production through multi-tenancy approaches

Tech debt as a tradeoff

Pragmatic engineers don’t see tech debt as inherently bad. They see it as a tradeoff between speed and quality, a characteristic of the system rather than a disease.
They put tech debt in the context of a project’s goals and don’t try to pay off more than needed. The key is tracking it and stepping in to reduce it before it compounds out of control—and being creative about how you do it.

Things are rarely just plain bad. There’s usually a reason the tradeoff was made.

Programming language breadth

The book recommends learning at least one language from each of the three main paradigms:

Imperative languages give the computer step-by-step instructions. “If X, do this. Else, do that.”
This is the most common type—C, C++, Go, Java, JavaScript, Python, Rust, TypeScript, and most OO languages fall here.

Declarative languages specify the expected outcome without dictating how to achieve it.
SQL and HTML are the classic examples. You say what you want, not how to get there.

Functional languages are a subset of declarative languages that treat functions as first-class citizens—you can pass them as arguments, return them as values.
They tend toward immutable state and pure functions with no side effects. Haskell, Lisp, Erlang, Elixir, F#.

Each paradigm shapes how you think about problems.

  • Imperative trains you in control flow and state management.
  • Declarative teaches you to describe outcomes.
  • Functional rewires how you handle state, composition, and side effects.

Having depth in all three gives you a broader toolkit for designing solutions.
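To make the contrast concrete, here is the same toy task written in an imperative and a functional style. Both are Python; the declarative equivalent would be a SQL query, shown as a comment. The example is mine, not from the book:

```python
from functools import reduce

orders = [("alice", 120), ("bob", 80), ("carol", 200), ("alice", 60)]

# Imperative: spell out the control flow and mutate state step by step.
totals = {}
for customer, amount in orders:
    if amount >= 100:
        totals[customer] = totals.get(customer, 0) + amount

# Functional: describe the result as a composition of pure transformations.
big_orders = filter(lambda o: o[1] >= 100, orders)
totals_fn = reduce(
    lambda acc, o: {**acc, o[0]: acc.get(o[0], 0) + o[1]},
    big_orders,
    {},
)

assert totals == totals_fn == {"alice": 120, "carol": 200}

# Declarative (SQL): state the outcome, not the steps.
# SELECT customer, SUM(amount) FROM orders WHERE amount >= 100 GROUP BY customer;
```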

Hoard knowledge

I think a big part of succeeding in tech is just constantly hoarding new things that you know how to do, then looking out for opportunities to combine and apply them.
— Simon Willison

The book’s advice is to dip your toes into areas that aren’t closely related to your main expertise.
If you’re a frontend engineer, dig into backend or embedded. Play around with LLMs or machine learning.
“Hoard” the knowledge. It might not be useful right now, but could come in handy later.

This tracks with my experience. I try to soak in everything I can, even when it’s not immediately relevant to what I’m working on. That breadth has been a large part of my success so far. Obviously don’t spend all your time on theory without building anything, but the habit of absorbing widely and connecting later is underrated.
