What makes a post-mortem blameless?

A blameless post-mortem focuses on the systems and conditions that allowed an incident, not on punishing individuals. The template states this up front so engineers share what really happened without fear, which surfaces the true root cause.

How is the incident duration calculated?

The tool subtracts the detected timestamp from the resolved timestamp and shows the result in hours and minutes. This gives you a consistent time-to-resolution figure, measured from detection, for the report and for tracking trends.
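The subtraction can be sketched in a few lines of Python. This is a minimal illustration, not the builder's own code: the function name `time_to_resolution`, the ISO 8601 input format and the `Xh Ym` output format are all assumptions made for the example.

```python
from datetime import datetime


def time_to_resolution(detected: str, resolved: str) -> str:
    """Return resolved minus detected, formatted as hours and minutes.

    Both arguments are ISO 8601 strings (an assumed input format).
    Use the same timezone convention for both, since mixing a timestamp
    with an offset and one without raises TypeError on subtraction.
    """
    start = datetime.fromisoformat(detected)
    end = datetime.fromisoformat(resolved)
    if end < start:
        raise ValueError("resolved timestamp is earlier than detected timestamp")
    # Whole minutes only; leftover seconds are dropped.
    total_minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"


print(time_to_resolution("2026-06-12T14:05", "2026-06-12T17:47"))  # 3h 42m
```

Rejecting a resolved time that precedes the detected time catches the most common data-entry slip before it produces a negative duration in the report.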

What is the difference between root cause and contributing factors?

The root cause is the primary trigger of the incident. Contributing factors are the latent conditions — a missing alert, a gap in tests, a manual step — that made the incident possible or worse. Both belong in a thorough review.

Why must action items have owners and due dates?

Action items without an owner and a date rarely get done, which leaves the door open for the same incident to recur. The template prompts for an owner and a due date on each item so follow-up is trackable and accountable.

What do the severity levels mean?

SEV1 is a critical, business-wide outage; SEV2 is a major degradation; SEV3 is minor; SEV4 is low impact. Severity determines the escalation path, the communication required and how deep the review needs to be.

What is the Engineering Post-Mortem Builder?

The Engineering Post-Mortem Builder is a free tool on Gera Tools that builds a blameless engineering post-mortem with an incident summary, impact, severity, an automatically calculated duration, an ordered timeline, root cause, contributing factors and action items with owners, ready to copy for review. It runs in your browser, with nothing uploaded.

Engineering Post-Mortem Builder

Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Learn from the outage, not just survive it

A post-mortem is only useful if it’s honest and actionable. The two things that break post-mortems are blame — which makes engineers hide what happened — and action items that nobody owns. This builder bakes in a blameless framing and forces every follow-up to have an owner and a date.

How it works

You enter the incident title, severity (SEV1–SEV4), and the detected and resolved timestamps. The tool calculates time-to-resolution automatically and shows it in hours and minutes. You then capture the impact, an ordered timeline of events, the root cause, contributing factors, and action items with owners and due dates. The builder assembles a complete post-mortem with a blameless-framing note, a summary section, the calculated duration, the timeline, separate root-cause and contributing-factors sections, a went-well/went-poorly reflection, the numbered action items, and a lessons-learned section.
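The assembly step described above can be pictured as a small data model plus a render function. The sketch below is an illustration under stated assumptions, not the builder's actual implementation: the class names, field names, section headings and the sample incident are all hypothetical, and the went-well/went-poorly and lessons-learned sections are left out to keep it short.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ActionItem:
    description: str
    owner: str
    due: str  # ISO date, e.g. "2026-06-20"


@dataclass
class PostMortem:
    title: str
    severity: str  # "SEV1" to "SEV4"
    duration: str  # already calculated, e.g. "3h 42m"
    impact: str
    timeline: list[tuple[str, str]]  # (timestamp, event), one timezone
    root_cause: str
    contributing_factors: list[str]
    action_items: list[ActionItem]


def render(pm: PostMortem) -> str:
    """Assemble the sections into one plain-text report."""
    lines = [
        f"Post-mortem: {pm.title} ({pm.severity})",
        "This review is blameless: it examines systems and conditions, not individuals.",
        "",
        "Summary",
        f"Impact: {pm.impact}",
        f"Time to resolution: {pm.duration}",
        "",
        "Timeline",
    ]
    # Same-format timestamps in one timezone sort correctly as plain strings.
    lines += [f"{stamp}  {event}" for stamp, event in sorted(pm.timeline)]
    lines += ["", "Root cause", pm.root_cause, "", "Contributing factors"]
    lines += [f"- {factor}" for factor in pm.contributing_factors]
    lines += ["", "Action items"]
    lines += [
        f"{number}. {item.description} (owner: {item.owner}, due: {item.due})"
        for number, item in enumerate(pm.action_items, start=1)
    ]
    return "\n".join(lines)


# Hypothetical incident, used only to exercise the sketch.
pm = PostMortem(
    title="Checkout API outage",
    severity="SEV2",
    duration="3h 42m",
    impact="Checkout requests failed for some customers.",
    timeline=[
        ("2026-06-12 17:47", "Service restored"),
        ("2026-06-12 14:05", "Errors detected"),
    ],
    root_cause="A config change disabled connection pooling.",
    contributing_factors=["No alert on pool exhaustion", "No staging soak test"],
    action_items=[ActionItem("Add alert on connection pool", "@lee", "2026-06-20")],
)
report = render(pm)
print(report)
```

Keeping root cause and contributing factors as separate fields mirrors the separate sections in the output, so neither can be silently folded into the other.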

What makes the blameless approach different

The classic failure mode is an incident review that devolves into finding the person who made the “mistake” — which stops honest reporting cold. The blameless frame rests on a simple assumption: skilled engineers operating in a flawed system will make flawed decisions. The goal of the review is to make the system more robust, not to punish the human who triggered the failure.

Practically, that means:

Write timeline entries as facts, not accusations (“deployment pushed”, not “Alice pushed bad code”).
In the root cause section, keep asking “why?” until you reach a systemic issue (missing test, absent alert, unclear runbook), not a person’s name.
Use the contributing factors section for systemic gaps, and list them explicitly, because that is where the real prevention work happens.

The five sections that matter most

Timeline — every significant event in chronological order, with timestamps. This is the hardest section to write honestly and the most valuable for preventing recurrence.

Root cause — the single primary trigger. Keep it tight: one or two sentences.

Contributing factors — the latent conditions that let the incident happen or made it worse. This is usually the most instructive section: a missing alert, an untested failover path, a monitoring gap, an unclear ownership boundary.

What went well — necessary to balance the review and surface things worth reinforcing or documenting as institutional practice.

Action items — each item needs an owner and a due date. Items without both are aspirations, not plans.
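The owner-and-date rule for action items is easy to check mechanically. The following is a minimal sketch, assuming each action item is a dictionary with `description`, `owner` and `due` keys; those key names and the function name `incomplete_action_items` are hypothetical and not taken from the builder.

```python
from __future__ import annotations


def incomplete_action_items(items: list[dict]) -> list[str]:
    """Return one message per action item that lacks an owner or a due date."""
    problems = []
    for number, item in enumerate(items, start=1):
        # A key that is absent, None, or only whitespace counts as missing.
        missing = [
            key for key in ("owner", "due")
            if not str(item.get(key) or "").strip()
        ]
        if missing:
            label = item.get("description", "untitled")
            problems.append(f"Item {number} ({label}): missing {' and '.join(missing)}")
    return problems


items = [
    {"description": "Add alert on connection pool", "owner": "@lee", "due": "2026-06-20"},
    {"description": "Add staging soak test", "owner": "", "due": "2026-07-01"},
    {"description": "Document failover runbook"},
]
for problem in incomplete_action_items(items):
    print(problem)
# Item 2 (Add staging soak test): missing owner
# Item 3 (Document failover runbook): missing owner and due
```

Running a check like this before the review is published turns "aspirations, not plans" into a concrete list of items to fix.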

Tips and example

Distinguish cause from contributors: the root cause might be “a config change disabled connection pooling”, while contributing factors are “no alert on pool exhaustion” and “no staging soak test”. Fixing only the root cause leaves the gaps that will bite again.

Keep all timeline entries in a single timezone to avoid confusion during review.
Write action items as “Add alert on connection pool — @lee — due 2026-06-20” so ownership and timing are unambiguous.
Run the review blamelessly — the goal is a more resilient system, not a culprit.
Publish the post-mortem internally, even for minor incidents — a shared record builds institutional knowledge and normalises the review process across the team.
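The single-timezone tip can be applied automatically when timeline entries arrive from people in different regions. This is a sketch under assumptions: it supposes every timestamp is an ISO 8601 string carrying a UTC offset, it picks UTC as the shared timezone, and the function name `normalise_timeline` and the sample events are hypothetical.

```python
from datetime import datetime, timezone


def normalise_timeline(entries):
    """Convert (timestamp, event) pairs to UTC and sort them chronologically."""
    converted = []
    for stamp, event in entries:
        moment = datetime.fromisoformat(stamp)
        if moment.tzinfo is None:
            # Without an offset the true instant is ambiguous, so refuse to guess.
            raise ValueError(f"timestamp has no UTC offset: {stamp}")
        converted.append((moment.astimezone(timezone.utc), event))
    converted.sort(key=lambda pair: pair[0])
    return [(m.strftime("%Y-%m-%d %H:%M UTC"), e) for m, e in converted]


entries = [
    ("2026-06-12T15:10:00+00:00", "Config change rolled back"),
    ("2026-06-12T10:35:00-04:00", "Error rate rises"),
    ("2026-06-12T16:20:00+02:00", "Config change deployed"),
]
for stamp, event in normalise_timeline(entries):
    print(stamp, event)
# 2026-06-12 14:20 UTC Config change deployed
# 2026-06-12 14:35 UTC Error rate rises
# 2026-06-12 15:10 UTC Config change rolled back
```

Sorting after conversion matters: the local clock times above (15:10, 10:35, 16:20) would suggest a different order from the one in which the events actually happened.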