Debugging Analysis Code
Why analysts spend half their time debugging — and the systematic habits that make it less painful.
Most analysis bugs are not crashes. They're silent mistakes — numbers that look fine but are subtly wrong. Real analysts are not "people who never write bugs". They're "people who catch their own bugs early."
The taxonomy of analysis bugs
Loud bugs are easy — you see the traceback. The quiet ones are what bite you.
Habit 1 — Look at the data after every step
After any non-trivial line, print the result (its shape and a few rows) and check that it matches what you expect.
Knowing the row count before and after every transformation is the cheapest bug-detector in Pandas.
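A minimal sketch of this habit, assuming a small made-up orders table. The column names and the `report` helper are illustrative, not part of pandas:

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 10, 20, None],
    "amount": [25.0, 40.0, 15.0, 30.0],
})

def report(df, label):
    """Print the row/column count after a step and return df unchanged."""
    print(f"{label}: {df.shape[0]} rows x {df.shape[1]} cols")
    return df

orders = report(orders, "loaded")
with_customer = report(orders.dropna(subset=["customer_id"]), "after dropna")
large = report(with_customer[with_customer["amount"] > 20], "after filter")
print(large.head())  # eyeball a few rows, not just the counts
```

Because `report` returns its input, it can be wrapped around any step without changing the result.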
Habit 2 — Trust nothing about dtypes
Always check df.dtypes after loading. Strings that look like numbers are one of the most common silent failures.
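As a rough illustration, assume a "price" column that arrived as text (the data here is made up). Summing it does not raise an error, which is what makes the failure silent:

```python
import pandas as pd

# Hypothetical data: "price" was loaded as text rather than as numbers.
df = pd.DataFrame({"item": ["a", "b", "c"], "price": ["100", "250", "75"]})

print(df.dtypes)                   # "price" shows a text dtype, not int/float
concatenated = df["price"].sum()   # joins the strings instead of adding
print(repr(concatenated))

# Convert explicitly; errors="coerce" turns unparseable values into NaN
df["price"] = pd.to_numeric(df["price"], errors="coerce")
total = df["price"].sum()
print(total)
```

After converting with `errors="coerce"`, it is worth counting the resulting NaN values, since each one is an input that failed to parse.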
Habit 3 — Read the assumption, not just the code
When you write:
result = orders.merge(customers, on="customer_id")
what you mean is:
"Each order has exactly one customer; the result should have the same number of rows as
orders."
Encode that assumption:
before = len(orders)
result = orders.merge(customers, on="customer_id", validate="many_to_one")
assert len(result) == before, f"Row count changed: {before} -> {len(result)}"
If reality contradicts the assumption, you find out now, not after the chart is in your boss's deck.
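To see what a violated assumption looks like, here is a small sketch with a made-up customers table in which one key is duplicated. With `validate="many_to_one"`, pandas raises `pd.errors.MergeError` instead of silently adding rows:

```python
import pandas as pd

orders = pd.DataFrame({"order_id": [1, 2, 3], "customer_id": [10, 10, 20]})
# Hypothetical problem: customer 10 appears twice in the lookup table.
customers = pd.DataFrame({
    "customer_id": [10, 10, 20],
    "name": ["Ann", "Ann B.", "Bob"],
})

try:
    orders.merge(customers, on="customer_id", validate="many_to_one")
    caught = False
except pd.errors.MergeError as exc:
    caught = True
    print(f"Assumption violated: {exc}")

# Without validate, the same merge quietly grows the table.
unchecked = orders.merge(customers, on="customer_id")
print(len(orders), "->", len(unchecked))
```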
Habit 4 — Use .shape and .info() as constants
A 30-second .info() after loading catches almost every "wait, that column is the wrong type" bug.
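A short sketch of that post-load check, using inline CSV text as a stand-in for a real file (the contents are invented for illustration):

```python
import io
import pandas as pd

# Hypothetical CSV text; the stray "unknown" keeps "amount" from parsing as numeric.
csv_text = """order_id,amount,order_date
1,25.0,2024-01-05
2,unknown,2024-01-06
3,15.5,2024-01-07
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (rows, columns)
df.info()         # dtype and non-null count for every column
```

In the `.info()` output, "amount" is reported with a text dtype rather than float64, and "order_date" is also text because `read_csv` does not parse dates unless asked to.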
Habit 5 — Pin down weird rows
When something looks off, find the actual rows rather than guessing.
Don't reason abstractly — pull up the actual problem rows. Often they reveal the root cause immediately.
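One way to pull up the problem rows is a boolean mask per kind of suspicious value. The data and the thresholds below are assumptions chosen for illustration:

```python
import pandas as pd

# Hypothetical data with three different kinds of odd values.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "amount": [25.0, -40.0, 15.0, None, 9999.0],
})

negative = df[df["amount"] < 0]        # amounts that should never be negative
missing = df[df["amount"].isna()]      # rows with no amount at all
extreme = df[df["amount"] > 1000]      # 1000 is an arbitrary cutoff for this sketch

print(negative)
print(missing)
print(extreme)
```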
Habit 6 — Bisect the pipeline
When a 30-step pipeline produces the wrong answer, don't re-read all 30 steps. Print the intermediate output halfway. If it's right at step 15, the bug is in 16–30. If it's wrong at step 15, the bug is in 1–15. Bisect again.
This O(log n) approach beats O(n) re-reading every time.
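Bisecting is easier when the pipeline is a list of small functions. The sketch below assumes that structure; the step functions and the `run_until` helper are hypothetical names, and a real pipeline would have many more steps:

```python
import pandas as pd

# Hypothetical pipeline: each step takes a DataFrame and returns a DataFrame.
def drop_missing(df):
    return df.dropna(subset=["amount"])

def to_cents(df):
    return df.assign(amount=df["amount"] * 100)

def keep_positive(df):
    return df[df["amount"] > 0]

def add_tax(df):
    return df.assign(total=df["amount"] * 1.2)

steps = [drop_missing, to_cents, keep_positive, add_tax]

def run_until(df, steps, n):
    """Run only the first n steps and return the intermediate result."""
    for step in steps[:n]:
        df = step(df)
    return df

raw = pd.DataFrame({"amount": [10.0, None, -5.0, 20.0]})

# Inspect the midpoint; if it looks right, the bug is in the later half.
halfway = run_until(raw, steps, len(steps) // 2)
print(halfway)
```

If `halfway` is already wrong, call `run_until` again with a smaller `n`; otherwise move the cut point forward.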
Habit 7 — Reproduce in a minimal example
When stuck, recreate the problem with a tiny DataFrame.
If you can reproduce the bug in a 4-row example, you can solve it in a 4-row example. And when you ask a colleague for help, the 4-row example is what they need.
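As an example of what such a reproduction might look like, suppose a merge is losing rows and a stray space in a key is the suspect. The four-row tables below are invented to isolate just that:

```python
import pandas as pd

# Four made-up rows; "B2 " has a trailing space on purpose.
left = pd.DataFrame({"sku": ["A1", "B2 ", "C3", "D4"], "qty": [1, 2, 3, 4]})
right = pd.DataFrame({"sku": ["A1", "B2", "C3", "D4"], "price": [5.0, 6.0, 7.0, 8.0]})

merged = left.merge(right, on="sku")
print(len(left), "->", len(merged))   # one row goes missing

# Test the hypothesis: strip whitespace from the key and merge again.
left["sku"] = left["sku"].str.strip()
fixed = left.merge(right, on="sku")
print(len(left), "->", len(fixed))
```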
Habit 8 — Suspect chained indexing warnings
SettingWithCopyWarning is a real warning, not noise.
Use .loc[..., col] = value when you really mean to modify the original.
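A brief sketch of the difference, on made-up data. The chained form is left commented out because its exact warning depends on the pandas version:

```python
import pandas as pd

df = pd.DataFrame({"amount": [25.0, -40.0, 15.0], "flag": ["ok", "ok", "ok"]})

# Chained indexing: df[mask] builds a temporary object, so the assignment
# lands on that temporary, pandas warns, and df itself is left unchanged.
# df[df["amount"] < 0]["flag"] = "check"

# Single .loc call: the row mask and the column are given in one indexing
# operation, so the assignment modifies df itself.
df.loc[df["amount"] < 0, "flag"] = "check"
print(df)
```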
Habit 9 — Confirm with a different method
If your aggregation says revenue is $1.2M, double-check with a crude calculation, such as df["revenue"].sum(), to ensure the groupby didn't drop or duplicate rows.
The act of computing the same number two ways catches an astonishing number of bugs.
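A small sketch of the two-way check, with invented data in which one row has a missing group key. By default `groupby` excludes such rows (`dropna=True`), so the grouped total comes up short:

```python
import pandas as pd

# Hypothetical data: the third row has no region.
df = pd.DataFrame({
    "region": ["north", "south", None, "south"],
    "revenue": [100.0, 200.0, 50.0, 150.0],
})

by_region = df.groupby("region")["revenue"].sum()
grouped_total = by_region.sum()
crude_total = df["revenue"].sum()

if grouped_total != crude_total:
    print(f"Mismatch: grouped {grouped_total} vs crude {crude_total}")
    # Show the rows that the groupby left out.
    print(df[df["region"].isna()])
```

With real floating-point revenue figures, an exact `!=` comparison may be too strict; comparing within a small tolerance is a reasonable alternative.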
Habit 10 — Slow down
When the result looks wrong:
1. Re-read the question.
2. Re-look at the data shape, dtypes, sample rows.
3. Print the intermediate steps.
4. Reproduce in a mini example.
5. Then edit code.
Most bugs come from skipping step 1.
A debugging checklist
Check your understanding
Your sum of "price" is 6 digits long even though only a handful of rows exist. Most likely cause:
- A floating-point bug
- The data is corrupt
- The column is a string, and .sum() concatenated the values instead of adding them; check df.dtypes
- Pandas needs an upgrade
A merge unexpectedly multiplied your row count by 3. Best diagnostic step?
- Add more print statements throughout
- Re-run with a different package version
- Check whether the join key is duplicated on the right-hand table (df.duplicated(subset=...).sum()), and/or re-run the merge with validate="many_to_one" or similar
- Drop duplicates blindly
A 30-step pipeline produces a wrong answer. What's the most efficient debugging approach?
- Re-read every step
- Rewrite the pipeline
- Bisect: check the intermediate output halfway through; whichever half is wrong, bisect again
- Add more comments
Why is reproducing a bug in a 5-row "mini" DataFrame so valuable?
- It is faster than running on big data
- It avoids print spam
- Both of those, AND a mini-example forces you to truly understand the bug, and is the perfect format to ask someone for help
- It uses less memory