Working with Dates and Time
Time is the backbone of analytics. Truncating to periods, extracting parts, computing intervals, and building time series for trends — the date toolkit every analyst needs.
Almost every analytical question has a time dimension: revenue per month, signups this week, growth year over year. Dates are also where beginners stumble, because "group by month" is not as simple as it sounds. This page builds the date toolkit analysts use daily — truncating, extracting, and computing with dates — and shows how to assemble a proper time series.
The core move: truncate to a period
To summarize "by month", you do not group by the raw date — that would
give one group per day. You first truncate each date down to the
first of its month, so every day in March collapses to 2024-03-01. The
tool is date_trunc('month', d).
date_trunc is the backbone of every "by week / by month / by quarter"
report. Change the first argument to 'week', 'quarter', or 'year'
and the same query re-buckets accordingly.
Extract a part with EXTRACT / date_part
Sometimes you want a component — the year, the month number, the day of
week — rather than a truncated date. EXTRACT(part FROM d) (or
date_part) pulls it out. This is how you answer "which weekday is
busiest?" by grouping across all weeks.
Truncate vs. extract
Truncate keeps a date but flattens it to a period start — use it for trends over time (Jan, Feb, Mar...). Extract pulls out a number like the month or weekday — use it to compare across periods (all Mondays, every January). They answer different questions.
Date arithmetic and intervals
Dates support subtraction (giving a number of days) and addition of
INTERVALs. This powers "age", "days since", and "rolling last 30 days"
calculations.
Filling gaps: a complete time series
A subtle analytical trap: if no orders happened on some day, that day is
simply missing from a grouped result — which makes a chart lie by
skipping the gap. The fix is to generate a complete spine of dates with
generate_series (or range) and LEFT JOIN the data onto it, so empty
periods show up as zero.
March 2 and 4 had no orders, yet they appear with revenue = 0 — a
correct, gap-free series ready to chart. Skipping this step is one of the
most common ways an analyst's trend chart misleads.
Check your understanding
To report revenue per month, why group by date_trunc('month', order_date) instead of order_date?
Because order_date cannot be used in GROUP BY.
Grouping by the raw date makes one group per day; truncating to the month collapses all of a month's days into one group.
date_trunc sorts the rows.
date_trunc deletes days with no orders.
When would you use EXTRACT(dow FROM d) (day of week) rather than date_trunc?
To produce a continuous monthly trend line.
To compare activity across periods — e.g. which weekday is busiest, pooling all weeks together.
To keep each date unique.
To add 7 days to a date.
A daily revenue chart skips days that had no sales. What is the standard fix?
Insert fake orders into the table.
Generate a complete series of dates and LEFT JOIN the data onto it, using COALESCE to show 0 for empty days.
Use date_trunc to fill the gaps automatically.
Switch from SUM to AVG.
Dates handled, the last transformation skill is reshaping — turning long data into wide and back. That is the next page.
Conditional Logic
CASE expressions for bucketing, labeling, and conditional aggregation — including the powerful FILTER clause and SUM(CASE ...) pattern that builds pivot-style summaries.
Pivoting and Reshaping
Long vs. wide data, and how analysts move between them — manual pivots with conditional aggregation, DuckDB's PIVOT and UNPIVOT, and when each shape is the right one.