Aggregate Functions
Collapsing many rows into a single summary number — counts, sums, averages, minimums, and maximums — the foundation of every report.
So far every query returned roughly one result row per table row. Now we flip that: aggregate functions take many rows and boil them down to a single summary value. "How many customers?" "What is the total revenue?" "What is the average order size?" These are aggregate questions, and they are the heart of reporting.
From many rows to one number
The five core aggregates cover most needs:
| Function | Answers | Example |
|---|---|---|
COUNT(*) | How many rows? | number of employees |
SUM(col) | What is the total? | total payroll |
AVG(col) | What is the average? | average salary |
MIN(col) | What is the smallest? | lowest salary |
MAX(col) | What is the largest? | highest salary |
Counting rows
COUNT(*) counts rows. Run it and notice the result is a single row
with a single number — the whole table collapsed to a count:
Totals, averages, extremes
You can compute several aggregates in one query. Each one reads the same set of rows and produces its own summary:
One row, five numbers — a complete summary of the whole table. This single query would be a whole afternoon of manual spreadsheet work on a large dataset.
Aggregates over a filtered set
Aggregates respect WHERE. The filter runs first, then the
aggregate summarizes only the surviving rows. "Average salary in
Engineering" is just an average with a filter:
Two subtleties to remember
1. Aggregates ignore NULLs (except COUNT(*)). AVG, SUM,
MIN, MAX, and COUNT(column) skip rows where that column is
NULL. Only COUNT(*) counts every row regardless. This is usually
what you want — an unknown salary should not drag an average toward
zero.
2. COUNT(*) vs COUNT(column). COUNT(*) counts rows;
COUNT(phone) counts rows where phone is not NULL. The gap
between them tells you how many values are missing:
Four rows, but only two phones on file — the difference reveals the two missing values at a glance.
Check your understanding
What does an aggregate function like SUM or AVG do?
It returns one result row for each row in the table.
It combines many rows into a single summary value, such as a total or an average.
It deletes rows that match a condition.
It sorts the rows.
A table has 4 rows; the phone column is NULL in 2 of them. What do COUNT(*) and COUNT(phone) return?
Both return 4.
Both return 2.
COUNT(*) returns 4 (all rows); COUNT(phone) returns 2 (only non-NULL phones).
COUNT(*) returns 2; COUNT(phone) returns 4.
How does AVG(salary) treat rows where salary is NULL?
It counts them as 0, lowering the average.
It ignores those rows, averaging only the rows that have a salary value.
It raises an error.
It returns NULL for the whole query.
Practice challenge
The payments table has an amount column. Return a single row with the number of payments as n, their total as total, and their average as avg.
Working with NULL
The special value that means "unknown," why it behaves so strangely, and how to test for it correctly — a concept that trips up nearly every beginner.
Grouping Data with GROUP BY
The leap from one grand total to one summary per category — how GROUP BY splits rows into buckets and aggregates each one separately.