Grouping Data with GROUP BY
Learn how SQLite GROUP BY splits rows into categories so each category gets its own count, total, average, or other summary.
Aggregate functions can summarize a whole table into one answer. That is useful, but many real questions need one answer per category:
- How many orders did each customer place?
- What is the total sales amount for each product?
- What is the average salary in each department?
- How many visits happened in each month?
GROUP BY is how you ask SQLite to split rows into groups first, then summarize each group.
The big idea: split, then summarize
Without GROUP BY, an aggregate gives one summary for all rows. With GROUP BY, SQLite makes one group for each distinct value, then runs the aggregate once per group.
The result is not one average salary. It is one average salary per department.
Your first GROUP BY query
Start with a small employee table. The query below counts employees and computes average salary for each department.
Read the query out loud like this:
For each
dept, count the rows and average the salaries.
The dept column appears in the result because it labels each group. Without it, you would see numbers but not know which department they describe.
One result row per group
A group is a bucket of rows that share the same grouped value. If the table has three departments, the grouped query returns three rows.
There are six ticket rows, but only three priority groups: High, Medium, and Low.
What does GROUP BY priority do before COUNT(*) runs?
It deletes duplicate priorities from the table.
It puts rows with the same priority value into the same group.
It sorts the table permanently by priority.
It changes all priorities to the same value.
The golden rule of GROUP BY
When a query uses GROUP BY, each result row represents a whole group. That creates an important rule:
Every expression in
SELECTmust either be part of theGROUP BYor be inside an aggregate function.
This is easier to understand with a question: if one output row represents all Engineering employees, which single name should SQLite show for that row? Alice? Bob? Someone else?
A grouped result row needs values that make sense for the whole group.
SQLite note
Some databases reject ungrouped columns immediately. SQLite may allow them in some grouped queries, but the value can be arbitrary and misleading. For clear beginner-friendly SQL, follow the golden rule: select grouped columns or aggregate results.
Grouping and summing
GROUP BY is often paired with SUM for category totals.
This is the pattern behind many reports: category on the left, summary number on the right.
Grouping by more than one column
You can group by multiple columns. The group becomes each unique combination of those column values.
For example, GROUP BY region, year means:
Make one group for each region-and-year pair.
Each result row is one distinct (region, year) combination.
When you write GROUP BY region, year, what does each output row represent?
One row per region only.
One row per year only.
One row per unique combination of region and year.
One row per original sale.
WHERE happens before GROUP BY
You can filter rows before they are grouped. This is useful when only some rows should enter the summary.
In SQLite, use INTEGER values like 1 and 0 for true/false flags.
The intern row is removed first. Then SQLite groups the remaining rows by department.
GROUP BY changes the meaning of SELECT
Before grouping, SELECT name, salary means "show me each row's name and salary."
After grouping, SELECT dept, COUNT(*) means "show me each group's label and summary."
That mental shift is the tricky part: the output rows are no longer individual original rows. They are summary rows.
The result does not show Luna, Milo, or Nori as separate rows. It shows one row per species.
Check your understanding
What is the main purpose of GROUP BY?
To permanently rearrange rows in a table.
To split rows into groups so aggregates can be calculated separately for each group.
To remove all NULL values from a table.
To create a new table automatically.
In SELECT dept, COUNT(*) FROM employees GROUP BY dept;, why is dept allowed in SELECT?
Because text columns are always allowed in grouped queries.
Because dept is the grouped column, so each output row has one department value.
Because COUNT(*) changes dept into a number.
Because dept is the first column in the table.
Which query best answers "How many orders did each customer place?"?
SELECT COUNT(*) FROM orders;
SELECT customer, COUNT(*) FROM orders GROUP BY customer;
SELECT customer FROM orders;
SELECT customer, amount FROM orders ORDER BY amount;
What happens first in a query that has both WHERE and GROUP BY?
GROUP BY forms groups first, then WHERE filters those groups.
WHERE filters individual rows first, then GROUP BY groups the remaining rows.
SELECT creates aliases first, then WHERE and GROUP BY run.
They happen in random order.