GROUP BY
groups the rows that have the same values in some columns.
In this example, we group all students with the same last name and count them for individual unique last names
SELECT last_name, COUNT(*)
FROM students
GROUP BY last_name
Calculates average points for each unique last name
SELECT last_name, AVG(points)
FROM students
GROUP BY last_name
Returns unique last name and first name, meaning, if there are two rows with the exact same first names and last names, it will be appeared once
SELECT
first_name,
last_name
FROM students
GROUP BY first_name, last_name
The query returns the same result using
DISTINCT
keyword as the above query
SELECT DISTINCT
first_name,
last_name
FROM students
If we want to filter the aggregated data, it's not possible to use
WHERE
clause, because
WHERE
runs before running
GROUP BY
clause
SELECT
last_name,
AVG(points) average_points
FROM students
WHERE average_points > 70
GROUP BY last_name
The query gives the following error:
Unknown column 'average_points' in 'where clause'
Instead, we need to use
HAVING
because it's specifically designed to filter data after running
GROUP BY
clause:
SELECT
last_name,
AVG(points) average_points
FROM students
GROUP BY last_name
HAVING average_points > 70