Advanced SQL
Aggregate Functions
Aggregation in SQL are a standard and very useful extension of FOL.
Aggregate (column) functions are introduced to:
- Count the number of tuples in a relation;
- Sum values of a numeric attribute over a relation; and
- Find minimum or maximum values of a numeric attribute over a relation.
They can apply to groups of tuples that have equal values for selected attributes.
The SQL COUNT()
, AVG()
and SUM()
Functions are super useful.
However, be super careful, you probably need to do a GROUP BY
afterwards.
The general syntax is the following:
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s)
LIMIT X HAVING X;
More covered below.
Group By
Order of operations:
- Run
<FROM-WHERE>
- Run
GROUP BY
to partition the results of<FROM-WHERE>
- On on each of these partitioned groups, apply the aggregate functions.
- For each group, add a tuple with the grouping attribute values and the results of the aggregate functions to the result.
https://www.w3schools.com/sql/sql_groupby.asp
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
COUNT(CustomerID) | Country |
---|---|
3 | Argentina |
2 | Austria |
2 | Belgium |
9 | Brazil |
3 | Canada |
2 | Denmark |
2 | Finland |
Without group by, you would have this
SELECT COUNT(CustomerID), Country
FROM Customers
COUNT(CustomerID) | Country |
---|---|
91 | Germany |
So it seems to just give a country?
Danger
When you use aggregate functions like
COUNT()
,SUM()
,AVG()
, etc., along with other columns that are not part of the aggregate function, you need to use theGROUP BY
clause.
Here’s an example where GROUP BY
doesn’t work:
- For each author, a count of the number of article pages.
select author, sum(endpage-startpage+1) as pcnt \
from wrote, article \
where publication = pubid \
group by author
AUTHOR PCNT
----------- -----------
1 12
1 record(s) selected.
Not quite correct: Author 2 should be reported as having 0 pages. This is because of the way the cross join works, see SQL.
Having
https://www.w3schools.com/sql/sql_having.asp
The HAVING
clause was added to SQL because the WHERE
keyword cannot be used with aggregate functions.
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s)
LIMIT X HAVING X;
The following SQL statement lists the number of customers in each country. Only include countries with more than 5 customers:
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 5;
All publications with more than one author.
select publication, count(author) as acnt \
from wrote \
group by publication \
having count(author) > 1
PUBLICATION ACNT
----------- -----------
2 2
1 record(s) selected.
SQL Order By
Syntax:
ORDER BY e1 [Dir1], . . . , ek [Dirk]
where `Diri is either ASC or DESC.
ASC
is the default
Warning
If you use
ORDER BY
, make sure that the column name is correct first. Ideally, you should rename it to prevent errors. Using the aggregate function name won’t work, because it will just be relabeled to a column number.
For example,
select author, MAX(publication) as maximum \
from wrote \
group by author \
order by maximum DESC
SELECT * FROM Customers
ORDER BY Country DESC;
You can also order by several columns
SELECT * FROM Customers
ORDER BY Country, CustomerName;
SQL Limit
The number of results of a query can be limited by appending a LIMIT
clause to a query. You can also use the OFFSET
keyword to skip the first e2
answers.
<query> LIMIT e1 [ OFFSET e2 ]
List at most the first two entries of a sorted list of authors by name.
SELECT DISTINCT * FROM author \
ORDER BY name \
LIMIT 2
List at most the next two entries following the second entry of a sorted list of authors by name.
SELECT DISTINCT * FROM author \
ORDER BY name \
LIMIT 2 OFFSET 2
SQL Join
See SQL Joins.
Extension of syntax for the FROM
clause:
FROM T1 <j-type> JOIN T2 ON <cond>
- where
<j-type>
is one ofFULL
,LEFT
,RIGHT
, orINNER
Example
select aid, publication \
from author left join wrote on aid=author
Counting with outer join: Author ids and a count of the number of publications for each.
select aid, count(publication) as pcnt \
from author left join wrote on aid = author \
group by aid
SQL Multisets
The Answers to SELECT blocks are actually multisets of tuples (remember, this is how we can apply aggregate functions). Therefore, we can actually apply multiset operations on them.
- Multiset union:
UNION ALL
.- behaves like “∨” in range restricted RC with multiset semantics
- Multiset difference:
EXCEPT ALL
.- behaves like “∧¬” in range restricted RC with multiset semantics
- Multiset intersection:
INTERSECT ALL
.- maximum number of tuples common to and
- rarely used