# Advanced SQL

## Aggregate Functions

Aggregation in SQL are a standard and very useful extension of FOL.

Aggregate (column) functions are introduced to:

- Count the number of tuples in a relation;
- Sum values of a numeric attribute over a relation; and
- Find minimum or maximum values of a numeric attribute over a relation.

They can apply to groups of tuples that have equal values for selected attributes.

The SQL `COUNT()`

, `AVG()`

and `SUM()`

Functions are super useful.

However, be super careful, you probably need to do a `GROUP BY`

afterwards.

The general syntax is the following:

More covered below.

#### Group By

Order of operations:

- Run
`<FROM-WHERE>`

- Run
`GROUP BY`

to partition the results of`<FROM-WHERE>`

- On on each of these partitioned groups, apply the aggregate functions.
- For each group, add a tuple with the grouping attribute values and the results of the aggregate functions to the result.

https://www.w3schools.com/sql/sql_groupby.asp

COUNT(CustomerID) | Country |
---|---|

3 | Argentina |

2 | Austria |

2 | Belgium |

9 | Brazil |

3 | Canada |

2 | Denmark |

2 | Finland |

Without group by, you would have this

COUNT(CustomerID) | Country |
---|---|

91 | Germany |

So it seems to just give a country?

Danger

When you use aggregate functions like

`COUNT()`

,`SUM()`

,`AVG()`

, etc., along with other columns that are not part of the aggregate function, you need to use the`GROUP BY`

clause.

Here’s an example where `GROUP BY`

**doesn’t work**:

- For each author, a count of the number of article pages.

```
AUTHOR PCNT
----------- -----------
1 12
1 record(s) selected.
```

Not quite correct: Author 2 should be reported as having 0 pages. This is because of the way the cross join works, see SQL.

#### Having

https://www.w3schools.com/sql/sql_having.asp

The `HAVING`

clause was added to SQL because the `WHERE`

keyword cannot be used with aggregate functions.

The following SQL statement lists the number of customers in each country. Only include countries with more than 5 customers:

All publications with more than one author.

```
PUBLICATION ACNT
----------- -----------
2 2
1 record(s) selected.
```

## SQL Order By

Syntax:

where `Diri is either ASC or DESC.

`ASC`

is the default

Warning

If you use

`ORDER BY`

, make sure that the column name is correct first. Ideally, you should rename it to prevent errors. Using the aggregate function name won’t work, because it will just be relabeled to a column number.

For example,

You can also order by several columns

## SQL Limit

The number of results of a query can be limited by appending a `LIMIT`

clause to a query. You can also use the `OFFSET`

keyword to skip the first `e2`

answers.

List at most the first two entries of a sorted list of authors by name.

List at most the next two entries following the second entry of a sorted list of authors by name.

## SQL Join

See SQL Joins.
Extension of syntax for the `FROM`

clause:

- where
`<j-type>`

is one of`FULL`

,`LEFT`

,`RIGHT`

, or`INNER`

Example

Counting with outer join: Author ids and a count of the number of publications for each.

## SQL Multisets

The Answers to SELECT blocks are actually multisets of tuples (remember, this is how we can apply aggregate functions). Therefore, we can actually apply multiset operations on them.

- Multiset union: $Q_{1}$
`UNION ALL`

$Q_{2}$.- behaves like “∨” in range restricted RC with multiset semantics

- Multiset difference: $Q_{1}$
`EXCEPT ALL`

$Q_{2}$.- behaves like “∧¬” in range restricted RC with multiset semantics

- Multiset intersection: $Q_{1}$
`INTERSECT ALL`

$Q_{2}$.- maximum number of tuples common to $Q_{1}$ and $Q_{2}$
- rarely used