Query Cost

Query cost is the DBMS’s estimate of how much work a query plan will take. Introduced in ECE459 L19. The dominant term is disk I/O; CPU time is nonzero but ignored for simplicity, as in [SKS11].

Why estimate at all?

The optimizer needs a number to compare alternative plans. Actually running a query to measure it defeats the point of planning, so we guess from metadata and statistics.

Two measurables of interest:

Block transfers: data moved in or out of memory
Disk seeks: repositioning on the disk

[SKS11] estimates assume the worst case: only one block per table fits in memory at a time. If reality is better, the actual cost is lower than estimated, which is the less dangerous direction to be wrong.

Things the estimate doesn’t capture

System load (concurrent operations may block)
Buffer contents (data already in memory skips disk ops)
Data layout (packed data means fewer seeks, distributed data enables parallel reads)

The lowest-cost plan is not necessarily the fastest in wall-clock terms. The DB optimizer is a “hypermiler”: it minimizes disk reads, even if a path with more reads but more parallelism would finish sooner.

Metadata used in estimates [SKS11, EN11]

$n_{r}$ : rows in table $r$
$b_{r}$ : blocks containing $r$
$l_{r}$ : bytes in $r$ , equal to blocks times block size
$f_{r}$ : rows per block, equal to $n_{r} / b_{r}$
$V (A, r)$ : distinct values of attribute $A$ in $r$ . Equals $n_{r}$ when $A$ is a key
$h_{r, i}$ : height of index $i$ on $r$

Histograms divide a value range into buckets with counts. Distributions are rarely even. For “salary ≥ $100 k ", i f t h e 100 k + b u c k e t h o l d s 10 t u pl es, t h e an s w er i s a t m os t 10. F or ab o u n d a ry in s i d e ab u c k e t (e . g . s a l a ry >$ 150k), interpolate within the bucket.

Stats can go stale between updates, but “close enough, more often than not” suffices. The optimization only needs to beat doing nothing.

🛠️ Steven Gong

Table of Contents

Query Cost

Things the estimate doesn’t capture

Metadata used in estimates [SKS11, EN11]

Graph View

Backlinks

🛠️ Steven Gong

Table of Contents

Query Cost

Things the estimate doesn’t capture

Metadata used in estimates [SKS11, EN11]

Related

Graph View

Backlinks