🛠️ Steven Gong

Search

Ranked Retrieval
Related

Dec 16, 2024, 1 min read

Ranked Retrieval

This builds upon Boolean Retrieval.

Similar to Boolean Retrieval, you can do this 2 ways:

Document-at-a-time
Term-at-a-time

Document-At-A-Time For each document:

Score each query term, and add them all up
Accumulate best k hits
A min-heap is good for this

PRO: time is O(n log k), memory is O(k) – k probably a constant CON: can’t terminate early, must look at whole document collection

Term-At-A-Time

Collect hits and ranking for rarest term into accumulator
For each other term in the query:
If a document does NOT have that term, remove from accumulator
Otherwise, add next term’s ranking to overall ranking

PRO: Can have early termination heuristics, will not normally need to traverse all documents

CON: uses a lot of memory

Related

Page Rank

Graph View

Backlinks

Boolean Retrieval

Created with Quartz, © 2025

Blog
LinkedIn
Twitter
GitHub