Retrieval-Augmented Generation (RAG)

3 steps:

  1. Indexing
  2. Retrieval
  3. Generation
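
The three steps can be sketched end to end as below. This is a toy under stated assumptions, not a real pipeline: `embed` is a bag-of-words stand-in for an embedding model, the index is a plain list instead of a vector store, and generation stops at building the prompt that would be sent to an LLM. All function names here are made up for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Indexing: store each document next to its vector.
def build_index(docs):
    return [(doc, embed(doc)) for doc in docs]

# 2. Retrieval: rank stored documents by similarity to the query.
def retrieve(index, query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# 3. Generation: put the retrieved text into the prompt for the LLM.
def build_prompt(query, context_docs):
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "RAPTOR builds a tree of summaries over document chunks",
    "ColBERT scores queries against documents token by token",
    "HyDE embeds a hypothetical answer instead of the query",
]
index = build_index(docs)
question = "how does ColBERT score documents"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real setup the prompt would be passed to an LLM call, and `embed` plus the list would be replaced by an embedding model and a vector store.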

Some common techniques and terminology worth knowing:

Query Construction (how to augment the initial query to get better search results)

  • Multi-query: rewrite the query N ways with an LLM, run retrieval once per rewrite, then combine the results
  • RAG-Fusion: builds on multi-query by adding a ranking step, Reciprocal Rank Fusion (RRF), that merges the N result lists into one ranking
  • Query Translation
    • Sub-question
    • Step-back
    • Using an LLM to generate a query in a domain-specific language (usually in the form of JSON, similar to routing?)
  • HyDE
  • Routing
    • This is where I was introduced to structured outputs, where you specify the schema the LLM's response must follow
    1. Logical routing
    2. Semantic routing
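
The RRF step used by RAG-fusion is small enough to sketch directly. Each document gets 1 / (k + rank) from every result list it appears in, and the sums decide the final order. The function name, the document ids, and the three example rankings are made up for illustration; k = 60 is the commonly used default, assumed here rather than taken from these notes.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first.

    rankings: one ranked list per query rewrite (best match first).
    k: smoothing constant; 60 is a common default (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# Hypothetical results for three rewrites of the same question.
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)  # ['doc_b', 'doc_a', 'doc_c', 'doc_d']
```

doc_b wins because it is ranked first in two of the three lists, while doc_d appears only once and ends up last.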

Indexing (this is the part I don’t fully understand yet)

  • Multi-representation
  • RAPTOR
  • ColBERT (you can use the RAGatouille library for this)
  • Active RAG
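
To make the ColBERT entry a bit more concrete: instead of one vector per document, it keeps one vector per token and scores with "late interaction" (MaxSim). Each query token takes its best match among the document tokens, and those best matches are summed. The sketch below shows only that scoring rule on hand-written 2-d vectors. It is not the RAGatouille API, and the vectors and function names are made up for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take the
    highest similarity to any document token vector, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy token embeddings (hypothetical; a real model produces these).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_1 = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]  # has an exact match per query token
doc_2 = [[0.6, 0.8], [0.8, 0.6]]              # only partial matches

score_1 = maxsim_score(query, doc_1)
score_2 = maxsim_score(query, doc_2)
print(score_1, score_2)
```

doc_1 scores higher because every query token finds an exact counterpart in it. That token-level matching is what a single pooled document vector would lose.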