Feb 11, 2026, 1 min read

Multi-Query Attention (MQA)

Multi-head attention runs multiple attention layers (heads) in parallel, each with its own linear transformations of the queries, keys, values, and outputs. Multi-query attention is identical except that all heads share a single set of keys and values.

  • https://paperswithcode.com/method/multi-query-attention
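To make the key/value sharing concrete, here is a minimal PyTorch sketch (the `MultiQueryAttention` class, its parameter names, and the toy dimensions are illustrative assumptions, not code from the linked page): every head keeps its own query projection, while a single key projection and a single value projection are broadcast across all heads.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiQueryAttention(nn.Module):
    """Illustrative multi-query attention: per-head queries, one shared K/V head."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        # Queries get a projection per head, as in standard multi-head attention.
        self.q_proj = nn.Linear(d_model, d_model)
        # Keys and values are projected once and shared by every head.
        self.k_proj = nn.Linear(d_model, self.head_dim)
        self.v_proj = nn.Linear(d_model, self.head_dim)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        # Queries: (B, n_heads, T, head_dim)
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        # Keys/values: (B, 1, T, head_dim) -- the single head, broadcast over query heads
        k = self.k_proj(x).unsqueeze(1)
        v = self.v_proj(x).unsqueeze(1)
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        return self.out_proj(out)

x = torch.randn(2, 16, 64)                      # (batch, seq_len, d_model)
mqa = MultiQueryAttention(d_model=64, n_heads=8)
print(mqa(x).shape)                             # torch.Size([2, 16, 64])
```

Because only one key head and one value head are stored per layer, the KV cache during autoregressive decoding shrinks by roughly a factor of n_heads compared with standard multi-head attention, which is the main practical motivation for MQA.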

Related

  • Grouped Query Attention
