Multi-Head Latent Attention (MLA)
Heard about this from a Lex Fridman podcast: https://www.youtube.com/watch?v=PncVSWbxdWU
Damnnn DeepSeek was behind MLA!
https://www.youtube.com/watch?v=0VLAoVGf_74
Check out this guy https://fxmeng.github.io/#two