🛠️ Steven Gong


Sep 03, 2025, 1 min read

Fully Sharded Data Parallel (FSDP)

https://engineering.fb.com/2021/07/15/open-source/fsdp/

The PyTorch team also published a paper, "PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel", which is worth reading if you want to understand in detail how FSDP works.
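As a rough intuition for what "fully sharded" means, here is a toy single-process sketch of the communication pattern: each rank stores only a shard of the flattened parameters, gathers the full set when it needs to compute, and receives only its own shard of the averaged gradients. This is an illustration under simplifying assumptions, not PyTorch's implementation. The helper names (`shard`, `all_gather`, `reduce_scatter`) and the values are hypothetical, and "ranks" are simulated as list indices with no real distributed backend.

```python
# Toy single-process simulation of the FSDP communication pattern.
# "Ranks" are just list indices; no real distributed backend is used.

def shard(flat_params, world_size):
    """Split a flat parameter list into equal shards, zero-padding the tail."""
    shard_len = -(-len(flat_params) // world_size)  # ceiling division
    padded = flat_params + [0.0] * (shard_len * world_size - len(flat_params))
    return [padded[r * shard_len:(r + 1) * shard_len] for r in range(world_size)]

def all_gather(shards, numel):
    """Rebuild the full parameter list from all shards, dropping the padding."""
    full = [x for s in shards for x in s]
    return full[:numel]

def reduce_scatter(per_rank_grads, world_size):
    """Average full gradients across ranks, then keep one shard per rank."""
    numel = len(per_rank_grads[0])
    mean = [sum(g[i] for g in per_rank_grads) / world_size for i in range(numel)]
    return shard(mean, world_size)

world_size = 2
params = [1.0, 2.0, 3.0, 4.0, 5.0]

shards = shard(params, world_size)        # each rank stores about 1/world_size of the parameters
full = all_gather(shards, len(params))    # full parameters exist only around forward/backward
grads = [[1.0] * 5, [3.0] * 5]            # hypothetical per-rank gradients from different batches
grad_shards = reduce_scatter(grads, world_size)  # each rank keeps only its gradient shard
```

The point of the sketch is the memory trade-off: parameters and gradients are stored sharded, and the full tensors are materialized only temporarily, at the cost of extra communication compared with DDP.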

Related

  • DDP
