Revision history of "PyTorch FSDP vs DDP"

From Ben's Personal Wiki

  • 19:49, 6 June 2024, Ben (411 bytes) (+31)
  • 19:47, 6 June 2024, Ben (380 bytes) (+380) (Created page with "I observed that PyTorch's FSDP with NO_SHARD is substantially faster and slightly more memory-efficient in a multi-node setting than DDP. Apparently this comes down to differe...")