The Definitive Guide toAI Data Centers
Ask the Guide
GuideGlossaryFSDP

FSDP · Fully Sharded Data Parallel

A training method that shards model parameters, gradients and optimizer state across GPUs to fit larger models.

← All terms