The initial, compute-heavy phase that trains a model on vast unlabeled data to learn general capabilities.
← All terms