How long an inference request waits before the first output token appears; a key latency SLO set by prefill.
← All terms