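Splitting a long prompt's prefill into smaller chunks that are scheduled alongside the decode steps of other requests, so a single long prompt does not stall token generation and per-token latency stays steady under load.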
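
A minimal sketch of the idea, not any particular serving engine's scheduler: each step has a fixed token budget, running requests decode one token each first, and whatever budget is left goes to the next chunk of a waiting prompt. The names `Request`, `schedule`, and `token_budget` are hypothetical, and the model is simplified (it assumes `max_new_tokens >= 1` and ignores the token a real engine emits at the end of prefill).

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    prompt_len: int       # prompt tokens that must be prefilled
    max_new_tokens: int   # tokens to generate afterwards (assumed >= 1)
    prefilled: int = 0
    decoded: int = 0


def schedule(requests, token_budget):
    """Return a plan: one list of (name, phase, n_tokens) per step."""
    if token_budget <= 0:
        raise ValueError("token_budget must be positive")
    waiting = deque(requests)  # prompts not fully prefilled yet
    running = []               # prefill done, now decoding
    plan = []
    while waiting or running:
        step, budget = [], token_budget
        # Decode first: one token per running request, so ongoing
        # generations are not starved by a long prompt.
        for r in running:
            if budget == 0:
                break
            r.decoded += 1
            budget -= 1
            step.append((r.name, "decode", 1))
        running = [r for r in running if r.decoded < r.max_new_tokens]
        # Spend the leftover budget on prefill chunks.
        promoted = []
        while budget > 0 and waiting:
            r = waiting[0]
            n = min(budget, r.prompt_len - r.prefilled)
            r.prefilled += n
            budget -= n
            step.append((r.name, "prefill", n))
            if r.prefilled == r.prompt_len:
                waiting.popleft()
                promoted.append(r)  # starts decoding on the next step
        running.extend(promoted)
        plan.append(step)
    return plan


if __name__ == "__main__":
    demo = [Request("A", prompt_len=2, max_new_tokens=3),
            Request("B", prompt_len=10, max_new_tokens=2)]
    for i, step in enumerate(schedule(demo, token_budget=4)):
        print(i, step)
```

In the demo, B's 10-token prompt is prefilled over several steps while A keeps decoding one token per step, instead of A waiting for B's whole prefill to finish.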