A vLLM technique that manages the KV cache in fixed-size pages (blocks), much like virtual memory in an operating system, reducing wasted memory and fragmentation.
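
To illustrate the paging idea, here is a minimal sketch of a per-sequence block table that maps logical token positions to physical cache blocks. It is not vLLM's actual code: the names `BlockAllocator`, `Sequence`, and `BLOCK_SIZE`, and the block size of 16 tokens, are assumptions chosen for the example.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 16  # tokens per block; illustrative value, not taken from vLLM


@dataclass
class BlockAllocator:
    """Hands out physical block ids from a fixed pool (hypothetical helper)."""
    num_blocks: int
    free: list[int] = field(init=False)

    def __post_init__(self) -> None:
        self.free = list(range(self.num_blocks))

    def allocate(self) -> int:
        if not self.free:
            raise MemoryError("no free KV cache blocks")
        return self.free.pop()

    def release(self, block_id: int) -> None:
        self.free.append(block_id)


@dataclass
class Sequence:
    """Holds one request's block table: logical block index -> physical block id."""
    allocator: BlockAllocator
    num_tokens: int = 0
    block_table: list[int] = field(default_factory=list)

    def append_token(self) -> None:
        # Allocate a new physical block only when the last one is full.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.allocate())
        self.num_tokens += 1

    def locate(self, token_idx: int) -> tuple[int, int]:
        # Translate a logical token position into (physical block id, offset).
        if not 0 <= token_idx < self.num_tokens:
            raise IndexError(token_idx)
        return self.block_table[token_idx // BLOCK_SIZE], token_idx % BLOCK_SIZE

    def free_all(self) -> None:
        # Return every block to the pool so other sequences can reuse it.
        for block_id in self.block_table:
            self.allocator.release(block_id)
        self.block_table.clear()
        self.num_tokens = 0
```

In this sketch a sequence never reserves memory for tokens it has not produced, so the unused space per sequence is at most `BLOCK_SIZE - 1` token slots, and the physical blocks of one sequence do not need to be contiguous.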