A workload limited by how fast data moves from memory rather than by compute; typical of inference decode.
← All terms