Test-time compute
Spending extra inference compute (e.g. chain-of-thought reasoning) to improve answers, shifting cost from training to serving.
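One way to picture the trade-off in the definition above is a toy simulation, sketched here under loud assumptions. It uses majority voting over several sampled answers (often called self-consistency) rather than chain-of-thought itself, because voting is easy to simulate without a model. The names `noisy_solver`, `majority_vote`, and `accuracy`, the 60% per-sample success rate, and the answer strings are all hypothetical and exist only for illustration.

```python
import random
from collections import Counter


def noisy_solver(rng, correct="42", p_correct=0.6):
    """Hypothetical stand-in for one sampled model answer.

    Returns the correct answer with probability p_correct (an assumed
    value), otherwise one of a few wrong answers chosen uniformly.
    """
    if rng.random() < p_correct:
        return correct
    return rng.choice(["41", "43", "44"])


def majority_vote(rng, n_samples):
    """Spend n_samples inference calls and return the most common answer."""
    answers = [noisy_solver(rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_samples, trials=2000, seed=0):
    """Estimate how often majority voting over n_samples is correct."""
    rng = random.Random(seed)
    hits = sum(majority_vote(rng, n_samples) == "42" for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Serving cost grows linearly with n; no retraining is involved.
    for n in (1, 5, 15):
        print(n, accuracy(n))
```

In this toy setup the estimated accuracy rises as more samples are drawn per question, while the cost per answer grows in proportion to the number of samples. Real models will not behave like this independent-samples simulation, so the numbers say nothing about any actual system.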