A popular technique to make AI more efficient has drawbacks
Quantization, a technique that makes AI models cheaper to run by representing their parameters in fewer bits, has limits, according to a recent study. Training a large model and then quantizing it is not always a win: past a point, it can be better to simply train a smaller model at higher precision. Training in lower-precision formats can make a model more robust to subsequent quantization, but precision cannot be cut without bound, because models have finite capacity. The broader implication is that inference costs cannot be driven down indefinitely without degrading model quality, and that new architectures designed around low-precision training may become more important in the future.
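To make the mechanism concrete, here is a minimal sketch, not taken from the study, of symmetric post-training quantization: weights stored as 32-bit floats are mapped to 8-bit integers plus a single scale factor, cutting memory roughly 4x at the cost of rounding error. The function names, tensor sizes, and quantization scheme are illustrative assumptions, not the study's method.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Illustrative symmetric post-training quantization of a float32
    weight tensor to int8. Returns the int8 tensor plus the scale
    needed to approximately recover the original values."""
    # Map the largest absolute weight onto the int8 range [-127, 127].
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate float weights; the rounding error left over
    # is the quality cost that quantization trades for efficiency.
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure the error introduced.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(256, 256)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()
print(f"int8 storage is ~4x smaller; mean absolute rounding error: {err:.2e}")
```

Roughly speaking, the rounding error in this sketch is what becomes harder to tolerate as a model packs more learned information into each weight, which is why quantizing a heavily trained large model can lose more than simply training a smaller one.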