Multiverse Computing Shrinks Llama AI Models by 80%, Boosting Efficiency
Summary
Multiverse Computing's CompactifAI technology compresses Llama AI models by 80% while preserving accuracy, yielding 60% fewer parameters, 84% greater energy efficiency, 40% faster inference, and a 50% cost reduction, opening new use cases for portable AI.
Key Points
- Multiverse Computing released compressed versions of Llama 3.1-8B and Llama 3.3-70B using CompactifAI, reducing their size by 80% with almost no loss in accuracy.
- Compared with the original models, the compressed versions have 60% fewer parameters, are 84% more energy efficient, run inference 40% faster, and cost 50% less to operate.
- CompactifAI is a proprietary AI compressor that uses quantum-inspired tensor networks to make AI systems more efficient and portable, enabling new use cases for AI models.
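CompactifAI itself is proprietary and its tensor-network method is not public, but the general idea behind this family of techniques can be sketched with a simple low-rank factorization: replace a large weight matrix with two much smaller factors. The sketch below uses a truncated SVD in NumPy; all sizes and names are illustrative assumptions, not details of Multiverse's implementation.

```python
import numpy as np

# Illustrative sketch only: a truncated-SVD low-rank factorization,
# the simplest relative of tensor-network compression. Not CompactifAI.

rng = np.random.default_rng(0)

# Toy "weight matrix" with low-rank structure plus noise, standing in
# for one layer of an LLM (real layers are far larger, e.g. 4096 x 11008).
d_out, d_in, true_rank = 256, 512, 16
W = rng.standard_normal((d_out, true_rank)) @ rng.standard_normal((true_rank, d_in))
W += 0.01 * rng.standard_normal((d_out, d_in))

# Factor W ~= A @ B, keeping only the top-k singular components.
k = 16
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * s[:k]   # shape (d_out, k)
B = Vt[:k, :]          # shape (k, d_in)

# Storing A and B instead of W means fewer parameters and cheaper matmuls.
orig_params = W.size
compressed_params = A.size + B.size
ratio = 1 - compressed_params / orig_params
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"parameters reduced by {ratio:.0%}, relative error {err:.4f}")
```

Because the toy matrix is nearly rank-16, the factorization cuts parameters by roughly 90% with a tiny reconstruction error; real networks have less ideal structure, which is why more expressive decompositions such as tensor networks are used in practice.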