Academic Talk (高群, June 14)
Accelerate AI with Intel AI Tools
Intel Neural Compressor is an innovative AI toolkit that continuously evolves and integrates state-of-the-art emerging technologies such as SmoothQuant and FP8 quantization. It provides unified interfaces for popular model compression techniques, including quantization, pruning, and knowledge distillation, across multiple deep learning frameworks. Intel Extension for Transformers is an innovative toolkit that accelerates Transformer-based models on Intel platforms by providing not only advanced compression and fine-tuning technologies but also its own proprietary sparsity-aware inference engine, Neural Engine. The Intel AI toolkits achieve remarkable speedups while maintaining similar or better accuracy (within 1% loss) across a suite of applied models, and have proven highly effective on a broad range of models (more than 10,000).
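To illustrate the unified compression interface mentioned above, the sketch below applies post-training INT8 quantization with Intel Neural Compressor's fit API. The model, calibration dataloader, and output path are placeholders, not code from the talk, and the exact configuration options may differ by toolkit version.

# Minimal post-training quantization sketch with Intel Neural Compressor (2.x-style API).
# `fp32_model` and `calib_loader` are placeholders: any supported framework model
# (e.g. a torchvision ResNet) and a calibration DataLoader for it.
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.quantization import fit

def quantize(fp32_model, calib_loader):
    # Default config requests INT8 post-training static quantization; without an
    # evaluation function, fit() quantizes once using the calibration data.
    conf = PostTrainingQuantConfig()
    q_model = fit(model=fp32_model, conf=conf, calib_dataloader=calib_loader)
    q_model.save("./quantized_model")  # placeholder output directory
    return q_model

The same toolkit exposes analogous configuration objects for pruning and knowledge distillation, so the workflow above carries over to those techniques with a different config class.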