Large language models (LLMs) have revolutionized language processing, delivering outstanding results across multiple applications. However, deploying LLMs on edge devices poses several challenges with ...
Abstract: Post-training neural network quantization (PTQ) is an effective model compression technology that has revolutionized the deployment of deep neural networks on various edge devices. It ...
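The mention of post-training quantization (PTQ) in the abstract above is easiest to picture with a tiny example. The sketch below shows generic per-tensor affine quantization of a weight matrix in NumPy; it illustrates only the basic idea behind PTQ (quantize already-trained weights, no retraining), not the method of any particular paper, and the function names (`quantize_affine`, `dequantize_affine`) and 8-bit setting are illustrative assumptions.

```python
# Minimal sketch of the basic idea behind post-training quantization (PTQ):
# map trained float weights to low-bit integers with an affine scale and
# zero-point, with no retraining. Illustrative only; real PTQ methods add
# calibration data, per-channel scales, and error-compensation steps.
import numpy as np

def quantize_affine(w: np.ndarray, num_bits: int = 8):
    """Per-tensor asymmetric quantization: q = clip(round(w / scale) + zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate case: constant tensor
    zero_point = int(round(qmin - w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_affine(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Reconstruct approximate float weights from the integer codes."""
    return (q.astype(np.float32) - zero_point) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.05, size=(256, 256)).astype(np.float32)  # stand-in weight matrix
    q, scale, zp = quantize_affine(w)
    w_hat = dequantize_affine(q, scale, zp)
    # Per-element reconstruction error is bounded by roughly scale / 2.
    print("scale:", scale, "zero_point:", zp)
    print("max abs error:", float(np.abs(w - w_hat).max()))
```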
Abstract: The use of large language models is widespread in a range of applications, including natural language processing and multimodal tasks. However, these models are computationally intensive.