Block-Based Programming Language Itchy

Research on efficient inference of large language model based on routing policy

Abstract: To balance high cost against high performance in large language model (LLM) inference scenarios, an adaptive routing strategy (MA-Router) with multi-modal attention ...
