Advanced Memory Compression Methods

7 days ago

Nvidia says it can shrink LLM memory 20x without changing model weights

Nvidia's KV Cache Transform Coding (KVTC) compresses LLM key-value cache by 20x without model changes, cutting GPU memory ...

17 days ago

New KV cache compaction technique cuts LLM memory 50x without accuracy loss

MIT researchers developed Attention Matching, a KV cache compaction technique that compresses LLM memory by 50x in seconds — without the hours of GPU training that prior methods required.

Nature

Video Compression Algorithms and Memory Efficiency

Video compression has become an essential technology to meet the burgeoning demand for high‐resolution content while maintaining manageable file sizes and transmission speeds. Recent advances in ...
