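Sohu
October
ICLR 2025 | Alibaba and others propose LLaVA-MoD, using MoE + distillation to train a lightweight multimodal large model
This article proposes LLaVA-MoD, a lightweight multimodal large model. It integrates a sparse mixture-of-experts (MoE) architecture to optimize the small model's network structure, and proposes a Dense-to-Sparse distillation framework that combines a two-stage distillation strategy (mimic distillation + preference distillation) to achieve comprehensive knowledge transfer. Using only 0.3% of the data and 23% of the activated parameters, the approach enables a 2B small model ...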