Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
By David Raposo, Sam Ritter, Alberto Santoro, and 3 more
Preprint
April 2, 2024
Journal
arXiv (Cornell University)
Peer Reviews
Ahan M R
almost 2 years ago
Topics
#computer-science
#machine-learning
#engineering
#artificial-intelligence
#voltage-1
DOI
10.48550/arxiv.2404.02258