Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Authors
David Raposo • Sam Ritter • 3 more • Alberto Santoro
Published
April 2, 2024
Journal
arXiv (Cornell University)
Topics
Computer Science
Machine Learning
Engineering
Artificial Intelligence
Transformer
DOI
10.48550/arxiv.2404.02258