Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Authors
David Raposo • Sam Ritter • 3 more • Alberto Santoro
Published
April 2, 2024
Journal
arXiv (Cornell University)
Topics
Computer Science
Machine Learning
Engineering
Artificial Intelligence
Transformer
DOI
10.48550/arxiv.2404.02258