Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

By David Raposo · Sam Ritter · Alberto Santoro

Preprint

April 2, 2024

Journal

arXiv (Cornell University)

Topics

#computer-science #machine-learning #engineering #artificial-intelligence #voltage-1

DOI

10.48550/arxiv.2404.02258