LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Authors
Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo Mundo, Mohammad Rastegari, Mehrdad Farajtabar
Published
Dec 12, 2023