Yadong Wen
ralphite
AI & ML interests
None yet
Organizations
None yet
systems
- Distributed Inference and Fine-tuning of Large Language Models Over The Internet
  Paper • 2312.08361 • Published • 27
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 31
- Efficient Memory Management for Large Language Model Serving with PagedAttention
  Paper • 2309.06180 • Published • 30
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
agi
finetune
applications
benchmark/eval
ai-misc