-
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 65 -
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Paper • 2408.12590 • Published • 36 -
Real-Time Video Generation with Pyramid Attention Broadcast
Paper • 2408.12588 • Published • 17 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 63
Collections
Discover the best community collections!
Collections including paper arxiv:2410.00907
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 41 -
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 118 -
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 48 -
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 42
-
Addition is All You Need for Energy-efficient Language Models
Paper • 2410.00907 • Published • 151 -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 626 -
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper • 2404.16710 • Published • 80 -
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Paper • 2405.08707 • Published • 34
-
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Paper • 2404.15653 • Published • 29 -
MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published • 15 -
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50 -
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 33
-
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 189 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 72 -
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper • 2403.13372 • Published • 173 -
InternLM2 Technical Report
Paper • 2403.17297 • Published • 34
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 57 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 52 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 44 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 63
-
How to Train Data-Efficient LLMs
Paper • 2402.09668 • Published • 42 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 74 -
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Paper • 2407.07523 • Published • 6 -
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
Paper • 2407.12327 • Published • 79
-
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Paper • 2405.08748 • Published • 24 -
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
Paper • 2405.10300 • Published • 30 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 132 -
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Paper • 2405.11143 • Published • 41
-
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Paper • 2401.18059 • Published • 47 -
Personalized Visual Instruction Tuning
Paper • 2410.07113 • Published • 70 -
Differential Transformer
Paper • 2410.05258 • Published • 179 -
What Matters in Transformers? Not All Attention is Needed
Paper • 2406.15786 • Published • 31
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 7 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 14 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69
-
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 65 -
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Paper • 2408.12590 • Published • 36 -
Real-Time Video Generation with Pyramid Attention Broadcast
Paper • 2408.12588 • Published • 17 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 63
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 57 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 52 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 44 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 63
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 41 -
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 118 -
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 48 -
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 42
-
How to Train Data-Efficient LLMs
Paper • 2402.09668 • Published • 42 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 74 -
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Paper • 2407.07523 • Published • 6 -
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
Paper • 2407.12327 • Published • 79
-
Addition is All You Need for Energy-efficient Language Models
Paper • 2410.00907 • Published • 151 -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 626 -
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper • 2404.16710 • Published • 80 -
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Paper • 2405.08707 • Published • 34
-
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Paper • 2405.08748 • Published • 24 -
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
Paper • 2405.10300 • Published • 30 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 132 -
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Paper • 2405.11143 • Published • 41
-
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Paper • 2404.15653 • Published • 29 -
MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published • 15 -
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50 -
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 33
-
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Paper • 2401.18059 • Published • 47 -
Personalized Visual Instruction Tuning
Paper • 2410.07113 • Published • 70 -
Differential Transformer
Paper • 2410.05258 • Published • 179 -
What Matters in Transformers? Not All Attention is Needed
Paper • 2406.15786 • Published • 31
-
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 189 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 72 -
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper • 2403.13372 • Published • 173 -
InternLM2 Technical Report
Paper • 2403.17297 • Published • 34
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 7 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 14 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69