Mixture of Experts
• BlackMamba: Mixture of Experts for State-Space Models (arXiv: 2402.01771)
• OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models (arXiv: 2402.01739)
• MoE-LLaVA: Mixture of Experts for Large Vision-Language Models (arXiv: 2401.15947)
• DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models (arXiv: 2401.06066)
• MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts (arXiv: 2401.04081)
• Mixtral of Experts (arXiv: 2401.04088)
• Scaling Laws for Fine-Grained Mixture of Experts (arXiv: 2402.07871)
• Mixtures of Experts Unlock Parameter Scaling for Deep RL (arXiv: 2402.08609)
• Multi-Head Mixture-of-Experts (arXiv: 2404.15045)
• Mixture-of-Depths: Dynamically allocating compute in transformer-based language models (arXiv: 2404.02258)
• DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (arXiv: 2405.04434)