Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows Paper • 2103.14030 • Published Mar 25, 2021
Enable Language Models to Implicitly Learn Self-Improvement From Data Paper • 2310.00898 • Published Oct 2, 2023
Toolformer: Language Models Can Teach Themselves to Use Tools Paper • 2302.04761 • Published Feb 9, 2023
Decision Transformer: Reinforcement Learning via Sequence Modeling Paper • 2106.01345 • Published Jun 2, 2021