BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution • Paper • 2510.08697 • Published Oct 9 • 35
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play • Paper • 2509.25541 • Published Sep 29 • 140
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning • Paper • 2506.24119 • Published Jun 30 • 50
Confident Splatting: Confidence-Based Compression of 3D Gaussian Splatting via Learnable Beta Distributions • Paper • 2506.22973 • Published Jun 28 • 3
Answer Matching Outperforms Multiple Choice for Language Model Evaluation • Paper • 2507.02856 • Published Jul 3 • 8
ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention • Paper • 2507.01004 • Published Jul 1 • 10