P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published 19 days ago • 132
Scaling Agent Learning via Experience Synthesis Paper • 2511.03773 • Published about 1 month ago • 80
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning Paper • 2510.25992 • Published Oct 29 • 44
Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published Oct 20 • 67
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published Oct 19 • 103
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 189
From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning Paper • 2509.23768 • Published Sep 28 • 48
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always! Paper • 2509.26495 • Published Sep 30 • 10
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28 • 173
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources Paper • 2509.21268 • Published Sep 25 • 103
HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark? Paper • 2509.07894 • Published Sep 9 • 31