Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published 10 days ago • 55
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents Paper • 2505.20411 • Published May 26, 2025 • 92
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5, 2025 • 232
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published Mar 3, 2025 • 38