Collections

Collections including paper arxiv:2510.08558

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 142
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 138
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 88

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published Oct 24 • 99
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

Famous Authors/Institute

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13 • 176
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 493

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents

Paper • 2301.12601 • Published Jan 30, 2023
Bayesian Risk Markov Decision Processes

Paper • 2106.02558 • Published Jun 4, 2021
Metrics for Markov Decision Processes with Infinite State Spaces

Paper • 1207.1386 • Published Jul 4, 2012

Learning from examples - training/inference

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2 • 80
A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning

Paper • 2510.01132 • Published Oct 1 • 5
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6 • 123
MixReasoning: Switching Modes to Think

Paper • 2510.06052 • Published Oct 7 • 21

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 493
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6 • 123

MacroBench: A Novel Testbed for Web Automation Scripts via Large Language Models

Paper • 2510.04363 • Published Oct 5
Control Plane as a Tool: A Scalable Design Pattern for Agentic AI Systems

Paper • 2505.06817 • Published May 11
Agentic Web: Weaving the Next Web with AI Agents

Paper • 2507.21206 • Published Jul 28
Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning

Paper • 2410.02052 • Published Oct 2, 2024 • 9

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
