Shizhe Diao

shizhediao2

https://shizhediao.github.io/

AI & ML interests

LLM pre-training and reasoning

Recent Activity

liked a dataset 17 days ago

nvidia/ProfBench

Organizations

upvoted a paper 14 days ago

VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection

Paper • 2603.00912 • Published 16 days ago • 36

upvoted a paper 19 days ago

Query-focused and Memory-aware Reranker for Long Context Processing

Paper • 2602.12192 • Published Feb 12 • 56

upvoted a paper 20 days ago

On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published 20 days ago • 95

upvoted an article 20 days ago

Article

Can Your LLM Think Like a Professional? Introducing ProfBench

Oct 28, 2025

upvoted 2 papers about 1 month ago

PhyCritic: Multimodal Critic Models for Physical AI

Paper • 2602.11124 • Published Feb 11 • 52

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 109

upvoted a paper 2 months ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 229

upvoted 2 papers 3 months ago

Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models

Paper • 2511.18890 • Published Nov 24, 2025 • 35

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 125

upvoted 5 papers 5 months ago

Unified Reinforcement and Imitation Learning for Vision-Language Models

Paper • 2510.19307 • Published Oct 22, 2025 • 32

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published Oct 16, 2025 • 18

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17, 2025 • 91

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13, 2025 • 181

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Paper • 2510.11769 • Published Oct 13, 2025 • 26

upvoted an article 5 months ago

Article

Finally, a Replacement for BERT: Introducing ModernBERT

Dec 19, 2024 • 735

upvoted 2 papers 6 months ago

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1, 2025 • 20

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 146

upvoted an article 6 months ago

Article

Introducing smolagents: simple agents that write actions in code.

Dec 31, 2024 • 1.18k

upvoted a paper 6 months ago

Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3, 2025 • 23

upvoted a paper 7 months ago

Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL

Paper • 2508.07976 • Published Aug 11, 2025 • 52
