Knowledge Engineering Group @ Tsinghua University

university

https://keg.cs.tsinghua.edu.cn/

THU-KEG

Recent Activity

Wesleythu submitted a paper 3 days ago

WildReward: Learning Reward Models from In-the-Wild Human Interactions

Wesleythu updated a collection 6 days ago


Papers

WildReward: Learning Reward Models from In-the-Wild Human Interactions

DeepPrune: Parallel Scaling without Inter-trace Redundancy

Wesleythu

submitted a paper to Daily Papers 3 days ago

WildReward: Learning Reward Models from In-the-Wild Human Interactions

Paper • 2602.08829 • Published 4 days ago • 3

Wesleythu

updated a collection 6 days ago

WildReward

Learning Reward Models from In-the-Wild Interactions • 3 items • Updated 6 days ago • 2

Wesleythu

updated 2 models 7 days ago

THU-KEG/WildReward-4B

Text Classification • 4B • Updated 6 days ago • 19 • 3

THU-KEG/WildReward-8B

Text Classification • 8B • Updated 6 days ago • 25 • 3

mozhu

authored a paper 9 days ago

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published 11 days ago • 228

NeoZ123

submitted a paper to Daily Papers about 1 month ago

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Paper • 2601.06021 • Published Jan 9 • 46

NeoZ123

published a dataset about 1 month ago

THU-KEG/CaRR-DeepDive

Preview • Updated Jan 11 • 244 • 1

NeoZ123

updated a dataset about 1 month ago

THU-KEG/CaRR-DeepDive

Preview • Updated Jan 11 • 244 • 1

mozhu

authored a paper 3 months ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 123

Kikkk

updated a dataset 4 months ago

THU-KEG/AgentIF

Viewer • Updated Oct 24, 2025 • 707 • 98 • 7

RicardoL1u

in THU-KEG/RM-Bench 4 months ago

Many chosen rows are truncated

#3 opened 4 months ago by AlexShengzhiMeta

linny2002

updated 4 models 4 months ago

THU-KEG/LLaDA-8B-BGPO-sudoku

Reinforcement Learning • 8B • Updated Oct 14, 2025 • 1

THU-KEG/LLaDA-8B-BGPO-countdown

Reinforcement Learning • 8B • Updated Oct 14, 2025 • 3 • 1

THU-KEG/LLaDA-8B-BGPO-code

Reinforcement Learning • 8B • Updated Oct 14, 2025 • 1

THU-KEG/LLaDA-8B-BGPO-math

Reinforcement Learning • 8B • Updated Oct 14, 2025 • 1

NeoZ123

updated a collection 4 months ago

LLaDA-8B-BGPO

Boundary-Guided Policy Optimization for Memory-Efficient RL of Diffusion Large Language Models • 4 items • Updated Oct 11, 2025 • 4