萧诗雨

liangziyu1

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

upvoted a paper 4 days ago

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

upvoted a paper 5 days ago

MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

Organizations

None yet

upvoted 2 papers 4 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 7 days ago • 201

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

Paper • 2605.22109 • Published 6 days ago • 165

upvoted a paper 5 days ago

MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

Paper • 2605.19330 • Published 8 days ago • 8

upvoted a paper 6 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 15 days ago • 191

upvoted 2 papers 9 days ago

Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation

Paper • 2605.11739 • Published 14 days ago • 58

AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive

Paper • 2605.11518 • Published 15 days ago • 4

upvoted a paper 13 days ago

MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning

Paper • 2605.13037 • Published 14 days ago • 8

upvoted a paper 16 days ago

Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation

Paper • 2605.01284 • Published 25 days ago • 3

upvoted a paper 26 days ago

Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

Paper • 2604.26779 • Published 28 days ago • 13

upvoted a paper about 1 month ago

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 242

upvoted 4 papers about 2 months ago

Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment

Paper • 2604.05684 • Published Apr 7 • 9

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 630

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published Mar 30 • 342

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 351