Haitao Mi
haitaominlp
AI & ML interests
Large Language Models
Recent Activity
upvoted a collection 3 days ago: Olmo 3
upvoted a paper about 2 months ago: Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning