Mu Cai

mucai

https://pages.cs.wisc.edu/~mucai/

AI & ML interests

Computer Vision, Deep Learning, 3D Vision, Vision and Language

Recent Activity

commented on a paper about 1 month ago

Contamination Detection for VLMs using Multi-Modal Semantic Perturbation

upvoted a paper about 1 month ago

Contamination Detection for VLMs using Multi-Modal Semantic Perturbation

Paper • 2511.03774 • Published Nov 5 • 12

upvoted a paper 4 months ago

When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios

Paper • 2507.20198 • Published Jul 27 • 26

upvoted a paper 5 months ago

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7 • 64

upvoted a paper 10 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123

upvoted a paper about 1 year ago

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos

Paper • 2410.02763 • Published Oct 3, 2024 • 7

upvoted 3 papers over 1 year ago

Matryoshka Multimodal Models

Paper • 2405.17430 • Published May 27, 2024 • 34

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4, 2024 • 39

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Paper • 2406.20095 • Published Jun 28, 2024 • 18

upvoted a collection over 1 year ago

Matryoshka Multimodal Models

3 items • Updated Aug 4, 2024 • 3

upvoted a paper over 1 year ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 259