Official model collection for the paper "TokenPacker: Efficient Visual Projector for Multimodal LLM"
LI WENTONG
sunshine-lwt
AI & ML interests
Computer Vision, Multimodal AI
Recent Activity
authored a paper 18 days ago
Scalable Autoregressive Monocular Depth Estimation
authored a paper 18 days ago
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
authored a paper 18 days ago
Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction