Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published 11 days ago • 191
Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception Paper • 2602.11858 • Published 10 days ago • 58
Stroke3D: Lifting 2D strokes into rigged 3D model via latent diffusion models Paper • 2602.09713 • Published 12 days ago • 8
ObjEmbed: Towards Universal Multimodal Object Embeddings Paper • 2602.01753 • Published 20 days ago • 5
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 21 days ago • 284
SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Article • Published Jun 3, 2025 • 14
UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders Paper • 2601.17950 • Published 27 days ago • 4
ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion Paper • 2601.16148 • Published about 1 month ago • 12
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published about 1 month ago • 90
Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance Paper • 2601.14171 • Published Jan 20 • 50
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer Paper • 2601.14250 • Published Jan 20 • 47
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published Jan 15 • 155
3AM: Segment Anything with Geometric Consistency in Videos Paper • 2601.08831 • Published Jan 13 • 34
KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions Paper • 2601.04745 • Published Jan 8 • 58
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 56
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer Paper • 2601.01425 • Published Jan 4 • 52
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published Dec 29, 2025 • 65