Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 4 days ago • 150
LongCodeZip: Compress Long Context for Code Language Models Paper • 2510.00446 • Published Oct 1 • 108
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs Paper • 2509.22220 • Published Sep 26 • 65
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation Paper • 2509.16198 • Published Sep 19 • 127
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8 • 63
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19 • 118
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13 • 191
Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion Paper • 2501.18804 • Published Jan 30 • 5
SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model Paper • 2501.18636 • Published Jan 28 • 32
PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models Paper • 2502.01584 • Published Feb 3 • 9
Bringing Objects to Life: 4D generation from 3D objects Paper • 2412.20422 • Published Dec 29, 2024 • 40
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks Paper • 2501.11733 • Published Jan 20 • 28
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems Paper • 2501.11067 • Published Jan 19 • 13
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback Paper • 2501.10799 • Published Jan 18 • 15
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques Paper • 2501.14492 • Published Jan 24 • 32
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation Paper • 2501.15907 • Published Jan 27 • 17
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published Jan 10 • 52
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning Paper • 2501.04698 • Published Jan 8 • 15
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning Paper • 2412.11974 • Published Dec 16, 2024 • 10