Zeng

AuroraZengfh

AI & ML interests

None yet

Recent Activity

upvoted a paper 13 days ago

LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence

authored a paper 13 days ago

Does Seeing More Mean Knowing More? Mono-Anchored Advantage Normalization for Multi-Source Visual Reasoning

upvoted a paper 13 days ago

Does Seeing More Mean Knowing More? Mono-Anchored Advantage Normalization for Multi-Source Visual Reasoning

Organizations

upvoted 2 papers 13 days ago

LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence

Paper • 2605.25979 • Published 16 days ago • 27

Does Seeing More Mean Knowing More? Mono-Anchored Advantage Normalization for Multi-Source Visual Reasoning

Paper • 2605.25437 • Published 16 days ago • 16

upvoted 3 papers 20 days ago

CL-VISTA: Benchmarking Continual Learning in Video Large Language Models

Paper • 2604.00677 • Published Apr 1 • 1

Pelican-Unified 1.0: A Unified Embodied Intelligence Model for Understanding, Reasoning, Imagination and Action

Paper • 2605.15153 • Published 27 days ago • 1

Fine-Grained Post-Training Quantization for Large Vision Language Models with Quantization-Aware Integrated Gradients

Paper • 2603.17809 • Published Mar 18 • 1

upvoted a collection 23 days ago

merging

Collection

24 items • Updated Nov 23, 2025 • 4

upvoted 2 papers 3 months ago

Imagination Helps Visual Reasoning, But Not Yet in Latent Space

Paper • 2602.22766 • Published Feb 26 • 44

HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation

Paper • 2602.18283 • Published Feb 20 • 57

upvoted 7 papers 5 months ago

MLLM-CL: Continual Learning for Multimodal Large Language Models

Paper • 2506.05453 • Published Jun 5, 2025 • 4

Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection

Paper • 2409.04796 • Published Sep 7, 2024 • 1

ModalPrompt: Towards Efficient Multimodal Continual Instruction Tuning with Dual-Modality Guided Prompt

Paper • 2410.05849 • Published Oct 8, 2024 • 1

HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model

Paper • 2503.12941 • Published Mar 17, 2025 • 1

MambaIC: State Space Models for High-Performance Learned Image Compression

Paper • 2503.12461 • Published Mar 16, 2025 • 2

Urban Socio-Semantic Segmentation with Vision-Language Reasoning

Paper • 2601.10477 • Published Jan 15 • 155

Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality

Paper • 2505.18227 • Published May 23, 2025 • 15

upvoted 2 papers 6 months ago

VTCBench: Can Vision-Language Models Understand Long Context with Vision-Text Compression?

Paper • 2512.15649 • Published Dec 17, 2025 • 7

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Paper • 2512.07783 • Published Dec 8, 2025 • 40

upvoted 2 papers 8 months ago

MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark

Paper • 2508.07307 • Published Aug 10, 2025 • 1

Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation

Paper • 2502.17159 • Published Feb 24, 2025 • 2
