Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning • Paper • 2511.16043 • Published 17 days ago • 104
BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains • Paper • 2510.25409 • Published Oct 29 • 3
BhashaBench-V1 • Collection • BhashaBench-V1 is a domain-specific, multi-task, multilingual benchmark designed to evaluate large language models in India-centric contexts. • 6 items • Updated Oct 30 • 2
IndicTTS Datasets • Collection • Datasets derived from the Indic TTS Database, a special corpus of Indian languages developed by the Speech Technology Consortium at IIT Madras. • 13 items • Updated Mar 6 • 12
TradingAgents: Multi-Agents LLM Financial Trading Framework • Paper • 2412.20138 • Published Dec 28, 2024 • 14
AMO-Bench: Large Language Models Still Struggle in High School Math Competitions • Paper • 2510.26768 • Published Oct 30 • 33
Can Agent Conquer Web? Exploring the Frontiers of ChatGPT Atlas Agent in Web Games • Paper • 2510.26298 • Published Oct 30 • 45
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning • Paper • 2510.25992 • Published Oct 29 • 44
Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval and Synthesis for SLMs • Paper • 2509.24107 • Published Sep 28 • 78
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models • Paper • 2510.05034 • Published Oct 6 • 48