When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation Paper • 2510.07238 • Published Oct 8, 2025 • 14
BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses Paper • 2510.00232 • Published Sep 30, 2025 • 15
WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning Paper • 2509.04744 • Published Sep 5, 2025 • 11
BiasEdit: Debiasing Stereotyped Language Models via Model Editing Paper • 2503.08588 • Published Mar 11, 2025 • 7
Benchmarking Chinese Knowledge Rectification in Large Language Models Paper • 2409.05806 • Published Sep 9, 2024 • 15
Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View Paper • 2310.02124 • Published Oct 3, 2023 • 1
A Comprehensive Study of Knowledge Editing for Large Language Models Paper • 2401.01286 • Published Jan 2, 2024 • 21
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction? Paper • 2305.01555 • Published May 2, 2023
Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study Paper • 2210.10678 • Published Oct 19, 2022
DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population Paper • 2201.03335 • Published Jan 10, 2022 • 1