LLM Benchmarks Are Broken—The Leaderboard Illusion

Posted by adijaya on May 01, 2025

In this video, I dive into the controversy surrounding the Leaderboard Illusion paper and what it reveals about systematic flaws in LLM benchmarks, especially Chatbot Arena. As someone who’s followed the evolution of these leaderboards closely, I was shocked by the extent of the data access disparities and selective reporting the paper describes. This is a wake-up call for the entire AI community.
