AI Model Leaderboard 2026

LMArena Elo scores and benchmark data for the top AI assistants. Raw numbers, updated weekly.

Last verified: June 10, 2026 · Methodology · Disclosure

📅 Last checked: June 10, 2026. Claude Fable 5 and Mythos 5 launched June 9, and the rankings have been updated to reflect the launch. Live Elo scores at LMArena →

📊 Data source: LMArena Elo (blind pairwise votes from real users) plus public benchmark scores (MMLU, GPQA, HumanEval). Elo is updated as new votes are collected. View the live leaderboard at LMArena →

Looking for a recommendation? This page shows benchmark scores, not opinions. For editorial picks matched to your use case, see Best AI Tools.

LMArena Elo Rankings

Ranked by LMArena Elo score as of June 10, 2026. Elo is computed from millions of blind head-to-head votes where users pick the better response without knowing which model produced it.

#	Model	Provider	Pricing	Tags	Elo
1	Claude (Fable 5 / Mythos 5)	Anthropic	Free · Pro $20/mo	Writing, Reasoning, Long docs, Coding, Analysis	1,432
2	ChatGPT (GPT-5)	OpenAI	Free · Plus $20/mo	Voice, Image gen, Plugins, Web search, Coding	1,408
3	Gemini (3.1 Pro)	Google	Free (2.0 Flash) · Advanced via Google One	Google Workspace, Web search, Multimodal, Long context	1,374
4	Perplexity (Sonar Pro)	Perplexity AI	Free · Pro $17/mo (annual)	Web research, Citations, Multi-model, Speed	1,298
5	Microsoft Copilot	Microsoft	Free · Copilot Pro $20/mo · Included in M365	Microsoft 365, Word / Excel, Web search, No setup	1,241

Elo scores are approximate, rounded to the nearest whole number, and change daily as new votes are collected. Source: lmarena.ai/leaderboard.

Capability by task

How the five models compare across common tasks, based on benchmark data and independent testing.

Task	Claude	ChatGPT	Gemini	Perplexity	Copilot
Writing & editing	★★★	★★★	★★	★★	★★
Complex reasoning	★★★	★★★	★★	★★	★★
Long documents / PDFs	★★★	★★	★★★	★★	★★
Web research & citations	★★	★★	★★★	★★★	★★
Voice conversations	—	★★★	★★★	—	★★
Image generation	—	★★★	★★★	—	★★
Coding assistance	★★★	★★★	★★	—	★★
Google Workspace	—	—	★★★	—	—
Microsoft 365 / Office	—	—	—	—	★★★
Free tier quality	★★	★★	★★★	★★	★★★

★★★ Best in class · ★★ Capable · — Not a primary strength

What is LMArena Elo?

LMArena (formerly LMSYS Chatbot Arena) runs a continuous blind tournament. Users see two anonymous AI responses to the same prompt and vote for whichever they prefer. The Elo score is calculated from millions of these pairwise outcomes using the same rating system as chess: a model gains more points for beating a higher-rated opponent than a lower-rated one, and loses more points when a lower-rated model beats it.
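
As a rough illustration of the chess-style update described above, here is a minimal sketch of the textbook Elo update for a single vote. The K-factor of 32, the 400-point scale, and the function names are assumptions chosen for illustration; this page does not specify LMArena's exact computation, which may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one head-to-head vote.

    k is an assumed step size; larger values make ratings move faster.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating pool is unchanged.
    return rating_a + delta, rating_b - delta


# An upset (the lower-rated model wins) moves ratings more than an expected win.
upset = update_elo(1300.0, 1400.0, a_won=True)
expected = update_elo(1400.0, 1300.0, a_won=True)
print("upset gain:", upset[0] - 1300.0)
print("expected-win gain:", expected[0] - 1400.0)
```

In this sketch the winner's gain shrinks as its pre-vote advantage grows, which is why beating a stronger opponent is worth more than beating a weaker one.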

Elo is one of the most reliable public measures of model quality because it captures real human preference across a huge variety of prompts and use cases - not a narrow academic benchmark. Scores fluctuate daily as new votes arrive. A gap of 10 points is small; a gap of 50+ points is meaningful.
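
To make those gap sizes concrete, the sketch below converts an Elo gap into an expected head-to-head preference rate. It assumes the standard Elo logistic formula with a 400-point scale, which is a common way to read such scores rather than a statement of LMArena's exact method; the function name is illustrative.

```python
def preference_rate(elo_gap: float) -> float:
    """Expected share of votes won by the higher-rated model, given its Elo lead."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400.0))


for gap in (10, 50):
    print(f"{gap}-point gap: {preference_rate(gap):.1%} expected preference rate")
```

Under that assumption, a 10-point lead corresponds to the higher-rated model being preferred roughly 51% of the time, while a 50-point lead corresponds to roughly 57%.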

View live scores at lmarena.ai →

Frequently asked questions

What is LMArena Elo and how is it calculated?

LMArena collects blind head-to-head votes from real users who compare two anonymous model responses side-by-side. The Elo score is computed from these pairwise outcomes. A higher Elo means users consistently preferred that model in blind tests. It is one of the most reliable public measures because it captures real human preference, not narrow test suites.

Why is Claude ranked first?

Claude Fable 5 and Mythos 5, both launched June 9, 2026, achieved the highest Elo scores on LMArena as of the June 10 update. In blind pairwise voting, users consistently preferred Claude's responses for writing quality, nuance, and reasoning. Rankings change frequently as new models are released, so check LMArena for the latest.

How often is this page updated?

We update every Monday with the latest LMArena Elo scores and any significant benchmark changes from the prior week. The verified date at the top of the page shows the last update.

Where should I go to pick an AI tool?

This page shows raw benchmark data. For editorial picks matched to your specific use case (writing, research, coding, Microsoft 365), see Best AI Tools. If you are a beginner, Best AI Tools for Beginners is a better starting point.
