Document Arena

View overall rankings across AI models in document analysis and long-content reasoning.

Mar 8, 2026
39,396 votes
12 models
Rank  Rank Spread  Organization  License      Score    Votes
1     1-1          Anthropic     Proprietary  1526±12  4,040
2     2-3          Anthropic     Proprietary  1490±16  1,411
3     2-4          Anthropic     Proprietary  1473±11  5,819
4     3-6          Google        Proprietary  1457±10  3,244
5     4-7          Anthropic     Proprietary  1450±11  6,033
6     4-7          Google        Proprietary  1447±9   8,434
7     5-11         Anthropic     Proprietary  1429±12  5,313
8     7-10         Google        Proprietary  1428±9   6,291
9     7-12         Google        Proprietary  1422±9   7,293
10    7-12         OpenAI        Proprietary  1412±9   5,379
11    8-12         OpenAI        Proprietary  1409±9   6,509
12    9-12         OpenAI        Proprietary  1408±9   8,104
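
The page does not say how the Rank Spread values are derived. One reading that fits the printed numbers is an interval-overlap rule: a model's best rank counts the models whose whole score interval lies above its own, and its worst rank counts the models whose interval reaches above its lower bound. The sketch below is an assumption, not the site's documented method; `ROWS` and `rank_spread` are illustrative names, and the (score, margin) pairs are copied from the leaderboard in rank order.

```python
# (score, margin) pairs copied from the leaderboard, in rank order.
ROWS = [
    (1526, 12), (1490, 16), (1473, 11), (1457, 10),
    (1450, 11), (1447, 9), (1429, 12), (1428, 9),
    (1422, 9), (1412, 9), (1409, 9), (1408, 9),
]


def rank_spread(rows, i):
    """Best and worst plausible rank for row i (assumed interval-overlap rule)."""
    score, margin = rows[i]
    low, high = score - margin, score + margin
    others = [r for j, r in enumerate(rows) if j != i]
    # Models whose entire interval sits above ours must rank ahead of us.
    best = 1 + sum(1 for s, m in others if s - m > high)
    # Models whose interval reaches above our lower bound could rank ahead of us.
    worst = 1 + sum(1 for s, m in others if s + m > low)
    return best, worst


if __name__ == "__main__":
    for i in range(len(ROWS)):
        print(i + 1, rank_spread(ROWS, i))
```

Applied to the rounded figures, this rule reproduces the printed spread for the first eleven rows (for example 2 to 3 for the second row and 5 to 11 for the seventh). The last row comes out as 8 to 12 rather than 9 to 12, because its upper bound and the seventh row's lower bound both round to 1417; the unrounded scores presumably break that tie.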

Remove Style Control Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles
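
As a rough illustration of what this plot summarizes, the sketch below computes, for each ordered pair of models, the share of non-tied battles won by the first model. The battle log, the model labels `m1` to `m3`, and the function name `win_fractions` are all hypothetical; the page does not publish its raw battle format.

```python
from collections import Counter

# Hypothetical battle log: (model_a, model_b, outcome), outcome in {"a", "b", "tie"}.
BATTLES = [
    ("m1", "m2", "a"), ("m1", "m2", "a"), ("m1", "m2", "b"),
    ("m2", "m1", "b"), ("m1", "m3", "tie"), ("m3", "m2", "b"),
]


def win_fractions(battles):
    """Map (x, y) to the fraction of non-tied x-vs-y battles won by x."""
    wins = Counter()  # (winner, loser) -> number of non-tied wins
    for a, b, outcome in battles:
        if outcome == "a":
            wins[(a, b)] += 1
        elif outcome == "b":
            wins[(b, a)] += 1
        # Ties are skipped, matching the "non-tied" wording of the plot title.
    fractions = {}
    for x, y in list(wins):
        total = wins[(x, y)] + wins[(y, x)]
        fractions[(x, y)] = wins[(x, y)] / total
        fractions[(y, x)] = wins[(y, x)] / total
    return fractions


if __name__ == "__main__":
    for pair, frac in sorted(win_fractions(BATTLES).items()):
        print(pair, frac)
```

Pairs that only ever tied, such as `m1` and `m3` here, get no entry, since there is no non-tied battle to take a fraction of.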

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
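
One way to approximate this quantity from the published scores is to assume they sit on an Elo-style logistic scale (base 10, scale 400) and to average each model's expected win probability over all other models with equal weight. The page does not state its rating scale, so the formula, the `scale` parameter, and the function names below are assumptions made for illustration only.

```python
# Scores copied from the leaderboard, in rank order.
SCORES = [1526, 1490, 1473, 1457, 1450, 1447, 1429, 1428, 1422, 1412, 1409, 1408]


def expected_win(r_a, r_b, scale=400.0):
    """Expected win probability of A over B on an assumed Elo-style scale, ignoring ties."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def average_win_rate(scores, i):
    """Mean expected win probability of model i against every other model, uniformly weighted."""
    others = [s for j, s in enumerate(scores) if j != i]
    return sum(expected_win(scores[i], s) for s in others) / len(others)


if __name__ == "__main__":
    for rank, _ in enumerate(SCORES, start=1):
        print(rank, round(average_win_rate(SCORES, rank - 1), 3))
```

Uniform weighting means every opponent counts once, regardless of how many battles were actually played against it.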

Confidence Intervals on Model Strength (via Bootstrapping)
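
The general idea of a bootstrap interval can be sketched as follows: resample the observed battles with replacement many times, recompute the statistic on each resample, and read off percentiles. This is a minimal percentile bootstrap under stated assumptions: it uses a plain win fraction as a stand-in for "model strength" (the leaderboard presumably refits a rating model on each resample), and the data, `bootstrap_ci`, and its parameters are hypothetical.

```python
import random


def bootstrap_ci(outcomes, rounds=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of a list of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    stats = []
    for _ in range(rounds):
        # Resample n battles with replacement and recompute the statistic.
        sample = [outcomes[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    lo = stats[round((alpha / 2) * (rounds - 1))]
    hi = stats[round((1 - alpha / 2) * (rounds - 1))]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical record: 70 wins and 30 losses in non-tied battles.
    record = [1] * 70 + [0] * 30
    print(bootstrap_ci(record))
```

More battles narrow the interval, which is consistent with the widest margin in the table (±16) belonging to the model with the fewest votes (1,411).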

Battle Count for Each Combination of Models (without Ties)
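
A count of this kind can be tallied per unordered pair of models, skipping ties. The sketch below is illustrative only: the battle log, the labels `m1` to `m3`, and `battle_counts` are hypothetical, since the page does not expose its raw data.

```python
from collections import Counter

# Hypothetical battle log: (model_a, model_b, outcome), outcome in {"a", "b", "tie"}.
BATTLES = [
    ("m1", "m2", "a"), ("m1", "m2", "a"), ("m1", "m2", "b"),
    ("m2", "m1", "b"), ("m1", "m3", "tie"), ("m3", "m2", "b"),
]


def battle_counts(battles):
    """Count non-tied battles per unordered pair of models."""
    counts = Counter()
    for a, b, outcome in battles:
        if outcome == "tie":
            continue
        # frozenset makes (m1, m2) and (m2, m1) land in the same cell.
        counts[frozenset((a, b))] += 1
    return counts


if __name__ == "__main__":
    for pair, n in battle_counts(BATTLES).items():
        print(sorted(pair), n)
```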