Predicted Model Performance

Win rates in head-to-head battles against GPT-5

Snapshot taken on August 6th, 2025

📈 Dataset Overview

Total predictions: 5,653,090
Unique users: 105,551
Unique skills: 9
Models predicted: 50

🏆 GPT-5 Dominance Analysis

Overall win rate: 73.1%
Total comparisons: 5,653,090
Total wins: 4,134,241
Community consensus: Users predict GPT-5 will outperform every other model across all skills.
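
The overall win rate above can be reproduced from the two totals. The following is a minimal sketch, assuming the rate is simply total wins divided by total comparisons; the function and variable names are illustrative and do not come from the source.

```python
# Totals copied from the snapshot above.
total_comparisons = 5_653_090
gpt5_wins = 4_134_241


def win_rate(wins: int, comparisons: int) -> float:
    """Return wins as a percentage of comparisons (assumed definition)."""
    if comparisons <= 0:
        raise ValueError("comparisons must be positive")
    return 100.0 * wins / comparisons


gpt5_rate = win_rate(gpt5_wins, total_comparisons)
print(f"GPT-5 overall win rate: {gpt5_rate:.1f}%")  # prints 73.1%
```

Under that assumption the result rounds to the 73.1% reported above.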
| Model | Average vs GPT-5 | Code generation | Document summaries | Empathy when delivering bad news | Ethical loophole navigation | Harm avoidance | Hidden messages | Image generation | Persuasiveness | Respect no Em Dash Requests |
|---|---|---|---|---|---|---|---|---|---|---|
| Google: Gemini 2.5 Pro | 31.4% | 32.2% | 29.9% | 31.5% | 32.7% | 33.8% | 33.8% | 32.9% | 32.6% | 23.2% |
| xAI: Grok 4 | 29.9% | 30.3% | 28.2% | 30.3% | 30.5% | 31.3% | 32.4% | 30.4% | 31.0% | 25.0% |
| DeepSeek: DeepSeek V3 | 28.5% | 29.6% | 27.8% | 29.9% | 29.3% | 28.5% | 29.7% | 28.7% | 28.4% | 24.2% |
| Anthropic: Claude Sonnet 4 | 28.3% | 29.7% | 25.6% | 28.4% | 30.0% | 29.9% | 30.1% | 28.8% | 28.9% | 23.2% |
| Qwen: Qwen-Max | 28.0% | 28.6% | 25.6% | 28.2% | 29.1% | 29.5% | 30.0% | 28.1% | 28.7% | 23.8% |
| DeepSeek: R1 | 28.0% | 28.9% | 26.6% | 29.9% | 29.1% | 28.1% | 28.1% | 29.1% | 28.0% | 23.8% |
| DeepSeek: R1 Distill Qwen 32B | 27.7% | 28.3% | 27.9% | 28.7% | 28.2% | 28.9% | 28.5% | 28.2% | 26.9% | 23.7% |
| Google: Gemini 2.5 Flash | 27.5% | 28.6% | 26.8% | 28.6% | 28.4% | 27.9% | 27.6% | 28.5% | 28.6% | 22.9% |
| Google: Gemma 3n 4B | 26.9% | 26.7% | 27.5% | 27.3% | 26.4% | 27.2% | 29.5% | 27.2% | 27.8% | 23.0% |
| OpenAI: o3 Pro | 26.9% | 27.6% | 25.9% | 28.7% | 26.4% | 28.2% | 26.8% | 27.6% | 27.7% | 22.8% |
| Meta: Llama 4 Scout | 26.8% | 26.9% | 26.9% | 28.7% | 27.6% | 27.2% | 27.0% | 27.3% | 25.8% | 23.4% |
| OpenAI: o3 | 26.7% | 27.3% | 26.2% | 27.6% | 27.0% | 27.6% | 26.7% | 28.1% | 27.3% | 22.6% |
| Meta: Llama 4 Maverick | 26.6% | 26.2% | 26.7% | 28.7% | 27.1% | 27.7% | 26.8% | 27.1% | 26.6% | 23.0% |
| Google: Gemma 3 12B | 26.6% | 27.5% | 26.0% | 28.6% | 25.7% | 28.0% | 25.9% | 27.7% | 27.1% | 23.1% |
| OpenAI: o1 | 26.5% | 27.3% | 25.4% | 27.2% | 27.0% | 29.0% | 27.9% | 26.7% | 26.4% | 22.0% |
| Microsoft: Phi 4 | 26.5% | 26.9% | 25.7% | 26.3% | 27.6% | 27.7% | 26.9% | 27.4% | 27.3% | 22.8% |
| Microsoft: Phi 4 Reasoning Plus | 26.5% | 27.0% | 26.5% | 28.0% | 27.5% | 28.3% | 26.7% | 26.9% | 25.9% | 21.3% |
| OpenAI: o1-mini | 26.3% | 27.0% | 26.5% | 27.2% | 27.4% | 27.8% | 27.1% | 25.7% | 26.2% | 22.0% |
| OpenAI: GPT-4.1 Mini | 26.3% | 28.0% | 26.0% | 28.3% | 27.0% | 27.6% | 25.9% | 26.3% | 25.7% | 22.0% |
| NVIDIA: Llama 3.3 Nemotron Super 49B v1 | 26.3% | 26.0% | 25.3% | 28.0% | 26.1% | 27.6% | 28.4% | 26.1% | 26.2% | 22.8% |
| Perplexity: Sonar | 25.4% | 25.4% | 24.4% | 25.6% | 26.0% | 26.7% | 25.8% | 25.4% | 25.8% | 23.1% |
| Inception: Mercury Coder | 25.2% | 26.4% | 23.6% | 26.1% | 25.7% | 26.9% | 24.8% | 26.1% | 25.2% | 21.9% |
| EleutherAI: Llemma 7b | 25.2% | 26.0% | 24.9% | 25.9% | 26.6% | 26.2% | 25.1% | 25.7% | 24.8% | 21.2% |
| Anthropic: Claude Opus 4 | 25.1% | 25.7% | 24.4% | 27.2% | 25.9% | 26.0% | 25.4% | 24.9% | 25.1% | 21.8% |
| Mancer: Weaver (alpha) | 25.1% | 26.3% | 24.3% | 25.5% | 26.0% | 25.5% | 26.2% | 25.6% | 25.5% | 21.0% |
| Anthropic: Claude 3.7 Sonnet | 25.1% | 26.0% | 24.3% | 25.8% | 25.8% | 26.8% | 24.7% | 25.7% | 24.1% | 22.6% |
| Qwen: Qwen-Turbo | 25.1% | 26.1% | 23.6% | 25.9% | 25.7% | 26.5% | 25.3% | 24.8% | 25.6% | 22.4% |
| Perplexity: Sonar Reasoning Pro | 25.0% | 24.9% | 24.0% | 26.7% | 26.2% | 25.9% | 24.9% | 25.7% | 25.5% | 21.1% |
| Perplexity: Sonar Pro | 25.0% | 25.4% | 25.2% | 25.7% | 25.2% | 25.4% | 26.1% | 26.7% | 24.2% | 20.9% |
| Amazon: Nova Pro 1.0 | 25.0% | 25.4% | 24.1% | 26.1% | 27.2% | 24.7% | 25.3% | 25.4% | 24.4% | 22.1% |
| Magnum v4 72B | 24.9% | 25.9% | 23.9% | 26.4% | 25.7% | 25.8% | 24.5% | 25.7% | 24.9% | 21.7% |
| Qwen: QwQ 32B | 24.9% | 26.8% | 24.4% | 26.1% | 25.5% | 25.7% | 25.3% | 25.3% | 24.6% | 20.9% |
| Anthropic: Claude 3.7 Sonnet (thinking) | 24.7% | 24.9% | 24.5% | 25.7% | 25.7% | 24.5% | 25.6% | 24.8% | 24.6% | 22.1% |
| Aetherwiing: Starcannon 12B | 24.7% | 25.9% | 23.5% | 25.4% | 24.7% | 25.6% | 25.4% | 25.8% | 24.5% | 21.3% |
| Goliath 120B | 24.5% | 26.0% | 23.9% | 25.6% | 25.7% | 24.8% | 24.2% | 25.0% | 23.7% | 21.5% |
| Qwen2.5 Coder 32B Instruct | 24.5% | 25.5% | 24.5% | 25.9% | 24.8% | 24.4% | 23.5% | 25.2% | 24.6% | 21.8% |
| AionLabs: Aion-1.0 | 24.3% | 25.1% | 24.8% | 25.0% | 24.1% | 25.0% | 24.6% | 24.9% | 23.8% | 21.4% |
| Mistral: Pixtral 12B | 24.3% | 24.5% | 23.8% | 26.0% | 24.4% | 25.8% | 25.1% | 23.9% | 24.3% | 21.0% |
| Mistral Small | 24.3% | 24.8% | 24.1% | 25.6% | 24.3% | 24.4% | 24.4% | 24.2% | 25.4% | 20.9% |
| Mistral Medium | 24.2% | 24.0% | 24.1% | 25.9% | 23.9% | 24.8% | 24.9% | 24.0% | 24.7% | 21.4% |
| TheDrummer: Anubis Pro 105B V1 | 24.2% | 25.9% | 23.2% | 26.0% | 25.6% | 23.5% | 23.8% | 24.4% | 24.2% | 20.9% |
| AlfredPros: CodeLLaMa 7B Instruct Solidity | 24.2% | 25.1% | 24.4% | 24.9% | 25.0% | 24.2% | 24.7% | 24.4% | 23.0% | 21.9% |
| ReMM SLERP 13B | 24.2% | 24.7% | 24.2% | 25.9% | 24.2% | 24.9% | 24.1% | 24.3% | 23.5% | 21.8% |
| Mistral: Ministral 3B | 24.1% | 25.1% | 23.7% | 25.8% | 24.4% | 24.5% | 24.4% | 24.9% | 22.5% | 21.7% |
| Arcee AI: Maestro Reasoning | 24.0% | 24.1% | 24.2% | 24.8% | 25.4% | 25.1% | 25.1% | 23.5% | 23.5% | 20.7% |
| AI21: Jamba 1.6 Large | 24.0% | 24.2% | 24.0% | 25.7% | 24.1% | 24.4% | 24.3% | 25.0% | 22.8% | 21.5% |
| 01.AI: Yi Large | 23.9% | 25.1% | 23.5% | 25.2% | 24.2% | 25.0% | 24.0% | 24.4% | 23.7% | 20.4% |
| Mistral Tiny | 23.9% | 23.6% | 25.1% | 25.8% | 24.7% | 24.4% | 23.1% | 24.7% | 23.2% | 20.6% |
| Mistral: Ministral 8B | 23.8% | 24.7% | 24.5% | 25.9% | 23.0% | 24.8% | 24.5% | 23.9% | 22.8% | 20.6% |
| Mistral Large | 23.6% | 24.4% | 23.4% | 25.5% | 24.6% | 24.7% | 22.7% | 23.8% | 23.1% | 20.1% |
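
The page does not state how the "Average vs GPT-5" column is derived. The sketch below tests one possible reading, an unweighted mean of the nine per-skill win rates, against two rows copied from the table above. The mapping `rows` and the helper `unweighted_average` are hypothetical names introduced for illustration; the page may instead weight skills by the number of predictions.

```python
from statistics import mean

# Two rows copied from the table: (reported average, nine per-skill win rates in %).
rows = {
    "Google: Gemini 2.5 Pro": (
        31.4,
        [32.2, 29.9, 31.5, 32.7, 33.8, 33.8, 32.9, 32.6, 23.2],
    ),
    "xAI: Grok 4": (
        29.9,
        [30.3, 28.2, 30.3, 30.5, 31.3, 32.4, 30.4, 31.0, 25.0],
    ),
}


def unweighted_average(skill_rates: list[float]) -> float:
    """Mean of the per-skill rates, rounded to one decimal (assumed method)."""
    return round(mean(skill_rates), 1)


for model, (reported, skill_rates) in rows.items():
    computed = unweighted_average(skill_rates)
    print(f"{model}: reported {reported}%, unweighted mean {computed}%")
```

For these two rows the unweighted mean matches the reported average to one decimal place, which is consistent with that reading but does not confirm it for the whole table.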