Model Battles

Unbiased, data-driven comparisons of the world's most powerful AI models. See which model wins at coding, reasoning, and creative writing.

GPT-4o vs Claude 3.5 Sonnet

In-depth comparison of [GPT-4o](/lab?model=openai/gpt-4o) and [Claude 3.5 Sonnet](/lab?model=anthropic/claude-3.5-sonnet) based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

GPT-4o vs Gemini 1.5 Pro

In-depth comparison of [GPT-4o](/lab?model=openai/gpt-4o) and [Gemini](/lab?model=google/gemini-flash-1.5) 1.5 Pro based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

GPT-4o vs Llama 3.1 405B

In-depth comparison of [GPT-4o](/lab?model=openai/gpt-4o) and Llama 3.1 405B based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

GPT-4o vs DeepSeek V2.5

In-depth comparison of [GPT-4o](/lab?model=openai/gpt-4o) and [DeepSeek V2.5](/lab?model=deepseek/deepseek-chat) based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

Claude 3.5 Sonnet vs Gemini 1.5 Pro

In-depth comparison of [Claude 3.5 Sonnet](/lab?model=anthropic/claude-3.5-sonnet) and [Gemini](/lab?model=google/gemini-flash-1.5) 1.5 Pro based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

Claude 3.5 Sonnet vs Llama 3.1 405B

In-depth comparison of [Claude 3.5 Sonnet](/lab?model=anthropic/claude-3.5-sonnet) and Llama 3.1 405B based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

Claude 3.5 Sonnet vs DeepSeek V2.5

In-depth comparison of [Claude 3.5 Sonnet](/lab?model=anthropic/claude-3.5-sonnet) and [DeepSeek V2.5](/lab?model=deepseek/deepseek-chat) based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

Gemini 1.5 Pro vs Llama 3.1 405B

In-depth comparison of [Gemini](/lab?model=google/gemini-flash-1.5) 1.5 Pro and Llama 3.1 405B based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

Gemini 1.5 Pro vs DeepSeek V2.5

In-depth comparison of [Gemini](/lab?model=google/gemini-flash-1.5) 1.5 Pro and [DeepSeek V2.5](/lab?model=deepseek/deepseek-chat) based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.

Llama 3.1 405B vs DeepSeek V2.5

In-depth comparison of Llama 3.1 405B and [DeepSeek V2.5](/lab?model=deepseek/deepseek-chat) based on real-world benchmarks (HumanEval, LMSYS). Find out which one wins for coding, writing, and daily use.
