Anthropic unveils Claude 3, claiming to outperform GPT-4 on benchmarks.

San Francisco-based artificial intelligence (AI) startup Anthropic has unveiled its latest large language model (LLM), Claude 3. The release comes in three variants, Haiku, Sonnet, and Opus, corresponding to small, medium, and large sizes. Claude 3 Opus is the most capable of the family and the first model in the industry to claim a lead over OpenAI’s GPT-4 across a wide range of benchmarks. GPT-4 has been the gold standard against which AI companies gauge their LLM performance.

The claim matters because, until now, organizations have typically used words such as ‘approaching’ or ‘nearly’ to describe how their technology measures up against GPT-4. Claude 3 Opus, by contrast, is presented as surpassing GPT-4 on benchmarks. That said, higher scores have been recorded for GPT-4 Turbo. Nevertheless, Claude 3 Opus is seen as significant because it is said to represent ‘higher intelligence than any other model available’.

Although the Claude 3 Opus API costs more than GPT-4 Turbo, the smaller Sonnet and Haiku models offer good value for their performance. Claude 3 can be tried for free through Anthropic’s Claude.ai chatbot, subject to server availability given high demand.

Notably, Claude 3 Opus cannot generate images, but it accepts images as input and its vision capabilities are impressive: it is competent at analyzing photos, charts, graphs, and technical diagrams. Moreover, Anthropic says the Claude 3 models are capable of accepting inputs of more than 1 million tokens, and it reports near-perfect recall of information from long inputs, with accuracy above 99%.

Anthropic advocates ‘Constitutional AI’, an approach it regards as making its models safer and more transparent. This emphasis on safety previously led Claude 2 to refuse harmless prompts; with a more nuanced understanding of prompts, Claude 3 strikes a better balance when deciding whether to respond. Claude 3 is also more accurate and hallucinates less than Claude 2.1.

The company disputes pessimistic predictions of an AI winter, asserting that “model intelligence is nowhere near its limits.” It plans to upgrade Claude 3 with advanced agentic capabilities, including Tool Use and interactive coding (REPL).

Although high pricing may initially limit adoption of Claude 3 Opus to niche research or professional applications, the cost-effectiveness and performance of Sonnet and Haiku are expected to drive broader adoption. The release puts pressure on OpenAI and may hint at a price cut or even an imminent GPT-5 announcement.
