Anthropic Unveils Claude 3.7 Sonnet with Innovative “Extended Thinking” Mode
Anthropic has announced the launch of Claude 3.7 Sonnet, introducing a groundbreaking “extended thinking” mode. This new iteration is being hailed as the first “hybrid reasoning model,” capable of alternating between quick responses and more in-depth analysis.
The extended thinking feature is available to users with a Pro subscription, priced at $20 per month. To evaluate the effectiveness of this new mode, Business Insider conducted a comparative test between Claude 3.7, OpenAI’s ChatGPT, and xAI’s Grok, focusing on logical reasoning and creative tasks.
In a logic-based challenge involving a riddle, ChatGPT demonstrated superior performance, providing a swift and accurate response with minimal explanation. Grok 3 offered a more detailed reasoning process but took longer to arrive at the answer. Claude 3.7’s standard mode responded quickly but with some uncertainty, while its extended thinking mode explored multiple possibilities, taking more time in the process.
Anthropic acknowledges that the extended thinking mode may occasionally lead to incomplete or incorrect conclusions, highlighting the complexity of balancing depth and accuracy in AI reasoning.
For the creativity test, the AI models were tasked with composing a poem about AI sentience. ChatGPT quickly produced a poem that lacked focus, while Grok 3 created a dream-themed piece. Claude 3.7’s normal mode suggested metaphors rapidly, but its extended thinking mode showcased a more thorough approach, brainstorming various concepts before crafting a more nuanced poem.
The extended thinking mode appeared particularly beneficial for creative endeavors, allowing for deeper exploration and refinement of ideas. However, for logical reasoning tasks, the feature sometimes proved to be a hindrance, with ChatGPT maintaining an edge in both speed and accuracy.
Anthropic emphasizes that the extended thinking mode is designed for complex real-world challenges, such as intricate coding problems, where comprehensive analysis can be advantageous. Developers have the option to adjust the “thinking budget” to optimize the balance between speed, cost, and output quality.
According to Anthropic, Claude 3.7 Sonnet outperforms competitors in certain benchmarks, particularly in software engineering tasks as measured by the SWE benchmark. As AI technology continues to evolve, the introduction of features like extended thinking mode represents a significant step towards more versatile and capable AI assistants.