Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison
Google just launched Gemini 2.5 Pro on March 26th, claiming it is the best at coding, reasoning, and just about everything else. But I
Deepseek v3 0324 vs. Claude 3.7 Sonnet: Coding Comparison
Deepseek has quietly released a major update to the Deepseek v3 base model. Surprisingly, it went under the radar amid the
Deepseek v3 0324: Finally, the Sonnet 3.5 at Home
Deepseek v3 0324, a new checkpoint, has been quietly released by Deepseek, with no marketing or hype, just a tweet, a
CoT Reasoning Models – Which One Reigns Supreme in 2025?
A comprehensive analysis of o3-mini-high vs. Claude 3.7 Sonnet Thinking vs. Grok 3 Think vs. Deepseek R1 on multiple reasoning, math,
OpenAI GPT-4.5 vs. Claude 3.7 Sonnet
After so long, OpenAI finally unveiled GPT-4.5, its biggest-ever base model. The initial vibe checks from taste testers have been outstanding. The
Claude 3.7 Sonnet thinking vs. Deepseek r1
So, Anthropic finally broke the silence and released Claude 3.7 Sonnet, a hybrid model that can think step-by-step like a thinking model
Claude 3.7 Sonnet vs. Grok 3 vs. o3-mini-high
Just a week after Grok’s release, we now have Claude 3.7 Sonnet, which has certainly eaten into Grok’s hype. Grok was definitely
Notes on the new Deepseek v3
Deepseek released their flagship model, v3, a 671B mixture-of-experts model with 37B active parameters. Currently, it is the best open-source model, beating
Gemini 2.0 Flash vs. OpenAI o1 and Claude 3.5 Sonnet
Google has finally woken up and decided to drop the bombshell Gemini 2.0, completing the AI trifecta. Google has launched two new
OpenAI o1 vs Claude 3.5 Sonnet: Which One’s Really Worth Your $20?
It’s been a week since OpenAI o1 was out of preview, and along with that, OpenAI has also introduced a new tier,
Notes on Anthropic’s Computer Use Ability
Anthropic has updated its Haiku and Sonnet lineup. Now, we have Haiku 3.5—a smaller model that outperforms Opus 3, the former state-of-the-art—and
Function Calling Optimizations (GPT4 vs Opus vs Haiku vs Sonnet)
Code: https://github.com/SamparkAI/Composio-Function-Calling-Benchmark/. New: Check out the updated model scores with GPT-4o. In the last blog, we introduced the ClickUp function calling benchmark and experimented