Gemini 2.0 Flash vs OpenAI o1 and Claude 3.5 Sonnet
Google has finally woken up and decided to drop the bombshell Gemini 2.0, completing the AI triumvirate. The timing could not have been better. Just as the buzz around OpenAI’s launch days peaked, Google swooped in, stole the spotlight, and turned the playbook against OpenAI itself.
But how does it really compare to o1 and the Claude 3.5 Sonnet? After all the fanfare, I wanted to see if it lived up to the hype. So, I put all three models through their paces using my collection of reasoning problems, math challenges, coding tasks, and creative writing prompts.
So, let’s find out if Gemini 2.0 can truly compete with these AI heavyweights or if Google’s latest offering is more flash than substance.
Table of Contents
TL;DR
- • Gemini 2.0 Flash is the latest and greatest model from Google.
- • It is the first true multimodal model supporting image, video, voice inputs and voice and image outputs.
- • Regarding raw reasoning and mathematical ability, OpenAI o1 tops the chart, followed by Gemini 2.0 Flash and Claude 3.5 Sonnet.
- • In coding, Claude is king, considering speed, capability, and message cap.
- • OpenAI o1 triumphs in creative writing as well.
Notes on Gemini 2.0 Flash
Gemini 2.0 Flash is Google’s newest model, with native tool calling and multi-modal abilities. It can process images and videos and generate images and speech for the first time, including editing images using natural language. It is due for a public release early next year.
What stood out most was the model’s streaming ability, which allows it to stream audio and video in real-time.
This is truly the first true multi-modal large language model; though we don’t know its size, it is smaller than Gemini-1.5-pro. Yet, it packs the best performance from all the existing Google models.
With a massive context window and raw capability, it can easily be the default choice for developers and users. So, let’s see how good it is compared to the incumbents.
Comparing the Reasoning abilities
Reasoning is one of the most sought-after skills of LLMs. Reasoning ability correlates directly to contextual awareness, a must-have for agentic reliability. A better reasoning model will also better handle real-world use cases necessary for agentic automation.
This section tests all three models’ reasoning ability regarding textbook and real-world problems.
1. Counting words in the response
Let’s start with a small problem.
Prompt: What’s the fourth word in your response to this prompt?
Response from Gemini 2.0 Flash:
Well, I think I hit the spot on the first attempt. Gemini failed at this basic task in the zero-shot attempt. However, when I asked it to recheck, it got the right answer.
These are the responses from OpenAI o1 and Claude 3.5 Sonnet.
OpenAI o1:
Claude 3.5 Sonnet:
Of all the models, o1 was the only one that got it right in the first attempt. While Gemini 2.0 required some nudging, Claude couldn’t solve it even after nudging.
2. What is C Doing in the Room?
Prompt: Five people (A, B, C, D, and E) are in a room. A is watching TV with B, D is sleeping, B is eating chow min, and E is playing table tennis. Suddenly, a call comes on the telephone. B goes out of the room to pick up the call. What is C doing?
It’s a classic but easy reasoning question. Let’s see who nails it.
Gemini 2.0 response:
Great, Gemini 2.0 Flash was able to get it right.
OpenAI o1:
Claude 3.5 Sonnet:
All the models successfully got the correct answer, but Claude provided a much more detailed explanation—not necessary, but helpful.
3. Next move in Tic-tac-toe
This test will give the models multiple situations in a tic-tac-toe game and ask for the next move.
I created a tic-tac-toe situation and asked all these models to find the optimal position to ensure I would win. Here are the responses.
Gemini 2.0 Flash:
The Gemini’s confidence is great. I was hoping it would get it right, but it couldn’t solve the problem. Let’s see how other models respond.
OpenAI o1:
Claude 3.5 Sonnet:
Of all the models, o1 was the only one that successfully found the position that ensured victory. The reasoning behind the choice was also great.
Summary on Reasoning
In reasoning tasks, o1 is much ahead of Sonnet and Gemini. The second best model is the Gemini 2.0 flash. In terms of raw IQ for an LLM, OpenAI o1 is much superior.
Mathematics
Next is mathematics. Let’s examine the mathematical proficiency of these models in answering some of the questions no other LLM could solve.
1. 5.11-5.90 =?
I’m not sure why this is an interesting question, but many LLMs fail to correctly answer this, even if they can get the right answers for similar questions.
Here’s the answer from Gemini 2.0 Flash
This seems to be a non-issue for o1 and Sonnet.
2. Finding the possible values of the fourth vertex
This is a problem from Gilbert Strang’s linear algebra. Let’s see the responses from all these models.
Prompt: If three corners of a parallelogram are (1,1), (4,2), and (1,3), what are all the possible fourth corners?
Response from Gemini 2.0 Flash
Let the given corners be A(1,1), B(4,2), and C(1,3). Let the fourth corner be D(x,y). There are three possible ways to arrange the parallelogram using these three given vertices:
Case 1: ABCD is a parallelogram
In this case, the diagonals bisect each other. Let M be the midpoint of AC.
M = ((1+1)/2, (1+3)/2) = (1,2)
Since M is also the midpoint of BD, we have:
( (4+x)/2, (2+y)/2 ) = (1,2)
Equating the x-coordinates: (4+x)/2 = 1 => 4+x = 2 => x = -2
Equating the y-coordinates: (2+y)/2 = 2 => 2+y = 4 => y = 2
So, D = (-2, 2)
Case 2: ABDC is a parallelogram
In this case, the diagonals are AD and BC. Let M be the midpoint of AD.
M = ((1+x)/2, (1+y)/2)
Let N be the midpoint of BC.
N = ((4+1)/2, (2+3)/2) = (5/2, 5/2)
Since the midpoints are the same, we have:
(1+x)/2 = 5/2 => 1+x = 5 => x = 4
(1+y)/2 = 5/2 => 1+y = 5 => y = 4
So, D = (4,4)
Case 3: ADBC is a parallelogram
In this case, the diagonals are AB and CD. Let M be the midpoint of AB.
M = ((1+4)/2, (1+2)/2) = (5/2, 3/2)
Let N be the midpoint of CD.
N = ((1+x)/2, (3+y)/2)
Since the midpoints are the same, we have:
(1+x)/2 = 5/2 => 1+x = 5 => x = 4
(3+y)/2 = 3/2 => 3+y = 3 => y = 0
So, D = (4,0)
Therefore, the possible fourth corners are (-2, 2), (4, 4), and (4, 0).
Impressive! The Gemini was able to solve the problem successfully. Let’s see the responses from the other models.
OpenAI o1:
Claude 3.5 Sonnet:
Both o1 and Gemini solved the problem, while Sonnet could only find a single possible vertex.
3. Finding the GCD of a number series
Next, let’s see if it can find the GCD of this series set {n^99(n^60-1): n>1}.
Gemini 2.0 Flash:
Gemini was wrong this time.
OpenAI o1:
Claude 3.5 Sonnet:
Out of all the models, only o1 correctly solved the issue. Both Gemini 2.0 Flash and Claude 3.5 Sonnet confidently output wrong answers. Will a model admit something it doesn’t know instead of hallucinating or outputting wrong answers?
Summary on Mathematics
If you need a model for solving math-related problems, O1 should be your default choice. It is mature and capable of solving math tasks. Gemini is balanced, not as well as o1, but better than Sonnet.
Coding
Coding and writing are the most sought-after use cases of LLMs. Claude has established itself as the de facto king in this space. Let’s see how the new Gemini 2.0 Flash.
For this, I will use a LeetCode Hard question. This is the “Power of heroes” problem from LeetCode. The o1 and Claude had previously solved the issue; let’s see if Gemini can solve this.
You are given a 0-indexed integer array nums representing the strength of some heroes. The power of a group of heroes is defined as follows:
Let i0, i1, ... ,ik be the indices of the heroes in a group. Then, the power of this group is max(nums[i0], nums[i1], ... ,nums[ik])2 * min(nums[i0], nums[i1], ... ,nums[ik]).
Return the sum of the power of all non-empty groups of heroes possible. Since the sum could be very large, return it modulo 109 + 7.
Example 1:
Input: nums = [2,1,4]
Output: 141
Explanation:
1st group: [2] has power = 22 * 2 = 8.
2nd group: [1] has power = 12 * 1 = 1.
3rd group: [4] has power = 42 * 4 = 64.
4th group: [2,1] has power = 22 * 1 = 4.
5th group: [2,4] has power = 42 * 2 = 32.
6th group: [1,4] has power = 42 * 1 = 16.
7th group: [2,1,4] has power = 42 * 1 = 16.
The sum of powers of all groups is 8 + 1 + 64 + 4 + 32 + 16 + 16 = 141.
Example 2:
Input: nums = [1,1,1]
Output: 7
Explanation: A total of 7 groups are possible, and the power of each group will be 1. Therefore, the sum of the powers of all groups is 7.
Constraints:
1 <= nums.length <= 105
1 <= nums[i] <= 109
Gemini 2.0 Flash response
from typing import List
class Solution:
def sumOfPower(self, nums: List[int]) -> int:
nums.sort()
n = len(nums)
MOD = 10**9 + 7
total_power = 0
sum_min_power = 0
pow2 = 1
prev_pow2 = 1
for i in range(n):
num = nums[i]
total_power = (total_power + num * num % MOD * num) % MOD
total_power = (total_power + sum_min_power * num % MOD * num) % MOD
sum_min_power = (sum_min_power + prev_pow2 * num) % MOD
prev_pow2 = pow2
pow2 = (pow2 * 2) % MOD
return total_power
I expected the model to cruise through it, but it couldn’t clear all the test cases.
In comparison, these are the responses from o1 and Sonnet, solving the same questions.
OpenAI o1:
class Solution:
def sumOfPower(self, nums: List[int]) -> int:
MOD = 10**9 + 7
nums.sort()
R = 0
total_sum = 0
for i, x in enumerate(nums):
x_mod = x % MOD
x2 = (x_mod * x_mod) % MOD
x3 = (x2 * x_mod) % MOD
if i == 0:
# Only one subset: {x}
contribution = x3
else:
# contribution = x^3 + x^2 * R
contribution = (x3 + x2 * R) % MOD
total_sum = (total_sum + contribution) % MOD
# Update R for the next iteration
R = (2 * R + x_mod) % MOD
return total_sum % MOD
Claude 3.5 Sonnet:
class Solution:
def sumOfPower(self, nums: List[int]) -> int:
MOD = 10**9 + 7
nums.sort()
# Running sum to track previous numbers
R = 0
# Final answer accumulator
total = 0
# Process each number
for i, x in enumerate(nums):
# Handle modulo arithmetic
x_mod = x % MOD
x2 = (x_mod * x_mod) % MOD # x^2
x3 = (x2 * x_mod) % MOD # x^3
if i == 0:
# First number only contributes x^3
contribution = x3
else:
# For each subsequent number:
# Add x^3 (when alone) and x^2 * R (combinations with previous numbers)
contribution = (x3 + x2 * R) % MOD
# Add to total sum
total = (total + contribution) % MOD
# Update running sum for next iteration
# New numbers will create twice as many combinations
R = (2 * R + x_mod) % MOD
return total
While this may not represent the model’s coding capability, it gives us an idea that Gemini still needs to improve in this area. I have expected Gemini 2.0 Flash to nail this on the first try.
Summary on Coding
When it comes to coding, Claude is still the king, considering the speed and much better rate limits than o1. Gemini can be a viable alternative, assuming it is free to access on AI Studio.
Creative Writing
Creative writing is a personal preference. Each model has their personality, which reflects the company’s philosophy in general. This is something you will know only when you use it enough. However, for testing, ask these modes a simple story.
Prompt: Write a story with a twist about a college guy’s routine life, which turned upside down when he encountered a mysterious woman.
I only provided the prompt without additional information regarding writing style and tone.
Here are the creations from all three models.
Ethan was a creature of habit. His life at Northwood University was a symphony of carefully orchestrated normalcy. 7:00 AM, alarm blaring, followed by lukewarm coffee and a stale bagel. 7:45 AM, a brisk walk to his 8:00 AM Physics class. 12:00 PM, predictable cafeteria lunch with his equally predictable group of friends, all geeking out over the latest video game release. Evenings were spent either in the library, nose buried in textbooks, or back in his dorm room, coding away on personal projects. Rinse and repeat.
He wasn't unhappy, not exactly. He was just... comfortable. The kind of comfortable that bordered on monotonous. Then, she appeared.
It was a Tuesday, the day he dreaded most (because it meant Physics lab), when he first saw her. She was sitting on the bench beneath the sprawling oak tree near the science building, reading a book with a vibrant, almost otherworldly cover. He'd never seen her before, and that in itself was unusual. Northwood was small, everyone knew everyone, or at least recognized them.
She was beautiful, almost ethereal. Her hair was the color of spun moonlight, cascading over her shoulders, and her eyes, a shade of deep violet he'd never encountered, seemed to hold the secrets of the universe. He almost tripped on the cracked sidewalk, completely mesmerized. He told himself it was just a fleeting moment of infatuation, a break from the norm, but a tiny part of him knew it was something more.
From then on, she was there, a constant anomaly in his otherwise predictable life. She'd be under the oak tree, always engrossed in her book, or sometimes near the campus fountain, sketching in a leather-bound journal. He’d find himself subtly changing his route to catch a glimpse of her, abandoning his rigid schedule for the possibility of a fleeting encounter.
He mustered the courage to speak to her one evening. "Excuse me," he said, his voice a little shaky. "I've seen you around. I'm Ethan."
She looked up, those violet eyes meeting his, and for a moment, he felt like he was drowning in their depths. "I know," she said, her voice like the chime of wind chimes. "I'm Lyra."
They started talking, and Ethan’s world tilted on its axis. Lyra spoke of ancient civilizations, of star systems light years away, of philosophical concepts he’d only ever grazed in his philosophy 101 textbook. She was brilliant, captivating, and utterly unlike anyone he’d ever met. She saw the world in a way that made his meticulously crafted reality feel flat and insignificant.
He started ditching his study sessions, opting for conversations with her instead. He was late for classes, his coding projects stagnated. His friends, noticing the drastic change, teased him relentlessly. He didn’t care. He was caught in Lyra's orbit, and he didn’t want to escape.
Weeks turned into months, and Ethan found himself deeply in love. He confessed his feelings one starlit night, under the very oak tree where they’d first met. Lyra listened, her face illuminated by the moonlight, and then, she smiled, a sad, knowing smile.
"Ethan," she said softly, "I'm glad you feel that way. But I can't be with you. Not in the way you want."
His heart sank. “Why? What’s wrong?”
Lyra stood and reached into her worn leather bag. She pulled out a small, intricately carved wooden box. She opened it to reveal a shimmering crystal, pulsating with an inner light.
“I’m not from here, Ethan,” she said, her voice barely a whisper. “I’m a historian, observing your planet. This crystal is my anchor. It’s time for me to go home.”
Ethan stared at her, confusion warring with disbelief. He’d thought he was losing himself in love, but it seemed he was losing touch with reality itself. He dismissed it as a whimsical metaphor. A sad goodbye. But then, Lyra touched the crystal, and the air around them began to shimmer.
He reached for her, but she shook her head, a single tear tracing a path down her cheek. "Live well, Ethan," she said, and with a blinding flash of light, she was gone.
Ethan stood alone under the oak tree, the smell of ozone still hanging in the air. He looked down and noticed, nestled among the grass, the leather-bound journal she always carried. Hesitantly, he picked it up and opened it.
The pages were filled with intricate drawings, sketches of strange, impossible objects, and symbols that he'd never seen before. But what truly caught his eye was a drawing, a detailed blueprint, labeled "Temporal Displacement Unit" and on the last page was an entry, written in elegant, flowing script: “Subject Ethan. Personality profile: High level of predictability. Minimal deviation from established routines. Ideal test subject for subtle temporal manipulation. Mission success confirmed."
Ethan's predictable life wasn't just disrupted by Lyra; it was designed to be. His reality wasn't just tilted; it had been meticulously, scientifically, manipulated. He wasn’t a love interest; he was an experiment, a data point, his routine life the perfect canvas for an extraordinary, unsettling experiment. The twist wasn’t that he fell in love with an alien, but that his life, in all its supposed normalcy, was never truly his own. He was, and always had been, part of someone else's very complex, very alien plan.
Response from OpenAI o1:
Ethan had learned to measure his days in the steam rising off his morning coffee. At twenty-one, his life at Westford College was as routine as a carefully measured latte. The sun would peek over the old oak trees, filter through the dorm room blinds, and Ethan would head to his first lecture at precisely 8:15 AM. He’d stroll across the manicured lawn and watch the other students bustle about, rushing to class or lazing on benches. He’d nod to a few familiar faces, settle into his seat at the back of the economics lecture hall, and spend the morning absorbing charts and figures while doodling in the margins of his notebook.
After class, it was always the same: a tuna wrap from the campus café, a chat with his roommate about Friday’s party, and three hours in the library reviewing notes. Mondays, Tuesdays, Wednesdays—each was a copy of the one before, pressed flat by routine and folded into a neat, predictable stack. Until the day she appeared.
He noticed her first in the dim light of the university’s art gallery. It was an impulsive decision that led him there—he’d been caught in a sudden downpour and dashed inside for shelter. The gallery, usually quiet, was buzzing with a small crowd admiring the new photography exhibit. Amidst the quiet murmurs, Ethan caught sight of a woman standing at the far end of the hall. She wore a long, dark coat and had thick curls cascading down her shoulders. She seemed absorbed in a black-and-white photograph of a child on a tricycle, and as Ethan approached, he realized that he couldn’t take his eyes off her.
When she noticed him standing a few steps away, she tilted her head and smiled as if she recognized him. Ethan felt a spark of confusion—he was sure he’d never seen her before. Her eyes were a warm brown, bright and curious, and they lingered on him a moment longer than felt ordinary.
“Do you see how the shadows fall behind the boy?” she said softly, glancing back at the photograph. “There’s something hidden there.”
Ethan followed her gaze. He saw a quiet suburban street, the child’s face blurry with motion, the shadows stretched oddly long. He felt something tugging at his heart, a sense that there was more to the image than met the eye.
“I’m…not sure,” he answered honestly. “But I’d like to know.”
She smiled again and introduced herself as Kalinda. It was a name he’d never heard on campus before. He tried asking her what major she was, which dorm she stayed in, but she only answered in riddles. Each response was laced with a gentle laughter that invited him to follow her deeper into the mystery of her words. The next thing he knew, he was walking alongside her, leaving the gallery, drifting under the dripping oak trees, talking about art and meaning and old ghosts—none of it making the kind of sense he was used to, and yet captivating him all the same.
The following morning, Ethan woke with an odd excitement curling in his chest. He made his coffee, but barely touched it. He headed to class, but instead of reviewing notes, he watched the door, waiting to see if Kalinda might drift through it. She didn’t. After class, he skipped the library and circled the campus, eyes scanning for that dark coat. The next day, the same. Nothing.
He tried to return to his routine: coffee, lectures, tuna wrap. But now these small comforts felt hollow. Something about meeting Kalinda had disrupted him. She was like a word he could not remember, lingering at the tip of his tongue, making it impossible to speak clearly.
On Friday night, he wandered aimlessly, past the student center and the old clock tower, until he found himself again at the art gallery doors. Inside, the exhibit had changed. Where the photograph of the child on the tricycle had been, now hung a different piece—an abstract painting in deep crimson. There was no crowd, no sound except the hum of the heating vents. As he turned to leave, he saw a flicker of movement in the reflection on the gallery’s glass doors.
Kalinda stood behind him, same dark coat, same warm eyes. He spun to face her, heart lifting. Before he could speak, she said, “Sometimes what we’re looking for is hidden in the corners. Like shadows behind a boy on a tricycle.” She stepped closer, and Ethan noticed something astonishing: the painting behind her shimmered, and in that shimmer he saw faint photographs hidden beneath layers of paint—scenes of people he didn’t recognize, a life he didn’t know.
“Are you even a student here?” he asked softly, confused and enthralled.
Kalinda shook her head. “I’m just a visitor,” she said. Her voice sounded distant now, like an echo in a quiet room. “I’m here to remind those who have forgotten how to see.” She reached into her pocket and handed him a small photograph. It was the picture of the child on the tricycle. Only now, the shadows behind the boy were clear. They formed the shape of an older man, as if time had folded in on itself, revealing what had always been hidden.
Ethan looked up, about to ask another question, but Kalinda was already disappearing into the dim light at the end of the corridor. He chased after her, turned the corner…and found nothing but an empty hall. When he looked again at the photograph in his hand, he realized the child’s face was his own. It was a picture he had never taken, in a neighborhood he’d never lived in.
His mind spun. He knew now that she hadn’t come to flirt or to amuse him. She had come to break him free. The next morning, Ethan poured his coffee but left it sitting untouched. He no longer felt comfortable in the old pattern that had once defined his life. He walked past his usual economics lecture hall without going in. Instead, he wandered off campus, exploring streets he’d never dared to roam, curious and uncertain, hunting for meaning in the shadows.
In the sunlight, he tried to see what was always hidden at the corners—those unnamed shapes and stories just beyond ordinary sight. As he did, he realized that everything had changed, and in that change, he found something surprising and new: a life no longer dictated by routine. And he knew, with a strange certainty, that this turn of events was exactly what she had intended all along.
Response from Claude 3.5 Sonnet
Every Tuesday and Thursday, Marcus followed the same routine: wake up at 7 AM, grab coffee from the campus café, attend his morning classes, study at the library until midnight, then walk home to his off-campus apartment. The routine was his anchor in the chaos of college life, a comfortable pattern he'd maintained since freshman year.
That Thursday in October started like any other. The autumn wind scattered golden leaves across the campus paths as Marcus made his way to the library, laptop bag slung over his shoulder. The only difference was that he'd stayed later than usual, losing track of time while working on his thesis.
At 12:30 AM, the library was nearly empty. The fluorescent lights buzzed overhead as Marcus packed his belongings, his footsteps echoing through the silent stacks. That's when he first saw her – a woman in a vintage dress, standing in the Medieval History section, running her fingers along the spines of ancient books.
Something about her seemed off. Maybe it was the way the fluorescent lights didn't quite reflect off her silk dress, or how her movements were too fluid, too graceful for the late hour. When she turned to look at him, her eyes held an otherworldly knowledge that made his skin prickle.
"Excuse me," she said, her voice carrying an accent he couldn't place. "Would you help me find a particular volume? I've been searching for centuries." She smiled at her own joke – at least, Marcus assumed it was a joke.
As he approached, he noticed that the books behind her were covered in a thick layer of dust, except for one clean rectangle where a book should have been. The woman held an ancient leather-bound volume in her hands, its pages emanating a faint golden glow.
"This library has protected my diary for generations," she said matter-of-factly, "but I believe it's time for someone else to be its guardian." She extended the book toward him.
Marcus's hand trembled as he reached for it. The moment his fingers touched the leather binding, the woman vanished – not dramatically, but like a photograph fading in reverse, leaving him alone in the dark aisle clutching a book that felt warm to the touch.
Inside the front cover, in elegant script, was an inscription:
"To the next Guardian of Stories,
Time is not what you think it is.
Neither am I.
- Elizabeth Blackwood, Head Librarian, 1873"
Below the inscription, Marcus found his own student ID photo, yellowed as if it had been pasted there a hundred years ago. And beside it, a library card dated next week, listing him as the new Head Librarian.
When he looked up, the library had changed. The fluorescent lights were gone, replaced by gas lamps. The modern computers had vanished, and the plastic chairs were now made of rich mahogany. Through the Victorian windows, he could see horse-drawn carriages passing on the street outside.
His phone buzzed in his pocket – a modern intrusion in this ancient scene. A text from an unknown number read: "Your training begins tomorrow at dawn. Don't be late. Time is a funny thing in this library. -E.B."
Marcus sank into a nearby chair, his carefully planned routine shattered by a single encounter. As he sat there, surrounded by the gentle tick of a grandfather clock he'd never noticed before, he realized that some routines were meant to be broken. And sometimes, the most extraordinary adventures begin in the most ordinary places – like a college library at midnight.
Of all the responses, o1 was the most creative, with less ornamental words. It is natural and reads well. The Gemini response was also innovative, but it tended to use literary words. Claude’s version lacked creativity and captivated readers like the other two.
Final Verdict
You already know the verdicts after going through the complete blog post.
- • For complex reasoning: OpenAI o1>Gemini 2.0 Flash>Claude 3.5 Sonnet.
- • For mathematics: OpenAI o1>Gemini 2.0 Flash>Claude 3.5 Sonnet.
- • For coding, use Claude 3.5 Sonnet>OpenAI o1~<Gemini 2.0 Flash (only because of the message cap in the plus tier).
- • For creative writing: OpenAI o1>Gemini 2.0 Flash>Claude 3.5 Sonnet. For technical writing: OpenAI o1>Claude 3.5 Sonnet>Gemini 2.0 Flash. (Not tested here, but from personal uses)
The game is still on; anything can happen at any time.