Gemini 2.0 Flash Thinking vs. OpenAI o3-mini vs. DeepSeek R1
Big AI models are powerful but expensive. Smaller Chain-of-Thought (CoT) models like Gemini 2.0 Flash Thinking, OpenAI’s O3-Mini, and DeepSeek R1 offer a cheaper way to handle reasoning tasks. The real question is whether they are just as good.

Each model prices differently. Gemini 2.0 Flash Thinking keeps input costs low at $0.10 per million tokens. O3-Mini charges $1.10 per million tokens but focuses on structured, efficient responses. DeepSeek R1 varies its rates by time of day, with a standard price of $0.55 per million tokens and off-peak discounts as low as $0.135 per million tokens.
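To make those rates concrete, here is a small sketch that estimates the input-token cost of a hypothetical 10-million-token workload at the per-million-token prices quoted above (the workload size is my own example, not from any benchmark):

```python
# Per-million-token input rates quoted in the article (USD).
RATES = {
    "Gemini 2.0 Flash Thinking": 0.10,
    "O3-Mini": 1.10,
    "DeepSeek R1 (standard)": 0.55,
    "DeepSeek R1 (off-peak)": 0.135,
}

def input_cost(tokens: int, rate_per_million: float) -> float:
    """Dollar cost for `tokens` input tokens at a per-million-token rate."""
    return tokens / 1_000_000 * rate_per_million

# Hypothetical 10M-token batch.
for model, rate in RATES.items():
    print(f"{model}: ${input_cost(10_000_000, rate):.2f}")
```

At that volume the gap is stark: roughly $1 on Gemini 2.0 Flash Thinking versus $11 on O3-Mini.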

A lower price sounds great, but performance matters too. A model that saves money is not useful if it struggles with accuracy or speed. These models were tested in reasoning, math, coding, and creative writing to see how well they performed for their price.

TL;DR

No time for a full breakdown? Here’s a quick look at how Gemini 2.0 Flash Thinking, O3-Mini, and DeepSeek R1 performed under pressure.

  • Reasoning: Gemini and O3-Mini stayed close. Gemini explains every step like a patient teacher. O3-Mini drops the answer and moves on. DeepSeek R1 took its time, double-checking every detail before answering.
  • Math: Gemini is fast but not always perfect. O3-Mini is steady and reliable. DeepSeek R1 spent so long analyzing, it felt like it was solving an ancient mystery.
  • Coding: O3-Mini responded instantly and delivered optimized code. Gemini took a second but got everything right. DeepSeek R1 overanalyzed before finally producing a solution.
  • Creative Writing: DeepSeek R1 created an action-packed sci-fi epic. Gemini built a slow-burn mystery with vivid details. O3-Mini crafted poetic, introspective storytelling.

O3-Mini is the best for speed. Gemini 2.0 balances logic and depth. DeepSeek R1 is the most precise but takes its time. Each model has its strengths.

Still here? Let’s dive into the details.

Brief on Gemini 2.0 Flash Thinking

Gemini 2.0 Flash Thinking is built for speed. It works through problems instead of making random guesses. It does not hesitate. It analyzes, reasons, and delivers with precision.

The 1M token context window is no small feature. It can process entire research papers, break down massive codebases, and handle deep analysis with ease. Large? Yes. Slow? Not at all. This model moves. It does not just give answers. It figures things out like an expert solving a complex case.

Deedy Das summed it up best: cheaper, smarter, and ready to take over.

Deedy Das appreciating Gemini 2.0 Flash Thinking

Developers will find plenty to like. It handles tools efficiently, adapts to different inputs, and offers an API that keeps costs reasonable. Gemini 2.0 Flash Thinking is not here to compete. It is here to lead.

Complex Reasoning

All AI models claim to be good, but are they really?

I put them through four tough challenges to test their reasoning, speed, and ability to avoid simple mistakes. Let’s see how they did.

#1. Basic Logical Progression

This one was simple. No tricks or complicated wording, just a straightforward reasoning check.

Prompt: If today is Wednesday, what day will it be in three days?

If an AI messes this up, I’d start questioning everything.

Gemini 2.0 flash thinking:

Gemini solving puzzle

Took about two to three seconds. Still quick but not as instant as O3-Mini.

O3-Mini:

O3-mini solving puzzle

O3-Mini wasted no time. One second.

DeepSeek R1:

Deepseek r1 solving puzzle

DeepSeek R1 took a full 23 seconds. Not because it struggled but because it double, triple, and quadruple-checked before answering.

So, one model answered instantly. One took an extra second. One acted like it was preparing a research paper before committing.
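For reference, the expected answer is easy to check mechanically: weekday arithmetic is just addition mod 7. A minimal sketch:

```python
# Weekdays cycle with period 7, so "N days later" is index arithmetic mod 7.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def day_after(start: str, offset: int) -> str:
    """Return the weekday `offset` days after `start`."""
    return DAYS[(DAYS.index(start) + offset) % 7]

print(day_after("Wednesday", 3))  # Saturday
```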

That was just the first test. Time to step it up. Let’s see how they handle something more complex.

#2. Tracking Family Connections

This test checked how well the models could follow relationships in a family tree.

Prompt: If A is the brother of B, B is the son of C, C is the daughter of D, and D is the wife of E, how is E related to B?

Gemini 2.0 Flash Thinking:

Gemini solving reasoning puzzle

It answered correctly with confidence. Almost the same speed as O3-Mini.

O3-Mini:

o3-mini solving reasoning puzzle

Took a few seconds. No hesitation, just the correct answer.

DeepSeek R1:

Deepseek solving reasoning puzzle

Gemini 2.0 Flash Thinking and O3-Mini were the best picks for quick answers.
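The expected answer itself is easy to derive by walking the chain upward: B's parent is C, C's parent is D, and D's spouse is E, so E sits two generations above B. Assuming E is male (D is his wife), E is B's maternal grandfather. A sketch of that chain as data:

```python
# Encode the stated relationships and walk up from B.
parent = {"B": "C", "C": "D"}   # child -> parent
spouse = {"D": "E"}             # D is the wife of E

# B's parent's parent is D; D's spouse is E, two generations above B.
grandparent_of_B = spouse[parent[parent["B"]]]
print(grandparent_of_B)  # E -> B's (maternal) grandfather
```

Note that the "A is the brother of B" clause is a distractor; it never enters the derivation.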

#3. Solving Ciphers and Recognizing Patterns

This test checked how well the models could identify and apply a simple letter-based code.

Prompt: In a certain code, “PAPER” is written as “QBQFS”. How is “PENCIL” written in the same code?

Gemini 2.0 Flash Thinking:

O3-Mini:

DeepSeek R1:

At this point, it’s evident that O3-Mini and Gemini 2.0 Flash Thinking were the top performers for reasoning tasks involving pattern recognition and coded language.
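The code the models had to spot is a Caesar shift of +1: every letter of PAPER moves one position forward to give QBQFS. A minimal sketch, assuming uppercase input:

```python
# Shift each uppercase letter forward by one, wrapping Z back to A.
def shift_plus_one(word: str) -> str:
    return "".join(chr((ord(c) - ord('A') + 1) % 26 + ord('A')) for c in word)

print(shift_plus_one("PAPER"))   # QBQFS
print(shift_plus_one("PENCIL"))  # QFODJM
```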

#4. Handling Trick Questions with Logic

It’s time to see if these models fall for the trap or think it through.

Prompt: A robot has 8 arms. There are 5 objects on the table: a knife, a fork, a spoon, a teddy bear, and a doll. The robot picks up each object with an arm. He then shakes hands with himself. How many arms does he have free?

Gemini 2.0 Flash thinking:

O3-Mini:

DeepSeek R1:
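One common resolution of the riddle, spelled out as arithmetic (my own reading, not any model's output): five arms hold the five objects, and a self-handshake ties up two more, leaving one arm free.

```python
total_arms = 8
holding = 5     # knife, fork, spoon, teddy bear, doll
handshake = 2   # shaking hands with himself occupies two arms
free = total_arms - holding - handshake
print(free)  # 1
```

The trap is stopping at 8 − 5 = 3 and forgetting that the handshake consumes arms too.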

Summary for Reasoning

O3-Mini and Gemini 2.0 Flash Thinking were evenly matched. Both answered correctly and kept up with every challenge.

The difference? O3-Mini fired out answers instantly. Gemini 2.0 took an extra moment to explain its reasoning. DeepSeek R1 was slow but never wrong. It checked, double-checked, and probably ran a full internal review before committing to an answer.

For fast and accurate reasoning, O3-Mini and Gemini 2.0 Flash Thinking are the best choices. If absolute precision matters more than speed, DeepSeek R1 is the safer bet.

Mathematics

Numbers tell the truth. Some AI models do not. Math is simple. Either the answer is correct or it is not. No extra words. Just problem-solving.

Let’s see which delivers the correct answers and which gets it wrong.

#1. Quick Calculation Challenge

Nothing fancy, just a straight multiplication problem:

Prompt: A school ordered 45 books for its library. Each book costs $12. What’s the total cost?

Gemini 2.0 Flash thinking:

O3-Mini:

DeepSeek R1:

The differences were clear. O3-Mini was the fastest. Gemini 2.0 took a moment but got the right answer. DeepSeek R1 treated the problem like a courtroom case.
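The expected answer is a one-line multiplication:

```python
books, price = 45, 12
total = books * price
print(f"${total}")  # $540
```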

#2. The Great Car vs. Truck Showdown

A classic speed-distance-time problem. Nothing too tricky unless you overthink it.

Prompt: The distance between the two towns is 380 km. A car and a truck start driving toward each other and meet in 4 hours. The car is 5 km/h faster than the truck. What are their speeds?

Gemini 2.0 Flash thinking:

O3-Mini:

DeepSeek R1:

Same answer, different styles. O3-Mini is fast and direct. Gemini 2.0 takes a moment to explain. DeepSeek R1, as always, overthinks every step before finally answering.
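The underlying algebra is short: the two vehicles close 380 km in 4 hours, so their combined speed is 95 km/h; with the car 5 km/h faster, truck + (truck + 5) = 95. A sketch:

```python
# Speed-distance-time: combined speed = distance / time.
distance, hours, gap = 380, 4, 5
combined = distance / hours        # 95 km/h closing speed
truck = (combined - gap) / 2       # truck + (truck + gap) = combined
car = truck + gap
print(f"truck: {truck} km/h, car: {car} km/h")  # 45 and 50
```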

#3. The Flipping Card Puzzle

This logic challenge required planning, not just fast calculations.

Prompt: Six cards are placed on a table, each with ‘1’ on one side and ‘2’ on the other. All six start with ‘1’ facing up. In one try, exactly four cards must be flipped. What’s the least number of tries needed to get all six showing ‘2’?

Gemini 2.0 Flash thinking:

O3-Mini:

DeepSeek R1:

This time, rushing didn’t pay off. Gemini 2.0 delivered both speed and accuracy, while DeepSeek played it safe and got there late. O3-Mini moved fast but missed the mark.
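The answer the models were chasing can be verified by brute force: treat each arrangement as a 6-bit state (bit set = card shows '2') and breadth-first search over every move that flips exactly four cards. A minimal sketch:

```python
from itertools import combinations

# Each move XORs the state with a mask that has exactly four bits set.
moves = [sum(1 << i for i in combo) for combo in combinations(range(6), 4)]

target, frontier, seen, tries = 0b111111, {0}, {0}, 0
while target not in frontier:
    # Expand one BFS level: apply every move to every frontier state.
    frontier = {s ^ m for s in frontier for m in moves} - seen
    seen |= frontier
    tries += 1
print(tries)  # 3
```

Two tries cannot work: each card would need to be flipped exactly once, for 6 flips total, but two moves always make 8 flips. Three tries suffice, e.g. {1,2,3,4}, {1,2,3,5}, {1,2,3,6}.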

#4. The Parabola Challenge

This test mixed geometry and algebra. A tricky problem that needed clear thinking and precise calculations.

Prompt: Let P (4, 4√3) be a point on the parabola y² = 4ax, and let PQ be a focal chord. If M and N are the feet of the perpendiculars from P and Q to the directrix, what is the area of quadrilateral PQMN?

Gemini 2.0 Flash thinking:

O3-Mini:

DeepSeek R1:
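For reference, the expected value can be derived independently. A numeric sketch (my own, not any model's output), taking a = 3 from the point P and reading "quadrilateral PQMN" as the trapezoid traced P→M→N→Q:

```python
import math

# P = (4, 4*sqrt(3)) on y^2 = 4ax gives 48 = 16a, so a = 3:
# focus at (3, 0), directrix x = -3.
a = (4 * math.sqrt(3)) ** 2 / (4 * 4)

# Parametrize the parabola as (a*t^2, 2*a*t); a focal chord has t1*t2 = -1.
t1 = 4 * math.sqrt(3) / (2 * a)
t2 = -1 / t1
Q = (a * t2 ** 2, 2 * a * t2)                 # (9/4, -3*sqrt(3))
P = (4, 4 * math.sqrt(3))
M, N = (-3, P[1]), (-3, Q[1])                 # feet on the directrix

# Shoelace formula over P -> M -> N -> Q.
pts = [P, M, N, Q]
area = abs(sum(x1 * y2 - x2 * y1
               for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))) / 2
print(area)  # ~74.266, i.e. 343*sqrt(3)/8
```

The parallel sides PM = 7 and QN = 21/4 with height 7√3 give the same result by the trapezoid formula.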

Summary for Math

Gemini 2.0 Flash Thinking is fast and mostly accurate. O3-Mini stays close behind and makes fewer mistakes. DeepSeek R1 takes forever but never gets it wrong.

Coding

LeetCode-level coding problems separate true problem-solvers from those who just throw random syntax at the screen. Speed matters twice here: how quickly the model produces the code, and how fast that code runs.

For this test, I picked a hard-level LeetCode problem called ‘Maximum Score Words Formed by Letters’ and watched how these models handled both code generation and execution speed.

O3-Mini:

It got straight to work. No delay. It generated the solution instantly and ran it in 2 ms. Clean and efficient.

class Solution(object):
    def maxScoreWords(self, words, letters, score):
        from collections import Counter
        
        # Count available letters.
        letterCount = Counter(letters)
        
        # Precompute frequency count and score for each word.
        wordInfo = []
        for word in words:
            count = Counter(word)
            wordScore = sum(score[ord(c) - ord('a')] * count[c] for c in count)
            wordInfo.append((count, wordScore))
        
        self.maxScore = 0
        
        def backtrack(i, currScore, available):
            # Base case: processed all words.
            if i == len(words):
                self.maxScore = max(self.maxScore, currScore)
                return
            
            # Option 1: Skip current word.
            backtrack(i + 1, currScore, available)
            
            # Option 2: Use current word if possible.
            wordCount, wordScore = wordInfo[i]
            canUse = True
            for ch in wordCount:
                if available[ch] < wordCount[ch]:
                    canUse = False
                    break
            if canUse:
                # Update the available letters.
                newAvailable = available.copy()
                for ch in wordCount:
                    newAvailable[ch] -= wordCount[ch]
                backtrack(i + 1, currScore + wordScore, newAvailable)
        
        # Start the recursion from the first word.
        backtrack(0, 0, letterCount)
        return self.maxScore

Gemini 2.0 Flash thinking:

Took a moment to think but delivered the right answer in 3ms. Not as fast as O3-Mini but still solid.

class Solution(object):
    def maxScoreWords(self, words, letters, score):
        letter_counts = {}
        for letter in letters:
            letter_counts[letter] = letter_counts.get(letter, 0) + 1

        word_counts_list = []
        for word in words:
            word_counts = {}
            word_score = 0
            for char in word:
                word_counts[char] = word_counts.get(char, 0) + 1
                word_score += score[ord(char) - ord('a')]
            word_counts_list.append((word_counts, word_score))

        max_score_holder = [0] # Use a list to hold max_score

        def solve(index, current_letter_counts, current_score):
            # nonlocal max_score  <- Remove nonlocal as it's Python 3 syntax and not needed with list
            if index == len(words):
                max_score_holder[0] = max(max_score_holder[0], current_score) # Update using list
                return

            # Option 1: Don't include the current word
            solve(index + 1, current_letter_counts, current_score)

            # Option 2: Include the current word if possible
            word_counts, word_score = word_counts_list[index]
            possible = True
            next_letter_counts = current_letter_counts.copy()

            for char, count in word_counts.items():
                if next_letter_counts.get(char, 0) < count:
                    possible = False
                    break
                next_letter_counts[char] -= count

            if possible:
                solve(index + 1, next_letter_counts, current_score + word_score)

        solve(0, letter_counts, 0)
        return max_score_holder[0] # Return value from the list

DeepSeek R1:

Took its time. But got it done.

class Solution(object):
    def maxScoreWords(self, words, letters, score):
        # Precompute the available letters count
        letters_count = [0] * 26
        for c in letters:
            letters_count[ord(c) - ord('a')] += 1
        
        # Precompute each word's letter count and its score
        words_cnt = []
        words_scr = []
        for word in words:
            cnt = [0] * 26
            s = 0
            for c in word:
                idx = ord(c) - ord('a')
                cnt[idx] += 1
                s += score[idx]
            words_cnt.append(cnt)
            words_scr.append(s)
        
        max_score = 0
        n = len(words)
        
        def backtrack(index, current_counts, current_score):
            nonlocal max_score
            if index == n:
                if current_score > max_score:
                    max_score = current_score
                return
            # Option 1: Skip the current word
            # Pass a copy of current_counts to avoid mutation
            backtrack(index + 1, current_counts.copy(), current_score)
            
            # Option 2: Take the current word if possible
            word_c = words_cnt[index]
            possible = True
            for i in range(26):
                if current_counts[i] < word_c[i]:
                    possible = False
                    break
            if possible:
                # Create a new counts array by subtracting the word's letters
                new_counts = current_counts.copy()
                for i in range(26):
                    new_counts[i] -= word_c[i]
                new_score = current_score + words_scr[index]
                backtrack(index + 1, new_counts, new_score)
        
        # Start the backtracking with a copy of the initial letters count
        initial_counts = letters_count.copy()
        backtrack(0, initial_counts, 0)
        return max_score
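Any of the three solutions can be sanity-checked against an independent brute force over word subsets. The sketch below uses what I believe is LeetCode's first example for this problem (treat the inputs and the expected score of 23 as an assumption if you paste it elsewhere):

```python
from collections import Counter
from itertools import combinations

# Assumed example (LeetCode 1255, Example 1): best choice is "dad" + "good".
words = ["dog", "cat", "dad", "good"]
letters = ["a", "a", "c", "d", "d", "d", "g", "o", "o"]
score = [1, 0, 9, 5, 0, 0, 3, 0, 0, 0, 0, 0, 0,
         0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

def brute_force(words, letters, score):
    """Try every subset of words; keep the best feasible total score."""
    available = Counter(letters)
    best = 0
    for r in range(len(words) + 1):
        for subset in combinations(words, r):
            joined = "".join(subset)
            need = Counter(joined)
            if all(available[c] >= n for c, n in need.items()):
                best = max(best, sum(score[ord(c) - ord('a')] for c in joined))
    return best

print(brute_force(words, letters, score))  # 23
```

The exponential subset scan is fine here because the problem caps the word list at a small size; the backtracking solutions above prune the same search space more efficiently.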

Summary for Coding

Speed was the key in this test. O3-Mini came out on top, generating and executing code the fastest. Gemini 2.0 Flash Thinking stayed close behind with a solid performance. DeepSeek R1 took so long, it seemed to forget it was supposed to be coding.

Creative Writing

Math and code test logic. Storytelling is a different challenge. Can these models create something engaging? Do they build suspense and keep you hooked? Or do they produce something that feels flat and artificial?

To find out, I gave them a sci-fi mystery.

Prompt: You wake up in a world where everyone speaks a language you’ve never heard before, yet they all seem to know you. A mysterious letter in your pocket, written in your own handwriting, warns you not to trust anyone. What will you do next?

Gemini 2.0 Flash thinking

This model built a slow-burn mystery with rich details. It felt like a novel opening.

Response:

The cacophony was the first thing that truly registered. Not the gentle chirping of birds, or the muffled sounds of morning waking, but a jarring, dissonant symphony of… what? Voices. Dozens, perhaps hundreds, of them, all around, rising and falling in patterns utterly unlike any language I knew. It was like being dropped into a bizarre opera where every aria was alien.
My eyes snapped open, heart hammering against my ribs. Disorientation warred with a strange sense of… familiarity? The room, though sparsely furnished with rough-hewn wooden furniture and woven mats, felt… almost right. Like a place I should know, even though I didn’t recognize a single detail.
I sat up, the simple act making my head spin. My mouth felt like sandpaper. Reaching for the bedside table, I expected to find a glass of water, or at least the illusion of one in this bizarre reality. Instead, my hand brushed against something crisp and folded.
My pocket.
Instinctively, I reached inside my worn trousers, the fabric unfamiliar to the touch. My fingers closed around a folded piece of parchment. Parchment? Who used parchment anymore?
Pulling it out, I unfolded it with trembling hands. My breath hitched. It was handwriting. My handwriting. Uneven, slightly rushed, but undeniably mine. And the words… they sent a chill down my spine that had nothing to do with the cool morning air.
“Don’t trust anyone.”
That was it. Three words, stark and brutal, staring back at me from the aged parchment. Written by me, apparently, yet utterly foreign to my waking memory.
Panic began to prickle at the edges of my composure. This wasn't a dream. The sounds, the textures, the sheer wrongness of everything felt too real. I was in a world I didn't understand, surrounded by people who seemed to know me, and armed with a cryptic, terrifying warning penned by my own hand.
Okay. Breathe. Just breathe. Panic wouldn't help. I needed information. I needed to figure out where I was, who these people were, and most importantly, why I had written that warning.
My first instinct was to bolt. To run, to hide, to get away from the unsettling sounds and the unnerving feeling of being known. But where would I run to? I had no map, no language, no idea what dangers lurked beyond this room, or even within it.
No, running was impulsive, and the letter warned against impulsivity. “Don’t trust anyone.” That meant observation. Caution. Intelligence gathering.
I decided on a two-pronged approach. First, I needed to assess my immediate surroundings. Second, I needed to act like I belonged, to blend in, to observe these people who seemed to know me without revealing my ignorance.
Getting out of bed slowly, I tried to move with a confidence I didn't feel. The floorboards creaked under my weight, a familiar sound despite everything else being alien. I took a moment to examine myself in a small, polished metal mirror hanging on the wall. It reflected back… me. Same face, same eyes, though perhaps a little wider with fear and confusion. My clothes, though unfamiliar in style – loose, woven tunic and trousers – fit and felt comfortable enough. At least I looked somewhat like I belonged here.
Taking another deep breath, I opened the door.
The sound outside the room intensified. It was a low hum of activity, punctuated by bursts of the strange, incomprehensible language. Hesitantly, I stepped out into what appeared to be a common area.
It was bustling with people. Men, women, children – all moving with purpose, their faces etched with expressions ranging from cheerful to serious, but all seemingly carrying on with their daily routines. And yes, they all seemed to glance at me, to acknowledge my presence with nods and fleeting smiles.
My heart pounded. Each glance felt like an interrogation. Did they know I was an imposter? Did they sense my confusion? Or was this just… normal?
I forced a small, uncertain smile back at a woman who gave me a particularly knowing look. She responded with a string of melodic phrases, ending with what sounded like a question, judging by the inflection.
I blinked, feigning comprehension. “Uh… yes?” I offered weakly, hoping it sounded like an affirmative in whatever language this was.
The woman chuckled, a warm, throaty sound, and responded with another, longer phrase. She gestured towards a large communal table laden with steaming bowls and baskets of… something. Food, I hoped.
My stomach rumbled loudly, betraying my nervous facade. Hunger was a basic need, and perhaps shared across worlds and languages. Maybe food was a safe starting point.
Following her gesture, I cautiously approached the table. People around it made room for me, their eyes curious but not hostile. They were waiting for me to act, to participate.
Okay. Act. That’s what I would do. I would pretend to be who they expected me to be. I would observe, I would listen, I would try to decipher their language, their customs, their expectations. And most importantly, I would remember the letter. “Don’t trust anyone.” Not even the ones who seemed kind, the ones who offered food, the ones who smiled knowingly.
Starting with breakfast, I would begin my investigation. But my eyes, while focused on the steaming bowl in front of me, would be constantly scanning the faces around the table, searching for clues, searching for danger, searching for… perhaps, a reason why I was here, in this bizarre, unsettling world. And why I had warned myself not to trust a single soul.
My next move was to eat, observe, and remain utterly, utterly silent. For now, silence was my best defense. And maybe, just maybe, in the silence, I could start to understand.

O3-Mini

It created a poetic, introspective story. Moody and surreal with plenty left unsaid.

Response:

I rise to a morning painted in hues I’ve never seen—a surreal blend of twilight and dawn—only to realize that every conversation around me hums in a language entirely alien to my ears. Yet, as I wander through crowded streets of unfamiliar faces, there’s a constant, eerie undertone: these strangers wear knowing expressions, as if they’ve been waiting for me all along.
The weight in my pocket grows heavier until I withdraw a folded letter—a letter penned in my own handwriting. The message is cryptic, a forewarning etched with urgency: “Do not trust anyone.” It shakes the very foundation of what I thought I knew about myself and this strange new world.
I pause at the intersection of curiosity and dread. The letter’s familiar loops and curves call me back to memories I can’t quite grasp—memories of choices made, promises broken, and secrets buried in the depths of time. With every step, I question: who am I here? Why do these people, who speak in rhythms I cannot decode, seem to recognize me as though my past were woven into the fabric of their collective story?
My mind races with possibilities. Could this be a parallel universe, or the echo of a forgotten dream? The caution in my own words pushes me to seek the truth. I decide to retreat to a quiet café nestled in a labyrinth of cobblestone alleys—a place where silence and shadows provide refuge from the cacophony of unfamiliar voices.
Inside, I settle into a worn armchair and pull the letter close, its ink glistening with both promise and peril. I sketch out a plan on a napkin, fragmentary ideas forming a mosaic of survival: locate the source of these cryptic words, decipher the language that surrounds me, and unearth the hidden history that binds me to this world.
As I prepare to leave, a lone figure in the corner watches me intently. The eyes—so familiar, yet filled with a mystery all their own—hint at answers that might unlock the secrets of my past. Do I confront this enigmatic watcher, risking the treacherous unknown, or do I vanish further into the labyrinth of this bewildering city?
I take a deep breath, the decision weighing on me as heavily as the letter in my pocket. With resolve firming in my heart, I choose to trust the instinct that has always guided me—question everything, seek truth in every whispered word, and remember that sometimes, the most dangerous journey is the one within oneself.
Stepping out into the twilight of this uncanny realm, I am determined to unweave the tapestry of fate that has unexpectedly bound me to a world both strange and intimately mine.

DeepSeek R1

DeepSeek R1 jumped straight into action. Fast. Cinematic. Felt like a movie.

Response:

Title: "Echoes of a Forgotten Tongue"

The first thing I notice is the hum. A resonant, almost musical vibration that thrums through the air like a plucked string. Then the voices—fluid and alien, syllables curling like smoke. I sit up on a cold marble bench, my head throbbing, as strangers in iridescent robes sweep past. Their eyes lock onto mine, bright with recognition. “Valara esh’tai,” one murmurs, bowing slightly. The words mean nothing. Everything means nothing here.

My hand flies to my chest, clutching fabric that isn’t mine—a silken tunic embroidered with constellations I don’t recognize. In the pocket, a crumpled note. My own jagged script screams up at me:
“They’ll call you the Key. They’ll say they need you. DON’T TRUST THEM. Find the Hollow Star before the Convergence. – You.”

A woman with silver-streaked hair approaches, her smile sharp as a blade. “Miran, Valara,” she says, gesturing to a towering spire in the distance, its apex pulsing with violet light. Her fingers brush my arm, and a jolt courses through me—a flash of memory that isn’t mine. A chamber beneath the spire. A machine humming. Blood on my hands.

I recoil. The letter’s warning coils in my gut. Don’t trust. But how do you flee a maze when every path is written in symbols you can’t read?

I mimic a smile, nodding as if I understand. She leads me through winding streets, the crowd parting like water. Whispers trail me: Valara. Key. Esh’tai. A child darts forward, pressing a stone into my palm—an obsidian shard etched with the same symbol from the note: a star split by a crescent. The Hollow Star.

The woman’s gaze sharpens. She barks an order, and guards materialize. No time to think. I bolt, ducking into an alley where the hum grows louder, as if the city itself is alive. The stone in my hand grows warm, guiding me toward a grated sewer. Reluctantly, I climb down.

The tunnels reek of ozone and rust. The stone’s glow intensifies, revealing murals of a cataclysm—a shattered moon, a beam of light, a figure (me?) standing at its center. The hum becomes a voice. “You’ve done this before,” it seems to say. “You’ve always chosen wrong.”

Footsteps echo above. They’re coming.

I follow the stone to a corroded door. Behind it: a vault holding a tiny, pulsing orb—the Hollow Star. The second I touch it, understanding floods my mind. The language. Their words rearrange, clarity slicing through the noise.

“Valara,” the voice from the vault booms. “Reset the cycle. Destroy the Spire.”

But another memory surfaces—gentler, older. A man’s voice, pleading: “The Spire isn’t the enemy. We are. You taught me that, Kael.”

Kael. My name? Or another lie?

The orb’s light dims. The guards’ shouts draw closer.

I clutch the Hollow Star, its power searing my palm. Trust no one. Not even myself.

But if I’ve lived this before, maybe this time, I’ll choose differently.

I slip back into the crowd, the orb is hidden, the woman’s silver-streaked hair visible ahead. She’s scanning faces, frantic.

“Valara!” she calls.

I step forward.

Summary of Creative Writing

DeepSeek R1 brought the most action. Gemini 2.0 Flash Thinking felt the most immersive. O3-Mini was the most artistic. They have different styles, but they are all capable storytellers.

Final Verdict

After testing Gemini 2.0 Flash Thinking, O3-Mini, and DeepSeek R1 across reasoning, math, coding, and creative writing, here’s how they compare:

  • Reasoning: Gemini 2.0 explains like a tutor. O3-Mini gives the answer and moves on. DeepSeek R1 gets everything right but takes its time overthinking.
  • Math: Gemini 2.0 is fast but not always perfect. O3-Mini stays steady with fewer mistakes. DeepSeek R1 double-checks like it’s taking an exam.
  • Coding: O3-Mini delivers instantly. Gemini 2.0 takes a second but gets it right. DeepSeek R1 moves like it’s writing a research paper.
  • Creative Writing: DeepSeek R1 creates a blockbuster-style story. Gemini 2.0 builds a slow-burn mystery. O3-Mini crafts poetic and introspective storytelling.

Each model excels in different areas. O3-Mini is the best for speed. Gemini 2.0 balances logic and depth. DeepSeek R1 is the most precise but takes its time. Whatever you need, one of them fits the job.
