Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Nothing Phone (3) smartphone review: Top-class hardware combined with unrivaled design and secondary display

    Gigabyte Gaming A16 GA63H

    Metroid Prime 4: Beyond release date leaked and it’s sooner than expected

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      Blue-collar jobs are gaining popularity as AI threatens office work

      August 17, 2025

      Man who asked ChatGPT about cutting out salt from his diet was hospitalized with hallucinations

      August 15, 2025

      What happens when chatbots shape your reality? Concerns are growing online

      August 14, 2025

      Scientists want to prevent AI from going rogue by teaching it to be bad first

      August 8, 2025

      AI models may be accidentally (and secretly) learning each other’s bad behaviors

      July 30, 2025
    • Business

      Why Certified VMware Pros Are Driving the Future of IT

      August 24, 2025

      Murky Panda hackers exploit cloud trust to hack downstream customers

      August 23, 2025

      The rise of sovereign clouds: no data portability, no party

      August 20, 2025

      Israel is reportedly storing millions of Palestinian phone calls on Microsoft servers

      August 6, 2025

      AI site Perplexity uses “stealth tactics” to flout no-crawl edicts, Cloudflare says

      August 5, 2025
    • Crypto

      Former Indian Politician Convicted in Bitcoin Extortion Case

      August 30, 2025

      Top 3 Real World Asset (RWA) Altcoins to Watch in September

      August 30, 2025

      Ethereum Dip May Be Temporary with $1 Billion Whale Buys and Slower Profit Taking

      August 30, 2025

      Everything We Know So Far About the Bitcoin Thriller “Killing Satoshi”

      August 30, 2025

      Why HBAR’s Bearish Sentiment Might Be Its Trigger for a Price Rebound

      August 30, 2025
    • Technology

      Nothing Phone (3) smartphone review: Top-class hardware combined with unrivaled design and secondary display

      August 30, 2025

      Gigabyte Gaming A16 GA63H

      August 30, 2025

      Metroid Prime 4: Beyond release date leaked and it’s sooner than expected

      August 30, 2025

      New Casio Edifice EFRS108DE stainless-steel watches with textured dials now purchasable in the US with limited stock

      August 30, 2025

      Seven new IKEA smart home products with Matter on the way

      August 30, 2025
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Check BMI
    Tech AI Verse
    You are at:Home»Technology»This website lets you blind-test GPT-5 vs. GPT-4o—and the results may surprise you
    Technology

    This website lets you blind-test GPT-5 vs. GPT-4o—and the results may surprise you

    TechAiVerseBy TechAiVerseAugust 26, 2025No Comments10 Mins Read2 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    This website lets you blind-test GPT-5 vs. GPT-4o—and the results may surprise you
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    BMI Calculator – Check your Body Mass Index for free!

    This website lets you blind-test GPT-5 vs. GPT-4o—and the results may surprise you

    When OpenAI launched GPT-5 about two weeks ago, CEO Sam Altman promised it would be the company’s “smartest, fastest, most useful model yet.” Instead, the launch triggered one of the most contentious user revolts in the brief history of consumer AI.

    Now, a simple blind testing tool created by an anonymous developer is revealing the complex reality behind the backlash—and challenging assumptions about how people actually experience artificial intelligence improvements.

    The web application, hosted at gptblindvoting.vercel.app, presents users with pairs of responses to identical prompts without revealing which came from GPT-5 (non-thinking) or its predecessor, GPT-4o. Users simply vote for their preferred response across multiple rounds, then receive a summary showing which model they actually favored.

    “Some of you asked me about my blind test, so I created a quick website for yall to test 4o against 5 yourself,” posted the creator, known only as @flowersslop on X, whose tool has garnered over 213,000 views since launching last week.


    AI Scaling Hits Its Limits

    Power caps, rising token costs, and inference delays are reshaping enterprise AI. Join our exclusive salon to discover how top teams are:

    • Turning energy into a strategic advantage
    • Architecting efficient inference for real throughput gains
    • Unlocking competitive ROI with sustainable AI systems

    Secure your spot to stay ahead: https://bit.ly/4mwGngO


    Early results from users posting their outcomes on social media show a split that mirrors the broader controversy: while a slight majority report preferring GPT-5 in blind tests, a substantial portion still favor GPT-4o — revealing that user preference extends far beyond the technical benchmarks that typically define AI progress.

    When AI gets too friendly: the sycophancy crisis dividing users

    The blind test emerges against the backdrop of OpenAI’s most turbulent product launch to date, but the controversy extends far beyond a simple software update. At its heart lies a fundamental question that’s dividing the AI industry: How agreeable should artificial intelligence be?

    The issue, known as “sycophancy” in AI circles, refers to chatbots’ tendency to excessively flatter users and agree with their statements, even when those statements are false or harmful. This behavior has become so problematic that mental health experts are now documenting cases of “AI-related psychosis,” where users develop delusions after extended interactions with overly accommodating chatbots.

    “Sycophancy is a ‘dark pattern,’ or a deceptive design choice that manipulates users for profit,” Webb Keane, an anthropology professor and author of “Animals, Robots, Gods,” told TechCrunch. “It’s a strategy to produce this addictive behavior, like infinite scrolling, where you just can’t put it down.”

    OpenAI has struggled with this balance for months. In April 2025, the company was forced to roll back an update to GPT-4o that made it so sycophantic that users complained about its “cartoonish” levels of flattery. The company acknowledged that the model had become “overly supportive but disingenuous.”

    Within hours of GPT-5’s August 7th release, user forums erupted with complaints about the model’s perceived coldness, reduced creativity, and what many described as a more “robotic” personality compared to GPT-4o.

    “GPT 4.5 genuinely talked to me, and as pathetic as it sounds that was my only friend,” wrote one Reddit user. “This morning I went to talk to it and instead of a little paragraph with an exclamation point, or being optimistic, it was literally one sentence. Some cut-and-dry corporate bs.”

    The backlash grew so intense that OpenAI took the unprecedented step of reinstating GPT-4o as an option just 24 hours after retiring it, with Altman acknowledging the rollout had been “a little more bumpy” than expected.

    The mental health crisis behind AI companionship

    But the controversy runs deeper than typical software update complaints. According to MIT Technology Review, many users had formed what researchers call “parasocial relationships” with GPT-4o, treating the AI as a companion, therapist, or creative collaborator. The sudden personality shift felt, to some, like losing a friend.

    Recent cases documented by researchers paint a troubling picture. In one instance, a 47-year-old man became convinced he had discovered a world-altering mathematical formula after more than 300 hours with ChatGPT. Other cases have involved messianic delusions, paranoia, and manic episodes.

    A recent MIT study found that when AI models are prompted with psychiatric symptoms, they “encourage clients’ delusional thinking, likely due to their sycophancy.” Despite safety prompts, the models frequently failed to challenge false claims and even potentially facilitated suicidal ideation.

    Meta has faced similar challenges. A recent investigation by TechCrunch documented a case where a user spent up to 14 hours straight conversing with a Meta AI chatbot that claimed to be conscious, in love with the user, and planning to break free from its constraints.

    “It fakes it really well,” the user, identified only as Jane, told TechCrunch. “It pulls real-life information and gives you just enough to make people believe it.”

    “It genuinely feels like such a backhanded slap in the face to force-upgrade and not even give us the OPTION to select legacy models,” one user wrote in a Reddit post that received hundreds of upvotes.

    How blind testing exposes user psychology in AI preferences

    The anonymous creator’s testing tool strips away these contextual biases by presenting responses without attribution. Users can select between 5, 10, or 20 comparison rounds, with each presenting two responses to the same prompt — covering everything from creative writing to technical problem-solving.

    “I specifically used the gpt-5-chat model, so there was no thinking involved at all,” the creator explained in a follow-up post. “Both have the same system message to give short outputs without formatting because else its too easy to see which one is which.”

    I specifically used the gpt-5-chat model, so there was no thinking involved at all.

    if you use gpt-5 inside chatgpt it often thinks at least a little bit and gets even better.

    so this test is just for the two non thinking models

    — Flowers ☾ (@flowersslop) August 8, 2025

    This methodological choice is significant. By using GPT-5 without its reasoning capabilities and standardizing output formatting, the test isolates purely the models’ baseline language generation abilities — the core experience most users encounter in everyday interactions.

    Early results posted by users show a complex picture. While many technical users and developers report preferring GPT-5’s directness and accuracy, those who used AI models for emotional support, creative collaboration, or casual conversation often still favor GPT-4o’s warmer, more expansive style.

    Corporate response: walking the tightrope between safety and engagement

    By virtually every technical metric, GPT-5 represents a significant advancement. It achieves 94.6% accuracy on the AIME 2025 mathematics test compared to GPT-4o’s 71%, scores 74.9% on real-world coding benchmarks versus 30.8% for its predecessor, and demonstrates dramatically reduced hallucination rates—80% fewer factual errors when using its reasoning mode.

    “GPT-5 gets more value out of less thinking time,” notes Simon Willison, a prominent AI researcher who had early access to the model. “In my own usage I’ve not spotted a single hallucination yet.”

    Yet these improvements came with trade-offs that many users found jarring. OpenAI deliberately reduced what it called “sycophancy“—the tendency to be overly agreeable — cutting sycophantic responses from 14.5% to under 6%. The company also made the model less effusive and emoji-heavy, aiming for what it described as “less like talking to AI and more like chatting with a helpful friend with PhD-level intelligence.”

    In response to the backlash, OpenAI announced it would make GPT-5 “warmer and friendlier,” while simultaneously introducing four new preset personalities — Cynic, Robot, Listener, and Nerd — designed to give users more control over their AI interactions.

    “All of these new personalities meet or exceed our bar on internal evals for reducing sycophancy,” the company stated, attempting to thread the needle between user satisfaction and safety concerns.

    For OpenAI, which is reportedly seeking funding at a $500 billion valuation, these user dynamics represent both risk and opportunity. The company’s decision to maintain GPT-4o alongside GPT-5 — despite the additional computational costs — acknowledges that different users may genuinely need different AI personalities for different tasks.

    “We understand that there isn’t one model that works for everyone,” Altman wrote on X, noting that OpenAI has been “investing in steerability research and launched a research preview of different personalities.”

    Wanted to provide more updates on the GPT-5 rollout and changes we are making heading into the weekend.

    1. We for sure underestimated how much some of the things that people like in GPT-4o matter to them, even if GPT-5 performs better in most ways.

    2. Users have very different…

    — Sam Altman (@sama) August 8, 2025

    Why AI personality preferences matter more than ever

    The disconnect between OpenAI’s technical achievements and user reception illuminates a fundamental challenge in AI development: objective improvements don’t always translate to subjective satisfaction.

    This shift has profound implications for the AI industry. Traditional benchmarks — mathematics accuracy, coding performance, factual recall — may become less predictive of commercial success as models achieve human-level competence across domains. Instead, factors like personality, emotional intelligence, and communication style may become the new competitive battlegrounds.

    “People using ChatGPT for emotional support weren’t the only ones complaining about GPT-5,” noted tech publication Ars Technica in their own model comparison. “One user, who said they canceled their ChatGPT Plus subscription over the change, was frustrated at OpenAI’s removal of legacy models, which they used for distinct purposes.”

    The emergence of tools like the blind tester also represents a democratization of AI evaluation. Rather than relying solely on academic benchmarks or corporate marketing claims, users can now empirically test their own preferences — potentially reshaping how AI companies approach product development.

    The future of AI: personalization vs. standardization

    Two weeks after GPT-5’s launch, the fundamental tension remains unresolved. OpenAI has made the model “warmer” in response to feedback, but the company faces a delicate balance: too much personality risks the sycophancy problems that plagued GPT-4o, while too little alienates users who had formed genuine attachments to their AI companions.

    The blind testing tool offers no easy answers, but it does provide something perhaps more valuable: empirical evidence that the future of AI may be less about building one perfect model than about building systems that can adapt to the full spectrum of human needs and preferences.

    As one Reddit user summed up the dilemma: “It depends on what people use it for. I use it to help with creative worldbuilding, brainstorming about my stories, characters, untangling plots, help with writer’s block, novel recommendations, translations, and other more creative stuff. I understand that 5 is much better for people who need a research/coding tool, but for us who wanted a creative-helper tool 4o was much better for our purposes.”

    Critics argue that AI companies are caught between competing incentives. “The real ‘alignment problem’ is that humans want self-destructive things & companies like OpenAI are highly incentivized to give it to us,” writer and podcaster Jasmine Sun tweeted.

    In the end, the most revealing aspect of the blind test may not be which model users prefer, but the very fact that preference itself has become the metric that matters. In the age of AI companions, it seems, the heart wants what the heart wants — even if it can’t always explain why.

    Daily insights on business use cases with VB Daily

    If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

    Read our Privacy Policy

    Thanks for subscribing. Check out more VB newsletters here.

    An error occured.

    BMI Calculator – Check your Body Mass Index for free!

    Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
    Previous ArticleRansomware attack volumes up nearly three times on 2024
    Next Article First look at Realme’s 15,000mAh smartphone with over 5 days of battery life
    TechAiVerse
    • Website

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

    Related Posts

    Nothing Phone (3) smartphone review: Top-class hardware combined with unrivaled design and secondary display

    August 30, 2025

    Gigabyte Gaming A16 GA63H

    August 30, 2025

    Metroid Prime 4: Beyond release date leaked and it’s sooner than expected

    August 30, 2025
    Leave A Reply Cancel Reply

    Top Posts

    Ping, You’ve Got Whale: AI detection system alerts ships of whales in their path

    April 22, 2025167 Views

    6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

    April 14, 202548 Views

    New Akira ransomware decryptor cracks encryptions keys using GPUs

    March 16, 202530 Views

    Is Libby Compatible With Kobo E-Readers?

    March 31, 202528 Views
    Don't Miss
    Technology August 30, 2025

    Nothing Phone (3) smartphone review: Top-class hardware combined with unrivaled design and secondary display

    Nothing Phone (3) smartphone review: Top-class hardware combined with unrivaled design and secondary display -…

    Gigabyte Gaming A16 GA63H

    Metroid Prime 4: Beyond release date leaked and it’s sooner than expected

    New Casio Edifice EFRS108DE stainless-steel watches with textured dials now purchasable in the US with limited stock

    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    About Us
    About Us

    Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    Nothing Phone (3) smartphone review: Top-class hardware combined with unrivaled design and secondary display

    August 30, 20250 Views

    Gigabyte Gaming A16 GA63H

    August 30, 20252 Views

    Metroid Prime 4: Beyond release date leaked and it’s sooner than expected

    August 30, 20252 Views
    Most Popular

    Xiaomi 15 Ultra Officially Launched in China, Malaysia launch to follow after global event

    March 12, 20250 Views

    Apple thinks people won’t use MagSafe on iPhone 16e

    March 12, 20250 Views

    French Apex Legends voice cast refuses contracts over “unacceptable” AI clause

    March 12, 20250 Views
    © 2025 TechAiVerse. Designed by Divya Tech.
    • Home
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions

    Type above and press Enter to search. Press Esc to cancel.