Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Newegg’s Ryzen 7 9850X3D combo bundle offers over $500 in savings on three key components, including 64GB DDR5 RAM

    Tenku Pocket 8 micro laptop launched with 8-inch touch display and 8-core Intel Alder Lake-N CPU

    Endorfy Signum M30 Air and M30 ARGB arrive as brand-new micro ATX PC towers

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      Tensions between the Pentagon and AI giant Anthropic reach a boiling point

      February 21, 2026

      Read the extended transcript: President Donald Trump interviewed by ‘NBC Nightly News’ anchor Tom Llamas

      February 6, 2026

      Stocks and bitcoin sink as investors dump software company shares

      February 4, 2026

      AI, crypto and Trump super PACs stash millions to spend on the midterms

      February 2, 2026

      To avoid accusations of AI cheating, college students are turning to AI

      January 29, 2026
    • Business

      How Smarsh built an AI front door for regulated industries — and drove 59% self-service adoption

      February 24, 2026

      Where MENA CIOs draw the line on AI sovereignty

      February 24, 2026

      Ex-President’s shift away from Xbox consoles to cloud gaming reportedly caused friction

      February 24, 2026

      Gartner: Why neoclouds are the future of GPU-as-a-Service

      February 21, 2026

      The HDD brand that brought you the 1.8-inch, 2.5-inch, and 3.5-inch hard drives is now back with a $19 pocket-sized personal cloud for your smartphones

      February 12, 2026
    • Crypto

      BitMine Buys $93 Million in ETH, but Ethereum Slides as Holders Resume Selling

      February 24, 2026

      XRP Ledger Sets Multiple Key Records in February Despite Price Decline

      February 24, 2026

      Bhutan Rolls Out Solana-Backed Visas Even As Demand Stays Weak

      February 24, 2026

      ZachXBT Teases Major Crypto Exposé Ahead of Feb. 26 — How Is Smart Money Positioned?

      February 24, 2026

      Acurast turns 225,000 smartphones into a secure AI network on Base

      February 24, 2026
    • Technology

      Newegg’s Ryzen 7 9850X3D combo bundle offers over $500 in savings on three key components, including 64GB DDR5 RAM

      February 25, 2026

      Tenku Pocket 8 micro laptop launched with 8-inch touch display and 8-core Intel Alder Lake-N CPU

      February 25, 2026

      Endorfy Signum M30 Air and M30 ARGB arrive as brand-new micro ATX PC towers

      February 25, 2026

      Ditch the Adobe subscription: This PDF editor is yours for life for $25

      February 25, 2026

      Her AI agent nuked 200 emails. This guardrail stops the next disaster

      February 25, 2026
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Check BMI
    Tech AI Verse
    You are at:Home»Technology»Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs
    Technology

    Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs

    TechAiVerseBy TechAiVerseApril 8, 2025No Comments6 Mins Read3 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs

    April 7, 2025 5:11 PM

    Credit: VentureBeat made with Midjourney

    Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More


    Meta’s new flagship AI language model Llama 4 came suddenly over the weekend, with the parent company of Facebook, Instagram, WhatsApp and Quest VR (among other services and products) revealing not one, not two, but three versions — all upgraded to be more powerful and performant using the popular “Mixture-of-Experts” architecture and a new training method involving fixed hyperparameters, known as MetaP.

    Also, all three are equipped with massive context windows — the amount of information that an AI language model can handle in one input/output exchange with a user or tool.

    But following the surprise announcement and public release of two of those models for download and usage — the lower-parameter Llama 4 Scout and mid-tier Llama 4 Maverick — on Saturday, the response from the AI community on social media has been less than adoring.

    Llama 4 sparks confusion and criticism among AI users

    An unverified post on the North American Chinese language community forum 1point3acres made its way over to the r/LocalLlama subreddit on Reddit alleging to be from a researcher at Meta’s GenAI organization who claimed that the model performed poorly on third-party benchmarks internally and that company leadership “suggested blending test sets from various benchmarks during the post-training process, aiming to meet the targets across various metrics and produce a ‘presentable’ result.”

    The post was met with skepticism from the community in its authenticity, and a VentureBeat email to a Meta spokesperson has not yet received a reply.

    But other users found reasons to doubt the benchmarks regardless.

    “At this point, I highly suspect Meta bungled up something in the released weights … if not, they should lay off everyone who worked on this and then use money to acquire Nous,” commented @cto_junior on X, in reference to an independent user test showing Llama 4 Maverick’s poor performance (16%) on a benchmark known as aider polyglot, which runs a model through 225 coding tasks. That’s well below the performance of comparably sized, older models such as DeepSeek V3 and Claude 3.7 Sonnet.

    Referencing the 10 million-token context window Meta boasted for Llama 4 Scout, AI PhD and author Andriy Burkov wrote on X in part that: “The declared 10M context is virtual because no model was trained on prompts longer than 256k tokens. This means that if you send more than 256k tokens to it, you will get low-quality output most of the time.”

    Also on the r/LocalLlama subreddit, user Dr_Karminski wrote that “I’m incredibly disappointed with Llama-4,” and demonstrated its poor performance compared to DeepSeek’s non-reasoning V3 model on coding tasks such as simulating balls bouncing around a heptagon.

    Former Meta researcher and current AI2 (Allen Institute for Artificial Intelligence) Senior Research Scientist Nathan Lambert took to his Interconnects Substack blog on Monday to point out that a benchmark comparison posted by Meta to its own Llama download site of Llama 4 Maverick to other models, based on cost-to-performance on the third-party head-to-head comparison tool LMArena ELO aka Chatbot Arena, actually used a different version of Llama 4 Maverick than the company itself had made publicly available — one “optimized for conversationality.”

    As Lambert wrote: “Sneaky. The results below are fake, and it is a major slight to Meta’s community to not release the model they used to create their major marketing push. We’ve seen many open models that come around to maximize on ChatBotArena while destroying the model’s performance on important skills like math or code.”

    Lambert went on to note that while this particular model on the arena was “tanking the technical reputation of the release because its character is juvenile,” including lots of emojis and frivolous emotive dialog, “The actual model on other hosting providers is quite smart and has a reasonable tone!”

    In response to the torrent of criticism and accusations of benchmark cooking, Meta’s VP and Head of GenAI Ahmad Al-Dahle took to X to state:

    “We’re glad to start getting Llama 4 in all your hands. We’re already hearing lots of great results people are getting with these models.

    That said, we’re also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were ready, we expect it’ll take several days for all the public implementations to get dialed in. We’ll keep working through our bug fixes and onboarding partners.

    We’ve also heard claims that we trained on test sets — that’s simply not true and we would never do that. Our best understanding is that the variable quality people are seeing is due to needing to stabilize implementations.

    We believe the Llama 4 models are a significant advancement and we’re looking forward to working with the community to unlock their value.“

    Yet even that response was met with many complaints of poor performance and calls for further information, such as more technical documentation outlining the Llama 4 models and their training processes, as well as additional questions about why this release compared to all prior Llama releases was particularly riddled with issues.

    It also comes on the heels of the number two at Meta’s VP of Research Joelle Pineau, who worked in the adjacent Meta Foundational Artificial Intelligence Research (FAIR) organization, announcing her departure from the company on LinkedIn last week with “nothing but admiration and deep gratitude for each of my managers.” Pineau, it should be noted also promoted the release of the Llama 4 model family this weekend.

    Llama 4 continues to spread to other inference providers with mixed results, but it’s safe to say the initial release of the model family has not been a slam dunk with the AI community.

    And the upcoming Meta LlamaCon on April 29, the first celebration and gathering for third-party developers of the model family, will likely have much fodder for discussion. We’ll be tracking it all, stay tuned.

    Daily insights on business use cases with VB Daily

    If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

    Read our Privacy Policy

    Thanks for subscribing. Check out more VB newsletters here.

    An error occured.

    Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
    Previous Article$115 million just poured into this startup that makes engineering 1,000x faster — and Bezos, Altman, and Nvidia are all betting on its success
    Next Article Stanford’s AI Index: 5 critical insights reshaping enterprise tech strategy
    TechAiVerse
    • Website

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

    Related Posts

    Newegg’s Ryzen 7 9850X3D combo bundle offers over $500 in savings on three key components, including 64GB DDR5 RAM

    February 25, 2026

    Tenku Pocket 8 micro laptop launched with 8-inch touch display and 8-core Intel Alder Lake-N CPU

    February 25, 2026

    Endorfy Signum M30 Air and M30 ARGB arrive as brand-new micro ATX PC towers

    February 25, 2026
    Leave A Reply Cancel Reply

    Top Posts

    Ping, You’ve Got Whale: AI detection system alerts ships of whales in their path

    April 22, 2025693 Views

    Lumo vs. Duck AI: Which AI is Better for Your Privacy?

    July 31, 2025279 Views

    6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

    April 14, 2025160 Views

    6 Best MagSafe Phone Grips (2025), Tested and Reviewed

    April 6, 2025122 Views
    Don't Miss
    Technology February 25, 2026

    Newegg’s Ryzen 7 9850X3D combo bundle offers over $500 in savings on three key components, including 64GB DDR5 RAM

    Newegg’s Ryzen 7 9850X3D combo bundle offers over $500 in savings on three key components,…

    Tenku Pocket 8 micro laptop launched with 8-inch touch display and 8-core Intel Alder Lake-N CPU

    Endorfy Signum M30 Air and M30 ARGB arrive as brand-new micro ATX PC towers

    Samsung Galaxy S26 series go official, 512GB base storage for Malaysia, from RM5199

    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    About Us
    About Us

    Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    Newegg’s Ryzen 7 9850X3D combo bundle offers over $500 in savings on three key components, including 64GB DDR5 RAM

    February 25, 20262 Views

    Tenku Pocket 8 micro laptop launched with 8-inch touch display and 8-core Intel Alder Lake-N CPU

    February 25, 20262 Views

    Endorfy Signum M30 Air and M30 ARGB arrive as brand-new micro ATX PC towers

    February 25, 20262 Views
    Most Popular

    7 Best Kids Bikes (2025): Mountain, Balance, Pedal, Coaster

    March 13, 20250 Views

    VTOMAN FlashSpeed 1500: Plenty Of Power For All Your Gear

    March 13, 20250 Views

    This new Roomba finally solves the big problem I have with robot vacuums

    March 13, 20250 Views
    © 2026 TechAiVerse. Designed by Divya Tech.
    • Home
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions

    Type above and press Enter to search. Press Esc to cancel.