Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    This stackable Xbox Game Pass Ultimate 1-month code is $25

    Turn leads into deals with a $50 CRM lifetime license

    This week’s free game on Epic Games Store is a sci-fi detective trip

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      Read the extended transcript: President Donald Trump interviewed by ‘NBC Nightly News’ anchor Tom Llamas

      February 6, 2026

      Stocks and bitcoin sink as investors dump software company shares

      February 4, 2026

      AI, crypto and Trump super PACs stash millions to spend on the midterms

      February 2, 2026

      To avoid accusations of AI cheating, college students are turning to AI

      January 29, 2026

      ChatGPT can embrace authoritarian ideas after just one prompt, researchers say

      January 24, 2026
    • Business

      The HDD brand that brought you the 1.8-inch, 2.5-inch, and 3.5-inch hard drives is now back with a $19 pocket-sized personal cloud for your smartphones

      February 12, 2026

      New VoidLink malware framework targets Linux cloud servers

      January 14, 2026

      Nvidia Rubin’s rack-scale encryption signals a turning point for enterprise AI security

      January 13, 2026

      How KPMG is redefining the future of SAP consulting on a global scale

      January 10, 2026

      Top 10 cloud computing stories of 2025

      December 22, 2025
    • Crypto

      Pi Network Tops Daily Charts with a 25% Rally, Here’s Why

      February 15, 2026

      Solana New Holders Drop by 2.3 Million, Will It Impact Price Recovery?

      February 15, 2026

      CLARITY Act’s Stablecoin Yield Restrictions Could Benefit Foreign Currencies, Not USD

      February 15, 2026

      Bitcoin Shorts Reach Most Extreme Level Since 2024 Bottom

      February 15, 2026

      Coinbase Urges Fed to Modernize US Payments to Match European Standards

      February 15, 2026
    • Technology

      This stackable Xbox Game Pass Ultimate 1-month code is $25

      February 15, 2026

      Turn leads into deals with a $50 CRM lifetime license

      February 15, 2026

      This week’s free game on Epic Games Store is a sci-fi detective trip

      February 15, 2026

      Grab 2x 100W Anker USB-C cables for $10

      February 15, 2026

      State-sponsored hackers love Gemini, Google says

      February 15, 2026
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Check BMI
    Tech AI Verse
    You are at:Home»Technology»DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
    Technology

    DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI

    TechAiVerseBy TechAiVerseMarch 24, 2025No Comments10 Mins Read3 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI

    March 24, 2025 12:50 PM

    Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More


    Chinese AI startup DeepSeek has quietly released a new large language model that’s already sending ripples through the artificial intelligence industry — not just for its capabilities, but for how it’s being deployed. The 641-gigabyte model, dubbed DeepSeek-V3-0324, appeared on AI repository Hugging Face today with virtually no announcement, continuing the company’s pattern of low-key but impactful releases.

    What makes this launch particularly notable is the model’s MIT license — making it freely available for commercial use — and early reports that it can run directly on consumer-grade hardware, specifically Apple’s Mac Studio with M3 Ultra chip.

    The new Deep Seek V3 0324 in 4-bit runs at > 20 toks/sec on a 512GB M3 Ultra with mlx-lm! pic.twitter.com/wFVrFCxGS6

    — Awni Hannun (@awnihannun) March 24, 2025

    “The new DeepSeek-V3-0324 in 4-bit runs at > 20 tokens/second on a 512GB M3 Ultra with mlx-lm!” wrote AI researcher Awni Hannun on social media. While the $9,499 Mac Studio might stretch the definition of “consumer hardware,” the ability to run such a massive model locally is a major departure from the data center requirements typically associated with state-of-the-art AI.

    DeepSeek’s stealth launch strategy disrupts AI market expectations

    The 685-billion-parameter model arrived with no accompanying whitepaper, blog post, or marketing push — just an empty README file and the model weights themselves. This approach contrasts sharply with the carefully orchestrated product launches typical of Western AI companies, where months of hype often precede actual releases.

    Early testers report significant improvements over the previous version. AI researcher Xeophon proclaimed in a post on X.com: “Tested the new DeepSeek V3 on my internal bench and it has a huge jump in all metrics on all tests. It is now the best non-reasoning model, dethroning Sonnet 3.5.”

    Tested the new DeepSeek V3 on my internal bench and it has a huge jump in all metrics on all tests.
    It is now the best non-reasoning model, dethroning Sonnet 3.5.

    Congrats @deepseek_ai! pic.twitter.com/efEu2FQSBe

    — Xeophon (@TheXeophon) March 24, 2025

    This claim, if validated by broader testing, would position DeepSeek’s new model above Claude Sonnet 3.5 from Anthropic, one of the most respected commercial AI systems. And unlike Sonnet, which requires a subscription, DeepSeek-V3-0324‘s weights are freely available for anyone to download and use.

    How DeepSeek V3-0324’s breakthrough architecture achieves unmatched efficiency

    DeepSeek-V3-0324 employs a mixture-of-experts (MoE) architecture that fundamentally reimagines how large language models operate. Traditional models activate their entire parameter count for every task, but DeepSeek’s approach activates only about 37 billion of its 685 billion parameters during specific tasks.

    This selective activation represents a paradigm shift in model efficiency. By activating only the most relevant “expert” parameters for each specific task, DeepSeek achieves performance comparable to much larger fully-activated models while drastically reducing computational demands.

    The model incorporates two additional breakthrough technologies: Multi-Head Latent Attention (MLA) and Multi-Token Prediction (MTP). MLA enhances the model’s ability to maintain context across long passages of text, while MTP generates multiple tokens per step instead of the usual one-at-a-time approach. Together, these innovations boost output speed by nearly 80%.

    Simon Willison, a developer tools creator, noted in a blog post that a 4-bit quantized version reduces the storage footprint to 352GB, making it feasible to run on high-end consumer hardware like the Mac Studio with M3 Ultra chip.

    This represents a potentially significant shift in AI deployment. While traditional AI infrastructure typically relies on multiple Nvidia GPUs consuming several kilowatts of power, the Mac Studio draws less than 200 watts during inference. This efficiency gap suggests the AI industry may need to rethink assumptions about infrastructure requirements for top-tier model performance.

    China’s open source AI revolution challenges Silicon Valley’s closed garden model

    DeepSeek’s release strategy exemplifies a fundamental divergence in AI business philosophy between Chinese and Western companies. While U.S. leaders like OpenAI and Anthropic keep their models behind paywalls, Chinese AI companies increasingly embrace permissive open-source licensing.

    This approach is rapidly transforming China’s AI ecosystem. The open availability of cutting-edge models creates a multiplier effect, enabling startups, researchers, and developers to build upon sophisticated AI technology without massive capital expenditure. This has accelerated China’s AI capabilities at a pace that has shocked Western observers.

    The business logic behind this strategy reflects market realities in China. With multiple well-funded competitors, maintaining a proprietary approach becomes increasingly difficult when competitors offer similar capabilities for free. Open-sourcing creates alternative value pathways through ecosystem leadership, API services, and enterprise solutions built atop freely available foundation models.

    Even established Chinese tech giants have recognized this shift. Baidu announced plans to make its Ernie 4.5 model series open-source by June, while Alibaba and Tencent have released open-source AI models with specialized capabilities. This movement stands in stark contrast to the API-centric strategy employed by Western leaders.

    The open-source approach also addresses unique challenges faced by Chinese AI companies. With restrictions on access to cutting-edge Nvidia chips, Chinese firms have emphasized efficiency and optimization to achieve competitive performance with more limited computational resources. This necessity-driven innovation has now become a potential competitive advantage.

    DeepSeek V3-0324: The foundation for an AI reasoning revolution

    The timing and characteristics of DeepSeek-V3-0324 strongly suggest it will serve as the foundation for DeepSeek-R2, an improved reasoning-focused model expected within the next two months. This follows DeepSeek’s established pattern, where its base models precede specialized reasoning models by several weeks.

    “This lines up with how they released V3 around Christmas followed by R1 a few weeks later. R2 is rumored for April so this could be it,” noted Reddit user mxforest.

    The implications of an advanced open-source reasoning model cannot be overstated. Current reasoning models like OpenAI’s o1 and DeepSeek’s R1 represent the cutting edge of AI capabilities, demonstrating unprecedented problem-solving abilities in domains from mathematics to coding. Making this technology freely available would democratize access to AI systems currently limited to those with substantial budgets.

    The potential R2 model arrives amid significant revelations about reasoning models’ computational demands. Nvidia CEO Jensen Huang recently noted that DeepSeek’s R1 model “consumes 100 times more compute than a non-reasoning AI,” contradicting earlier industry assumptions about efficiency. This reveals the remarkable achievement behind DeepSeek’s models, which deliver competitive performance while operating under greater resource constraints than their Western counterparts.

    If DeepSeek-R2 follows the trajectory set by R1, it could present a direct challenge to GPT-5, OpenAI’s next flagship model rumored for release in coming months. The contrast between OpenAI’s closed, heavily-funded approach and DeepSeek’s open, resource-efficient strategy represents two competing visions for AI’s future.

    How to experience DeepSeek V3-0324: A complete guide for developers and users

    For those eager to experiment with DeepSeek-V3-0324, several pathways exist depending on technical needs and resources. The complete model weights are available from Hugging Face, though the 641GB size makes direct download practical only for those with substantial storage and computational resources.

    For most users, cloud-based options offer the most accessible entry point. OpenRouter provides free API access to the model, with a user-friendly chat interface. Simply select DeepSeek V3 0324 as the model to begin experimenting.

    DeepSeek’s own chat interface at chat.deepseek.com has likely been updated to the new version as well, though the company hasn’t explicitly confirmed this. Early users report the model is accessible through this platform with improved performance over previous versions.

    Developers looking to integrate the model into applications can access it through various inference providers. Hyperbolic Labs announced immediate availability as “the first inference provider serving this model on Hugging Face,” while OpenRouter offers API access compatible with the OpenAI SDK.

    DeepSeek-V3-0324 Now Live on Hyperbolic ?

    At Hyperbolic, we’re committed to delivering the latest open-source models as soon as they’re available. This is our promise to the developer community.

    Start inferencing today. pic.twitter.com/495xf6kofa

    — Hyperbolic (@hyperbolic_labs) March 24, 2025

    DeepSeek’s new model prioritizes technical precision over conversational warmth

    Early users have reported a noticeable shift in the model’s communication style. While previous DeepSeek models were praised for their conversational, human-like tone, “V3-0324” presents a more formal, technically-oriented persona.

    “Is it only me or does this version feel less human like?” asked Reddit user nother_level. “For me the thing that set apart deepseek v3 from others were the fact that it felt more like human. Like the tone the words and such it was not robotic sounding like other llm’s but now with this version its like other llms sounding robotic af.”

    Another user, AppearanceHeavy6724, added: “Yeah, it lost its aloof charm for sure, it feels too intellectual for its own good.”

    This personality shift likely reflects deliberate design choices by DeepSeek’s engineers. The move toward a more precise, analytical communication style suggests a strategic repositioning of the model for professional and technical applications rather than casual conversation. This aligns with broader industry trends, as AI developers increasingly recognize that different use cases benefit from different interaction styles.

    For developers building specialized applications, this more precise communication style may actually represent an advantage, providing clearer and more consistent outputs for integration into professional workflows. However, it may limit the model’s appeal for customer-facing applications where warmth and approachability are valued.

    How DeepSeek’s open source strategy is redrawing the global AI landscape

    DeepSeek’s approach to AI development and distribution represents more than a technical achievement — it embodies a fundamentally different vision for how advanced technology should propagate through society. By making cutting-edge AI freely available under permissive licensing, DeepSeek enables exponential innovation that closed models inherently constrain.

    This philosophy is rapidly closing the perceived AI gap between China and the United States. Just months ago, most analysts estimated China lagged 1-2 years behind U.S. AI capabilities. Today, that gap has narrowed dramatically to perhaps 3-6 months, with some areas approaching parity or even Chinese leadership.

    The parallels to Android’s impact on the mobile ecosystem are striking. Google’s decision to make Android freely available created a platform that ultimately achieved dominant global market share. Similarly, open-source AI models may outcompete closed systems through sheer ubiquity and the collective innovation of thousands of contributors.

    The implications extend beyond market competition to fundamental questions about technology access. Western AI leaders increasingly face criticism for concentrating advanced capabilities among well-resourced corporations and individuals. DeepSeek’s approach distributes these capabilities more broadly, potentially accelerating global AI adoption.

    As DeepSeek-V3-0324 finds its way into research labs and developer workstations worldwide, the competition is no longer simply about building the most powerful AI, but about enabling the most people to build with AI. In that race, DeepSeek’s quiet release speaks volumes about the future of artificial intelligence. The company that shares its technology most freely may ultimately wield the greatest influence over how AI reshapes our world.

    Daily insights on business use cases with VB Daily

    If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

    Read our Privacy Policy

    Thanks for subscribing. Check out more VB newsletters here.

    An error occured.

    Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
    Previous ArticleThe issues game developers are facing in 2025 | IGDA interview
    Next Article Journalist Christopher Dring teams with Geoff Keighley to unveil The Game Business publication
    TechAiVerse
    • Website

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

    Related Posts

    This stackable Xbox Game Pass Ultimate 1-month code is $25

    February 15, 2026

    Turn leads into deals with a $50 CRM lifetime license

    February 15, 2026

    This week’s free game on Epic Games Store is a sci-fi detective trip

    February 15, 2026
    Leave A Reply Cancel Reply

    Top Posts

    Ping, You’ve Got Whale: AI detection system alerts ships of whales in their path

    April 22, 2025676 Views

    Lumo vs. Duck AI: Which AI is Better for Your Privacy?

    July 31, 2025260 Views

    6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

    April 14, 2025153 Views

    6 Best MagSafe Phone Grips (2025), Tested and Reviewed

    April 6, 2025112 Views
    Don't Miss
    Technology February 15, 2026

    This stackable Xbox Game Pass Ultimate 1-month code is $25

    This stackable Xbox Game Pass Ultimate 1-month code is $25 Image: StackCommerce TL;DR: A stackable month…

    Turn leads into deals with a $50 CRM lifetime license

    This week’s free game on Epic Games Store is a sci-fi detective trip

    Grab 2x 100W Anker USB-C cables for $10

    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    About Us
    About Us

    Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    This stackable Xbox Game Pass Ultimate 1-month code is $25

    February 15, 20263 Views

    Turn leads into deals with a $50 CRM lifetime license

    February 15, 20262 Views

    This week’s free game on Epic Games Store is a sci-fi detective trip

    February 15, 20263 Views
    Most Popular

    7 Best Kids Bikes (2025): Mountain, Balance, Pedal, Coaster

    March 13, 20250 Views

    VTOMAN FlashSpeed 1500: Plenty Of Power For All Your Gear

    March 13, 20250 Views

    This new Roomba finally solves the big problem I have with robot vacuums

    March 13, 20250 Views
    © 2026 TechAiVerse. Designed by Divya Tech.
    • Home
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions

    Type above and press Enter to search. Press Esc to cancel.