    Technology

    Alibaba’s new open source Qwen3-235B-A22B-2507 beats Kimi-2 and offers low compute version

By TechAiVerse · July 27, 2025 · 9 Mins Read

    Chinese e-commerce giant Alibaba has made waves globally in the tech and business communities with its family of “Qwen” gen AI large language models (LLMs), beginning with the launch of the Tongyi Qianwen chatbot in April 2023, through the release of Qwen 3 in April 2025.

    Not only are its models powerful, scoring high on third-party benchmark tests for math, science, reasoning and writing tasks; for the most part, they've also been released under permissive open-source licensing terms, allowing organizations and enterprises to download, customize, run and generally use them for a variety of purposes, including commercial ones. Think of them as an alternative to DeepSeek.

    This week, Alibaba’s Qwen Team released the latest updates to its Qwen family, and they’re already attracting attention from AI power users in the West for their top performance. In one case, they edged out the new Kimi-2 model from rival Chinese AI startup Moonshot, released in mid-July 2025.




    The new Qwen3-235B-A22B-2507-Instruct model — released on AI code sharing community Hugging Face alongside a “floating point 8” or FP8 version, which we’ll cover more in-depth below — improves on the original Qwen 3 in reasoning tasks, factual accuracy and multilingual understanding. It also outperforms Claude Opus 4’s “non-thinking” version.

    The new Qwen3 model update also delivers better coding results, alignment with user preferences and long-context handling, according to its creators. But that’s not all…

    Read on for what it offers enterprise users and technical decision-makers.

    FP8 version lets enterprises run Qwen 3 with far less memory and compute

    The “FP8” version’s 8-bit floating point compresses the model’s numerical operations to use less memory and processing power — without noticeably affecting its performance.

    In practice, this means organizations can run a model with Qwen3’s capabilities on smaller, less expensive hardware, or more efficiently in the cloud. The result is faster response times, lower energy costs and the ability to scale deployments without needing massive infrastructure.

    This makes the FP8 model especially attractive for production environments with tight latency or cost constraints. Teams can scale Qwen3’s capabilities to single-node GPU instances or local development machines, avoiding the need for massive multi-GPU clusters. It also lowers the barrier to private fine-tuning and on-premises deployments, where infrastructure resources are finite and total cost of ownership matters.
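The intuition behind those savings is simple arithmetic: weight storage scales with bytes per parameter, so halving the precision roughly halves the raw weight footprint. A back-of-the-envelope sketch (the parameter count comes from the model name; real deployments add KV cache, activations and framework overhead on top of the raw weights):

```python
# Rough weight-memory estimate for a 235B-parameter model at two precisions.
# Actual GPU memory use is higher than this: KV cache, activations, and
# framework overhead all add to the raw weight footprint.

GIB = 1024**3

def weight_footprint_gib(n_params: float, bytes_per_param: float) -> float:
    """Return raw weight storage in GiB for a given precision."""
    return n_params * bytes_per_param / GIB

N_PARAMS = 235e9  # total parameters in Qwen3-235B-A22B

bf16 = weight_footprint_gib(N_PARAMS, 2)  # BF16: 2 bytes per parameter
fp8 = weight_footprint_gib(N_PARAMS, 1)   # FP8: 1 byte per parameter

print(f"BF16 weights: ~{bf16:.0f} GiB")  # roughly 438 GiB
print(f"FP8 weights:  ~{fp8:.0f} GiB")   # roughly 219 GiB
```

The numbers line up with the checkpoint sizes cited below (~500 GB for BF16, "well over 200 GB" for FP8) once extra tensors and GB-vs-GiB rounding are accounted for.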

    Even though Qwen’s team didn’t release official calculations, comparisons to similar FP8 quantized deployments suggest the efficiency savings are substantial. Here’s a practical illustration (updated and corrected on 07/23/2025 at 4:04 pm ET — this piece originally included an inaccurate chart based on a miscalculation. I apologize for the errors and thank readers for contacting me about them.):

    • GPU memory use*: BF16 / BF16-equivalent build — ≈ 640 GB total (8 × H100-80 GB, TP-8). FP8 quantized build — ≈ 320 GB total on the recommended 4 × H100-80 GB (TP-4); lowest-footprint community run: ~143 GB across 2 × H100 with Ollama off-loading.
    • Single-query inference speed†: BF16 — ~74 tokens/s (batch = 1, context = 2K, 8 × H20-96 GB, TP-8). FP8 — ~72 tokens/s (same settings, 4 × H20-96 GB, TP-4).
    • Power / energy: BF16 — a full node of eight H100s draws ~4–4.5 kW under load (550–600 W per card, plus host)‡. FP8 — needs half the cards and moves half the data; Nvidia’s Hopper FP8 case studies report ≈ 35–40% lower TCO and energy at comparable throughput.
    • GPUs needed (practical): BF16 — 8 × H100-80 GB (TP-8), or 8 × A100-80 GB for parity. FP8 — 4 × H100-80 GB (TP-4); 2 × H100 is possible with aggressive off-loading, at the cost of latency.

    *Disk footprint for the checkpoints: BF16 weights are ~500 GB; the FP8 checkpoint is “well over 200 GB,” so the absolute memory savings on GPU come mostly from needing fewer cards, not from weights alone.

    †Speed figures are from the Qwen3 official SGLang benchmarks (batch 1). Throughput scales almost linearly with batch size: Baseten measured ~45 tokens/s per user at batch 32 and ~1.4 k tokens/s aggregate on the same four-GPU FP8 setup.

    ‡No vendor supplies exact wall-power numbers for Qwen, so we approximate using H100 board specs and NVIDIA Hopper FP8 energy-saving data.

    No more ‘hybrid reasoning’… instead, Qwen will release separate reasoning and instruct models

    Perhaps most interesting, Qwen announced it will no longer be pursuing a “hybrid” reasoning approach, which it introduced with Qwen 3 in April. It seemed to be inspired by an approach pioneered by sovereign AI collective Nous Research.

    This allowed users to toggle on a “reasoning” mode, letting the AI model engage in its own self-checking and produce chains-of-thought (CoT) before responding.

    In a way, it was designed to mimic the reasoning capabilities of powerful proprietary models such as OpenAI’s “o” series (o1, o3, o4-mini, o4-mini-high), which also produce “chains-of-thought.”

    However, unlike those rival models which always engage in such “reasoning” for every prompt, Qwen 3 can have the reasoning mode manually switched on or off with a “Thinking Mode” button on the Qwen website chatbot. Or, users can type “/think” before their prompt on a local or privately run model inference.

    The idea was to give users control to engage the slower and more token-intensive thinking mode for more difficult prompts and tasks, and use a non-thinking mode for simpler prompts. But again, this put the onus on the user to decide. While flexible, it also introduced design complexity and inconsistent behavior in some cases.
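At the prompt level, the local toggle is just a text convention: prepending "/think" to the user message, as described above. A hypothetical helper sketching that pattern (the message structure is the generic chat schema, not an official Qwen API):

```python
# Sketch of the prompt-level "/think" toggle for locally run Qwen 3 inference.
# The "/think" prefix is the convention mentioned in the text; the dict shape
# is the generic chat-message format, not an official Qwen API.

def build_user_message(prompt: str, thinking: bool = False) -> dict:
    """Wrap a prompt as a chat message, optionally enabling reasoning mode."""
    content = f"/think {prompt}" if thinking else prompt
    return {"role": "user", "content": content}

# Simple prompt: fast, non-thinking mode is enough.
quick = build_user_message("Translate 'hello' to French.")

# Hard prompt: opt in to the slower, token-intensive reasoning mode.
hard = build_user_message("Prove that sqrt(2) is irrational.", thinking=True)

print(quick["content"])
print(hard["content"])
```

This per-request choice is exactly the burden the 2507 update removes by splitting instruct and thinking variants into separate models.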

    As Qwen wrote on X:

    “After talking with the community and thinking it through, we decided to stop using hybrid thinking mode. Instead, we’ll train Instruct and Thinking models separately so we can get the best quality possible.”

    With the 2507 update — an instruct or non-reasoning model, for now — Alibaba is no longer straddling both approaches in a single model. Instead, separate model variants will be trained for instruction and reasoning tasks, respectively.

    The result is a model that adheres more closely to user instructions, generates more predictable responses and, as benchmark data shows, improves significantly across multiple evaluation domains.

    Performance benchmarks and use cases

    Compared to its predecessor, the Qwen3-235B-A22B-Instruct-2507 model delivers measurable improvements:

    • MMLU-Pro scores rise from 75.2 to 83.0, a notable gain in general knowledge performance.
    • GPQA and SuperGPQA benchmarks improve by 15–20 percentage points, reflecting stronger factual accuracy.
    • Reasoning tasks such as AIME25 and ARC-AGI show more than double the previous performance.
    • Code generation improves, with LiveCodeBench scores increasing from 32.9 to 51.8.
    • Multilingual support expands, aided by improved coverage of long-tail languages and better alignment across dialects.

    The model maintains a mixture-of-experts (MoE) architecture, activating 8 out of 128 experts during inference, with a total of 235 billion parameters — 22 billion of which are active at any time.
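The economics of that MoE design fall out of the ratios: only a small fraction of experts, and therefore of parameters, fire on any given token. A quick sketch of the arithmetic, using the figures above:

```python
# MoE activation ratios for Qwen3-235B-A22B, from the figures in the text:
# 8 of 128 experts and 22B of 235B parameters are active per token.

TOTAL_EXPERTS = 128
ACTIVE_EXPERTS = 8
TOTAL_PARAMS = 235e9
ACTIVE_PARAMS = 22e9

expert_fraction = ACTIVE_EXPERTS / TOTAL_EXPERTS  # 6.25% of experts fire
param_fraction = ACTIVE_PARAMS / TOTAL_PARAMS     # ~9.4% of parameters fire

# Per-token compute tracks the active parameters, so the model does roughly
# 22B-parameter-class work per token while storing 235B parameters' worth
# of knowledge.
print(f"Experts active per token:    {expert_fraction:.2%}")
print(f"Parameters active per token: {param_fraction:.2%}")
```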

    As mentioned, the FP8 version introduces fine-grained quantization for better inference speed and reduced memory usage.
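"Fine-grained" here means scales are computed per small block of weights rather than once per tensor, so a single outlier can't blow up the precision of everything around it. A toy illustration of the block-wise idea — using int8 as a stand-in, since Python has no native FP8 type, and this is the general technique, not Qwen's actual kernels:

```python
# Toy block-wise (fine-grained) quantization: one scale per small block of
# weights instead of one scale for the whole tensor. int8 stands in for FP8;
# this illustrates the concept, not Qwen's actual quantization kernels.

def quantize_blockwise(weights: list[float], block: int = 4):
    """Quantize to int8 codes with a separate scale per block."""
    codes, scales = [], []
    for i in range(0, len(weights), block):
        chunk = weights[i:i + block]
        scale = max(abs(w) for w in chunk) / 127 or 1.0  # avoid zero scale
        scales.append(scale)
        codes.extend(round(w / scale) for w in chunk)
    return codes, scales

def dequantize_blockwise(codes: list[int], scales: list[float], block: int = 4):
    """Reconstruct approximate weights from codes and per-block scales."""
    return [c * scales[i // block] for i, c in enumerate(codes)]

# Small weights in the first block, large outliers in the second: per-block
# scales keep the small weights precise despite the outliers next door.
w = [0.01, -0.02, 0.015, 0.03, 5.0, -4.2, 3.3, 0.9]
codes, scales = quantize_blockwise(w)
restored = dequantize_blockwise(codes, scales)
max_err = max(abs(a - b) for a, b in zip(w, restored))
print(f"max reconstruction error: {max_err:.4f}")
```

With one scale per tensor instead, the 5.0 outlier would force a coarse step size onto the 0.01-scale weights and wipe them out; per-block scales are what make aggressive low-bit formats usable.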

    Enterprise-ready by design

    Unlike many open-source LLMs, which are often released under restrictive research-only licenses or require API access for commercial use, Qwen3 is squarely aimed at enterprise deployment.

    Because it carries a permissive Apache 2.0 license, enterprises can use it freely for commercial applications. They may also:

    • Deploy models locally or through OpenAI-compatible APIs using vLLM and SGLang;
    • Fine-tune models privately using LoRA or QLoRA without exposing proprietary data;
    • Log and inspect all prompts and outputs on-premises for compliance and auditing;
    • Scale from prototype to production using dense variants (from 0.6B to 32B) or MoE checkpoints.
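Because vLLM and SGLang expose OpenAI-compatible endpoints, moving an existing client onto a self-hosted Qwen3 is mostly a matter of pointing the base URL at the local server. A sketch of the request payload (the localhost port is an assumption for illustration; the model id is the Hugging Face repo name, and any OpenAI-style client would POST this body to the chat-completions route):

```python
# Sketch of an OpenAI-compatible chat request for a locally served Qwen3.
# The base URL/port is an assumption; vLLM and SGLang both accept this
# request shape at their /v1/chat/completions endpoint.
import json

BASE_URL = "http://localhost:8000/v1"  # assumed local vLLM/SGLang server
MODEL = "Qwen/Qwen3-235B-A22B-Instruct-2507"  # Hugging Face model id

def chat_request(prompt: str, temperature: float = 0.7) -> dict:
    """Build the JSON body an OpenAI-style client would POST."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

body = chat_request("Summarize the Apache 2.0 license in one sentence.")
print(json.dumps(body, indent=2))
# To send: POST {BASE_URL}/chat/completions with this JSON body.
```

Because the wire format matches OpenAI's, swapping a proprietary backend for a private Qwen3 deployment doesn't require rewriting application code — only reconfiguring the client's base URL and model name.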

    Alibaba’s team also introduced Qwen-Agent, a lightweight framework that abstracts tool invocation logic for users building agentic systems.

    Benchmarks like TAU-Retail and BFCL-v3 suggest the instruction model can competently execute multi-step decision tasks — typically the domain of purpose-built agents.

    Community and industry reactions

    The release has already been well received by AI power users.

    Paul Couvert, AI educator and founder of private LLM chatbot host Blue Shell AI, posted on X a comparison chart showing Qwen3-235B-A22B-Instruct-2507 outperforming Claude Opus 4 and Kimi K2 on benchmarks like GPQA, AIME25 and Arena-Hard v2, calling it “even more powerful than Kimi K2… and even better than Claude Opus 4.”

    AI influencer NIK (@ns123abc) commented on its rapid impact: “Qwen-3-235B made Kimi K2 irrelevant after only one week, despite being one quarter the size, and you’re laughing.”

    Meanwhile, Jeff Boudier, head of product at Hugging Face, highlighted the deployment benefits: “Qwen silently released a massive improvement to Qwen3… it tops best open (Kimi K2, a 4x larger model) and closed (Claude Opus 4) LLMs on benchmarks.”

    He praised the availability of an FP8 checkpoint for faster inference, 1-click deployment on Azure ML and support for local use via MLX on Mac or INT4 builds from Intel.

    The overall tone from developers has been enthusiastic, as the model’s balance of performance, licensing and deployability appeals to both hobbyists and professionals.

    What’s next for the Qwen team?

    Alibaba is already laying the groundwork for future updates. A separate reasoning-focused model is in the pipeline, and the Qwen roadmap points toward increasingly agentic systems capable of long-horizon task planning.

    Multimodal support, as seen in Qwen2.5-Omni and Qwen-VL models, is also expected to expand further.

    And already, rumors and rumblings have begun as Qwen team members tease yet another update to their model family, with their web properties revealing URL strings for a new Qwen3-Coder-480B-A35B-Instruct model, likely a 480-billion parameter MoE with a token context of 1 million.

    What Qwen3-235B-A22B-Instruct-2507 ultimately signals is not just another leap in benchmark performance, but a maturation of open models as viable alternatives to proprietary systems.

    The flexibility of deployment, strong general performance and enterprise-friendly licensing give the model a unique edge in a crowded field.

    For teams looking to integrate advanced instruction-following models into their AI stack — without the limitations of vendor lock-in or usage-based fees — Qwen3 is a serious contender.

