Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Windows 10 KB5062554 update breaks emoji panel search feature

    Google Gemini flaw hijacks email summaries for phishing

    Foldables are in and suddenly really thin

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      AI chatbot Grok issues apology for antisemitic posts

      July 13, 2025

      Apple sued by shareholders for allegedly overstating AI progress

      June 22, 2025

      How far will AI go to defend its own survival?

      June 2, 2025

      The internet thinks this video from Gaza is AI. Here’s how we proved it isn’t.

      May 30, 2025

      Nvidia CEO hails Trump’s plan to rescind some export curbs on AI chips to China

      May 22, 2025
    • Business

      Cloudflare open-sources Orange Meets with End-to-End encryption

      June 29, 2025

      Google links massive cloud outage to API management issue

      June 13, 2025

      The EU challenges Google and Cloudflare with its very own DNS resolver that can filter dangerous traffic

      June 11, 2025

      These two Ivanti bugs are allowing hackers to target cloud instances

      May 21, 2025

      How cloud and AI transform and improve customer experiences

      May 10, 2025
    • Crypto

      3 Made in USA Coins to Watch in The Third Week of July

      July 12, 2025

      Bybit Receives Backlash Over PUMP Token Sale Mismanagement

      July 12, 2025

      3 Pump.Fun Ecosystem Coins to Watch Amid PUMP Token Launch

      July 12, 2025

      Coinbase CEO Calls the Bomb Squad for a Surprising Gift

      July 12, 2025

      Pump.Fun Token Sold Out In 12 minutes as Whales Flood Solana Launchpad

      July 12, 2025
    • Technology

      Windows 10 KB5062554 update breaks emoji panel search feature

      July 14, 2025

      Google Gemini flaw hijacks email summaries for phishing

      July 14, 2025

      Foldables are in and suddenly really thin

      July 14, 2025

      Why GM’s CEO is still betting on electric vehicles (and racing)

      July 14, 2025

      xAI explains the Grok Nazi meltdown, as Tesla puts Elon’s bot in its cars

      July 14, 2025
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Shop Now
    Tech AI Verse
    You are at:Home»Technology»Alibaba launches open source Qwen3 model that surpasses OpenAI o1 and DeepSeek R1
    Technology

    Alibaba launches open source Qwen3 model that surpasses OpenAI o1 and DeepSeek R1

    TechAiVerseBy TechAiVerseApril 29, 2025No Comments6 Mins Read2 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    Alibaba launches open source Qwen3 model that surpasses OpenAI o1 and DeepSeek R1
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    Alibaba launches open source Qwen3 model that surpasses OpenAI o1 and DeepSeek R1

    April 28, 2025 4:56 PM

    Credit: VentureBeat made with Qwen Chat

    Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More


    Chinese e-commerce and web giant Alibaba’s Qwen team has officially launched a new series of open source AI large language multimodal models known as Qwen3 that appear to be among the state-of-the-art for open models, and approach performance of proprietary models from the likes of OpenAI and Google.

    The Qwen3 series features two “mixture-of-experts” models and six dense models for a total of eight (!) new models. The “mixture-of-experts” approach involves having several different specialty model types combined into one, with only those relevant models to the task at hand being activated when needed in the internal settings of the model (known as parameters). It was popularized by open source French AI startup Mistral.

    According to the team, the 235-billion parameter version of Qwen3 codenamed A22B outperforms DeepSeek’s open source R1 and OpenAI’s proprietary o1 on key third-party benchmarks including ArenaHard (with 500 user questions in software engineering and math) and nears the performance of the new, proprietary Google Gemini 2.5-Pro.

    Overall, the benchmark data positions Qwen3-235B-A22B as one of the most powerful publicly available models, achieving parity or superiority relative to major industry offerings.

    Hybrid (reasoning) theory

    The Qwen3 models are trained to provide so-called “hybrid reasoning” or “dynamic reasoning” capabilities, allowing users to toggle between fast, accurate responses and more time-consuming and compute-intensive reasoning steps (similar to OpenAI’s “o” series) for more difficult queries in science, math, engineering and other specialized fields. This is an approach pioneered by Nous Research and other AI startups and research collectives.

    With Qwen3, users can engage the more intensive “Thinking Mode” using the button marked as such on the Qwen Chat website or by embedding specific prompts like /think or /no_think when deploying the model locally or through the API, allowing for flexible use depending on the task complexity.

    Users can now access and deploy these models across platforms like Hugging Face, ModelScope, Kaggle, and GitHub, as well as interact with them directly via the Qwen Chat web interface and mobile applications. The release includes both Mixture of Experts (MoE) and dense models, all available under the Apache 2.0 open-source license.

    In my brief usage of the Qwen Chat website so far, it was able to generate imagery relatively rapidly and with decent prompt adherence — especially when incorporating text into the image natively while matching the style. However, it often prompted me to log in and was subject to the usual Chinese content restrictions (such as prohibiting prompts or responses related to the Tiananmen Square protests).

    In addition to the MoE offerings, Qwen3 includes dense models at different scales: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B.

    These models vary in size and architecture, offering users options to fit diverse needs and computational budgets.

    The Qwen3 models also significantly expand multilingual support, now covering 119 languages and dialects across major language families. This broadens the models’ potential applications globally, facilitating research and deployment in a wide range of linguistic contexts.

    Model training and architecture

    In terms of model training, Qwen3 represents a substantial step up from its predecessor, Qwen2.5. The pretraining dataset doubled in size to approximately 36 trillion tokens.

    The data sources include web crawls, PDF-like document extractions, and synthetic content generated using previous Qwen models focused on math and coding.

    The training pipeline consisted of a three-stage pretraining process followed by a four-stage post-training refinement to enable the hybrid thinking and non-thinking capabilities. The training improvements allow the dense base models of Qwen3 to match or exceed the performance of much larger Qwen2.5 models.

    Deployment options are versatile. Users can integrate Qwen3 models using frameworks such as SGLang and vLLM, both of which offer OpenAI-compatible endpoints.

    For local usage, options like Ollama, LMStudio, MLX, llama.cpp, and KTransformers are recommended. Additionally, users interested in the models’ agentic capabilities are encouraged to explore the Qwen-Agent toolkit, which simplifies tool-calling operations.

    Junyang Lin, a member of the Qwen team, commented on X that building Qwen3 involved addressing critical but less glamorous technical challenges such as scaling reinforcement learning stably, balancing multi-domain data, and expanding multilingual performance without quality sacrifice.

    Lin also indicated that the team is transitioning focus toward training agents capable of long-horizon reasoning for real-world tasks.

    What it means for enterprise decision-makers

    Engineering teams can point existing OpenAI-compatible endpoints to the new model in hours instead of weeks. The MoE checkpoints (235 B parameters with 22 B active, and 30 B with 3 B active) deliver GPT-4-class reasoning at roughly the GPU memory cost of a 20–30 B dense model.

    Official LoRA and QLoRA hooks allow private fine-tuning without sending proprietary data to a third-party vendor.

    Dense variants from 0.6 B to 32 B make it easy to prototype on laptops and scale to multi-GPU clusters without rewriting prompts.

    Running the weights on-premises means all prompts and outputs can be logged and inspected. MoE sparsity reduces the number of active parameters per call, cutting the inference attack surface.

    The Apache-2.0 license removes usage-based legal hurdles, though organizations should still review export-control and governance implications of using a model trained by a China-based vendor.

    Yet at the same time, it also offers a viable alternative to other Chinese players including DeepSeek, Tencent, and ByteDance — as well as the myriad and growing number of North American models such as the aforementioned OpenAI, Google, Microsoft, Anthropic, Amazon, Meta and others. The permissive Apache 2.0 license — which allows for unlimited commercial usage — is also a big advantage over other open source players like Meta, whose licenses are more restrictive.

    It indicates furthermore that the race between AI providers to offer ever-more powerful and accessible models continues to remain highly competitive, and savvy organizations looking to cut costs should attempt to remain flexible and open to evaluating said new models for their AI agents and workflows.

    Looking ahead

    The Qwen team positions Qwen3 not just as an incremental improvement but as a significant step toward future goals in Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI), AI significantly smarter than humans.

    Plans for Qwen’s next phase include scaling data and model size further, extending context lengths, broadening modality support, and enhancing reinforcement learning with environmental feedback mechanisms.

    As the landscape of large-scale AI research continues to evolve, Qwen3’s open-weight release under an accessible license marks another important milestone, lowering barriers for researchers, developers, and organizations aiming to innovate with state-of-the-art LLMs.

    Daily insights on business use cases with VB Daily

    If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

    Read our Privacy Policy

    Thanks for subscribing. Check out more VB newsletters here.

    An error occured.

    Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
    Previous ArticleEx-OpenAI CEO and power users sound alarm over AI sycophancy and flattery of users
    Next Article Iwot Studios launches game team for Wheel of Time RPG
    TechAiVerse
    • Website

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

    Related Posts

    Windows 10 KB5062554 update breaks emoji panel search feature

    July 14, 2025

    Google Gemini flaw hijacks email summaries for phishing

    July 14, 2025

    Foldables are in and suddenly really thin

    July 14, 2025
    Leave A Reply Cancel Reply

    Top Posts

    New Akira ransomware decryptor cracks encryptions keys using GPUs

    March 16, 202528 Views

    OpenAI details ChatGPT-o3, o4-mini, o4-mini-high usage limits

    April 19, 202522 Views

    Rsync replaced with openrsync on macOS Sequoia

    April 7, 202520 Views

    6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

    April 14, 202519 Views
    Don't Miss
    Technology July 14, 2025

    Windows 10 KB5062554 update breaks emoji panel search feature

    Windows 10 KB5062554 update breaks emoji panel search feature The search feature for the Windows…

    Google Gemini flaw hijacks email summaries for phishing

    Foldables are in and suddenly really thin

    Why GM’s CEO is still betting on electric vehicles (and racing)

    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    About Us
    About Us

    Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    Windows 10 KB5062554 update breaks emoji panel search feature

    July 14, 20252 Views

    Google Gemini flaw hijacks email summaries for phishing

    July 14, 20251 Views

    Foldables are in and suddenly really thin

    July 14, 20250 Views
    Most Popular

    Ethereum must hold $2,000 support or risk dropping to $1,850 – Here’s why

    March 12, 20250 Views

    Xiaomi 15 Ultra Officially Launched in China, Malaysia launch to follow after global event

    March 12, 20250 Views

    Apple thinks people won’t use MagSafe on iPhone 16e

    March 12, 20250 Views
    © 2025 TechAiVerse. Designed by Divya Tech.
    • Home
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions

    Type above and press Enter to search. Press Esc to cancel.