Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Show HN: An encrypted, local, cross-platform journaling app

    ShannonMax: A Library to Optimize Emacs Keybindings with Information Theory

    Bridging Elixir and Python with Oban

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      Read the extended transcript: President Donald Trump interviewed by ‘NBC Nightly News’ anchor Tom Llamas

      February 6, 2026

      Stocks and bitcoin sink as investors dump software company shares

      February 4, 2026

      AI, crypto and Trump super PACs stash millions to spend on the midterms

      February 2, 2026

      To avoid accusations of AI cheating, college students are turning to AI

      January 29, 2026

      ChatGPT can embrace authoritarian ideas after just one prompt, researchers say

      January 24, 2026
    • Business

      The HDD brand that brought you the 1.8-inch, 2.5-inch, and 3.5-inch hard drives is now back with a $19 pocket-sized personal cloud for your smartphones

      February 12, 2026

      New VoidLink malware framework targets Linux cloud servers

      January 14, 2026

      Nvidia Rubin’s rack-scale encryption signals a turning point for enterprise AI security

      January 13, 2026

      How KPMG is redefining the future of SAP consulting on a global scale

      January 10, 2026

      Top 10 cloud computing stories of 2025

      December 22, 2025
    • Crypto

      Is Bitcoin Price Entering a New Bear Market? Here’s Why Metrics Say Yes

      February 19, 2026

      Cardano’s Trading Activity Crashes to a 6-Month Low — Can ADA Still Attempt a Reversal?

      February 19, 2026

      Is Extreme Fear a Buy Signal? New Data Questions the Conventional Wisdom

      February 19, 2026

      Coinbase and Ledn Strengthen Crypto Lending Push Despite Market Slump

      February 19, 2026

      Bitcoin Caught Between Hawkish Fed and Dovish Warsh

      February 19, 2026
    • Technology

      Show HN: An encrypted, local, cross-platform journaling app

      February 19, 2026

      ShannonMax: A Library to Optimize Emacs Keybindings with Information Theory

      February 19, 2026

      Bridging Elixir and Python with Oban

      February 19, 2026

      Old School Visual Effects: The Cloud Tank (2010)

      February 19, 2026

      Show HN: A Lisp where each function call runs a Docker container

      February 19, 2026
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Check BMI
    Tech AI Verse
    You are at:Home»Technology»It’s Qwen’s summer: new open source Qwen3-235B-A22B-Thinking-2507 tops OpenAI, Gemini reasoning models on key benchmarks
    Technology

    It’s Qwen’s summer: new open source Qwen3-235B-A22B-Thinking-2507 tops OpenAI, Gemini reasoning models on key benchmarks

    TechAiVerseBy TechAiVerseJuly 26, 2025No Comments6 Mins Read9 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    It’s Qwen’s summer: new open source Qwen3-235B-A22B-Thinking-2507 tops OpenAI, Gemini reasoning models on key benchmarks

    Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now


    If the AI industry had an equivalent to the recording industry’s “song of the summer” — a hit that catches on in the warmer months here in the Northern Hemisphere and is heard playing everywhere — the clear honoree for that title would go to Alibaba’s Qwen Team.

    Over just the past week, the frontier model AI research division of the Chinese e-commerce behemoth has released not one, not two, not three, but four (!!) new open source generative AI models that offer record-setting benchmarks, besting even some leading proprietary options.

    Last night, Qwen Team capped it off with the release of Qwen3-235B-A22B-Thinking-2507, it’s updated reasoning large language model (LLM), which takes longer to respond than a non-reasoning or “instruct” LLM, engaging in “chains-of-thought” or self-reflection and self-checking that hopefully result in more correct and comprehensive responses on more difficult tasks.

    Indeed, the new Qwen3-Thinking-2507, as we’ll call it for short, now leads or closely trails top-performing models across several major benchmarks.


    The AI Impact Series Returns to San Francisco – August 5

    The next phase of AI is here – are you ready? Join leaders from Block, GSK, and SAP for an exclusive look at how autonomous agents are reshaping enterprise workflows – from real-time decision-making to end-to-end automation.

    Secure your spot now – space is limited: https://bit.ly/3GuuPLF


    As AI influencer and news aggregator Andrew Curran wrote on X: “Qwen’s strongest reasoning model has arrived, and it is at the frontier.”

    In the AIME25 benchmark—designed to evaluate problem-solving ability in mathematical and logical contexts — Qwen3-Thinking-2507 leads all reported models with a score of 92.3, narrowly surpassing both OpenAI’s o4-mini (92.7) and Gemini-2.5 Pro (88.0).

    The model also shows a commanding performance on LiveCodeBench v6, scoring 74.1, ahead of Google Gemini-2.5 Pro (72.5), OpenAI o4-mini (71.8), and significantly outperforming its earlier version, which posted 55.7.

    In GPQA, a benchmark for graduate-level multiple-choice questions, the model achieves 81.1, nearly matching Deepseek-R1-0528 (81.0) and trailing Gemini-2.5 Pro’s top mark of 86.4.

    On Arena-Hard v2, which evaluates alignment and subjective preference through win rates, Qwen3-Thinking-2507 scores 79.7, placing it ahead of all competitors.

    The results show that this model not only surpasses its predecessor in every major category but also sets a new standard for what open-source, reasoning-focused models can achieve.

    A shift away from ‘hybrid reasoning’

    The release of Qwen3-Thinking-2507 reflects a broader strategic shift by Alibaba’s Qwen team: moving away from hybrid reasoning models that required users to manually toggle between “thinking” and “non-thinking” modes.

    Instead, the team is now training separate models for reasoning and instruction tasks. This separation allows each model to be optimized for its intended purpose—resulting in improved consistency, clarity, and benchmark performance. The new Qwen3-Thinking model fully embodies this design philosophy.

    Alongside it, Qwen launched Qwen3-Coder-480B-A35B-Instruct, a 480B-parameter model built for complex coding workflows. It supports 1 million token context windows and outperforms GPT-4.1 and Gemini 2.5 Pro on SWE-bench Verified.

    Also announced was Qwen3-MT, a multilingual translation model trained on trillions of tokens across 92+ languages. It supports domain adaptation, terminology control, and inference from just $0.50 per million tokens.

    Earlier in the week, the team released Qwen3-235B-A22B-Instruct-2507, a non-reasoning model that surpassed Claude Opus 4 on several benchmarks and introduced a lightweight FP8 variant for more efficient inference on constrained hardware.

    All models are licensed under Apache 2.0 and are available through Hugging Face, ModelScope, and the Qwen API.

    Licensing: Apache 2.0 and its enterprise advantage

    Qwen3-235B-A22B-Thinking-2507 is released under the Apache 2.0 license, a highly permissive and commercially friendly license that allows enterprises to download, modify, self-host, fine-tune, and integrate the model into proprietary systems without restriction.

    This stands in contrast to proprietary models or research-only open releases, which often require API access, impose usage limits, or prohibit commercial deployment. For compliance-conscious organizations and teams looking to control cost, latency, and data privacy, Apache 2.0 licensing enables full flexibility and ownership.

    Availability and pricing

    Qwen3-235B-A22B-Thinking-2507 is available now for free download on Hugging Face and ModelScope.

    For those enterprises who don’t want to or don’t have the resources and capability to host the model inference on their own hardware or virtual private cloud through Alibaba Cloud’s API, vLLM, and SGLang.

    • Input price: $0.70 per million tokens
    • Output price: $8.40 per million tokens
    • Free tier: 1 million tokens, valid for 180 days

    The model is compatible with agentic frameworks via Qwen-Agent, and supports advanced deployment via OpenAI-compatible APIs.

    It can also be run locally using transformer frameworks or integrated into dev stacks through Node.js, CLI tools, or structured prompting interfaces.

    Sampling settings for best performance include temperature=0.6, top_p=0.95, and max output length of 81,920 tokens for complex tasks.

    Enterprise applications and future outlook

    With its strong benchmark performance, long-context capability, and permissive licensing, Qwen3-Thinking-2507 is particularly well suited for use in enterprise AI systems involving reasoning, planning, and decision support.

    The broader Qwen3 ecosystem — including coding, instruction, and translation models—further extends the appeal to technical teams and business units looking to incorporate AI across verticals like engineering, localization, customer support, and research.

    The Qwen team’s decision to release specialized models for distinct use cases, backed by technical transparency and community support, signals a deliberate shift toward building open, performant, and production-ready AI infrastructure.

    As more enterprises seek alternatives to API-gated, black-box models, Alibaba’s Qwen series increasingly positions itself as a viable open-source foundation for intelligent systems—offering both control and capability at scale.

    Daily insights on business use cases with VB Daily

    If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

    Read our Privacy Policy

    Thanks for subscribing. Check out more VB newsletters here.

    An error occured.

    Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
    Previous ArticleAnthropic unveils ‘auditing agents’ to test for AI misalignment
    Next Article CoSyn: The open-source tool that’s making GPT-4V-level vision AI accessible to everyone
    TechAiVerse
    • Website

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

    Related Posts

    Show HN: An encrypted, local, cross-platform journaling app

    February 19, 2026

    ShannonMax: A Library to Optimize Emacs Keybindings with Information Theory

    February 19, 2026

    Bridging Elixir and Python with Oban

    February 19, 2026
    Leave A Reply Cancel Reply

    Top Posts

    Ping, You’ve Got Whale: AI detection system alerts ships of whales in their path

    April 22, 2025684 Views

    Lumo vs. Duck AI: Which AI is Better for Your Privacy?

    July 31, 2025273 Views

    6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

    April 14, 2025156 Views

    6 Best MagSafe Phone Grips (2025), Tested and Reviewed

    April 6, 2025118 Views
    Don't Miss
    Technology February 19, 2026

    Show HN: An encrypted, local, cross-platform journaling app

    Show HN: An encrypted, local, cross-platform journaling appMini Diarium An encrypted, local cross-platform journaling app…

    ShannonMax: A Library to Optimize Emacs Keybindings with Information Theory

    Bridging Elixir and Python with Oban

    Old School Visual Effects: The Cloud Tank (2010)

    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    About Us
    About Us

    Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    Show HN: An encrypted, local, cross-platform journaling app

    February 19, 20260 Views

    ShannonMax: A Library to Optimize Emacs Keybindings with Information Theory

    February 19, 20260 Views

    Bridging Elixir and Python with Oban

    February 19, 20260 Views
    Most Popular

    7 Best Kids Bikes (2025): Mountain, Balance, Pedal, Coaster

    March 13, 20250 Views

    VTOMAN FlashSpeed 1500: Plenty Of Power For All Your Gear

    March 13, 20250 Views

    This new Roomba finally solves the big problem I have with robot vacuums

    March 13, 20250 Views
    © 2026 TechAiVerse. Designed by Divya Tech.
    • Home
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions

    Type above and press Enter to search. Press Esc to cancel.