Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Sony Malaysia univels WF-1000XM6 for RM1599, RM350 off when you preorder it

    iOS 26.3 now lets you migrate to Android easier

    Samsung Galaxy S26 series to launch on 26 February 2026

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      Read the extended transcript: President Donald Trump interviewed by ‘NBC Nightly News’ anchor Tom Llamas

      February 6, 2026

      Stocks and bitcoin sink as investors dump software company shares

      February 4, 2026

      AI, crypto and Trump super PACs stash millions to spend on the midterms

      February 2, 2026

      To avoid accusations of AI cheating, college students are turning to AI

      January 29, 2026

      ChatGPT can embrace authoritarian ideas after just one prompt, researchers say

      January 24, 2026
    • Business

      The HDD brand that brought you the 1.8-inch, 2.5-inch, and 3.5-inch hard drives is now back with a $19 pocket-sized personal cloud for your smartphones

      February 12, 2026

      New VoidLink malware framework targets Linux cloud servers

      January 14, 2026

      Nvidia Rubin’s rack-scale encryption signals a turning point for enterprise AI security

      January 13, 2026

      How KPMG is redefining the future of SAP consulting on a global scale

      January 10, 2026

      Top 10 cloud computing stories of 2025

      December 22, 2025
    • Crypto

      How Polymarket Is Turning Bitcoin Volatility Into a Five-Minute Betting Market

      February 13, 2026

      Israel Indicts Two Over Secret Bets on Military Operations via Polymarket

      February 13, 2026

      Binance’s October 10 Defense at Consensus Hong Kong Falls Flat

      February 13, 2026

      Argentina Congress Strips Workers’ Right to Choose Digital Wallet Deposits

      February 13, 2026

      Monero Price Breakdown Begins? Dip Buyers Now Fight XMR’s Drop to $135

      February 13, 2026
    • Technology

      This MacBook Pro has a Touch Bar and is only $410 while stock lasts

      February 13, 2026

      Intel’s tough decision boosted AMD to record highs

      February 13, 2026

      Bundle deal! Ring Battery Doorbell and Outdoor Cam Plus (44% off)

      February 13, 2026

      Microsoft Store goes zero-clutter—through the command line

      February 13, 2026

      How Boll & Branch leverages AI for operational and creative tasks

      February 13, 2026
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Check BMI
    Tech AI Verse
    You are at:Home»Technology»Absolute Zero: Reinforced Self-Play Reasoning with Zero Data
    Technology

    Absolute Zero: Reinforced Self-Play Reasoning with Zero Data

    TechAiVerseBy TechAiVerseMay 11, 2025No Comments2 Mins Read4 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    Absolute Zero: Reinforced Self-Play Reasoning with Zero Data
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    Absolute Zero: Reinforced Self-Play Reasoning with Zero Data

    [Submitted on 6 May 2025 (v1), last revised 7 May 2025 (this version, v2)]

    Authors:Andrew Zhao, Yiran Wu, Yang Yue, Tong Wu, Quentin Xu, Yang Yue, Matthieu Lin, Shenzhi Wang, Qingyun Wu, Zilong Zheng, Gao Huang

    View PDF
    HTML (experimental)

    Abstract:Reinforcement learning with verifiable rewards (RLVR) has shown promise in enhancing the reasoning capabilities of large language models by learning directly from outcome-based rewards. Recent RLVR works that operate under the zero setting avoid supervision in labeling the reasoning process, but still depend on manually curated collections of questions and answers for training. The scarcity of high-quality, human-produced examples raises concerns about the long-term scalability of relying on human supervision, a challenge already evident in the domain of language model pretraining. Furthermore, in a hypothetical future where AI surpasses human intelligence, tasks provided by humans may offer limited learning potential for a superintelligent system. To address these concerns, we propose a new RLVR paradigm called Absolute Zero, in which a single model learns to propose tasks that maximize its own learning progress and improves reasoning by solving them, without relying on any external data. Under this paradigm, we introduce the Absolute Zero Reasoner (AZR), a system that self-evolves its training curriculum and reasoning ability by using a code executor to both validate proposed code reasoning tasks and verify answers, serving as an unified source of verifiable reward to guide open-ended yet grounded learning. Despite being trained entirely without external data, AZR achieves overall SOTA performance on coding and mathematical reasoning tasks, outperforming existing zero-setting models that rely on tens of thousands of in-domain human-curated examples. Furthermore, we demonstrate that AZR can be effectively applied across different model scales and is compatible with various model classes.

    Submission history

    From: Andrew Zhao [view email]
    [v1]
    Tue, 6 May 2025 09:08:00 UTC (3,686 KB)
    [v2]
    Wed, 7 May 2025 13:01:17 UTC (4,145 KB)

    Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
    Previous ArticleWhat is it like to be a thermostat? (1996)
    Next Article The Evolving Revolution: AI in 2025
    TechAiVerse
    • Website

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

    Related Posts

    This MacBook Pro has a Touch Bar and is only $410 while stock lasts

    February 13, 2026

    Intel’s tough decision boosted AMD to record highs

    February 13, 2026

    Bundle deal! Ring Battery Doorbell and Outdoor Cam Plus (44% off)

    February 13, 2026
    Leave A Reply Cancel Reply

    Top Posts

    Ping, You’ve Got Whale: AI detection system alerts ships of whales in their path

    April 22, 2025669 Views

    Lumo vs. Duck AI: Which AI is Better for Your Privacy?

    July 31, 2025258 Views

    6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

    April 14, 2025153 Views

    6 Best MagSafe Phone Grips (2025), Tested and Reviewed

    April 6, 2025111 Views
    Don't Miss
    Gadgets February 13, 2026

    Sony Malaysia univels WF-1000XM6 for RM1599, RM350 off when you preorder it

    Sony Malaysia univels WF-1000XM6 for RM1599, RM350 off when you preorder it Sony Malaysia has…

    iOS 26.3 now lets you migrate to Android easier

    Samsung Galaxy S26 series to launch on 26 February 2026

    This MacBook Pro has a Touch Bar and is only $410 while stock lasts

    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    About Us
    About Us

    Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    Sony Malaysia univels WF-1000XM6 for RM1599, RM350 off when you preorder it

    February 13, 20260 Views

    iOS 26.3 now lets you migrate to Android easier

    February 13, 20260 Views

    Samsung Galaxy S26 series to launch on 26 February 2026

    February 13, 20260 Views
    Most Popular

    7 Best Kids Bikes (2025): Mountain, Balance, Pedal, Coaster

    March 13, 20250 Views

    VTOMAN FlashSpeed 1500: Plenty Of Power For All Your Gear

    March 13, 20250 Views

    This new Roomba finally solves the big problem I have with robot vacuums

    March 13, 20250 Views
    © 2026 TechAiVerse. Designed by Divya Tech.
    • Home
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions

    Type above and press Enter to search. Press Esc to cancel.