    SWE-Bench Pro

    By TechAiVerse, September 22, 2025

    Code and data for the following works:

    • SWE-bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?

    👋 Overview

    SWE-Bench Pro is a challenging benchmark evaluating LLMs/Agents on long-horizon software engineering tasks.
    Given a codebase and an issue, a language model is tasked with generating a patch that resolves the described problem.

    The dataset is inspired by SWE-Bench: https://github.com/SWE-bench/SWE-bench

    To access SWE-bench Pro, copy and run the following code:

    from datasets import load_dataset
    swebench = load_dataset('ScaleAI/SWE-bench_Pro', split='test')
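Once the split is loaded, a quick sanity check is to count instances per value of some column. The sketch below is illustrative only: `count_by` and `load_swebench_pro` are hypothetical helpers, and the `repo` column name is an assumption that the text above does not confirm. The example rows are stand-ins, not real benchmark instances.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def count_by(rows: Iterable[Mapping[str, Any]], key: str) -> Counter:
    """Count how many rows share each value of `key` (hypothetical helper)."""
    return Counter(row[key] for row in rows)


def load_swebench_pro():
    """Load the test split as in the snippet above (needs network access)."""
    # Imported lazily so count_by can be used without `datasets` installed.
    from datasets import load_dataset
    return load_dataset("ScaleAI/SWE-bench_Pro", split="test")


# Stand-in rows for illustration only; these are NOT real benchmark instances.
# The "repo" key is an assumed column name.
example_rows = [
    {"repo": "org-a/project-x"},
    {"repo": "org-a/project-x"},
    {"repo": "org-b/project-y"},
]
example_counts = count_by(example_rows, "repo")
```

On the real data this would be called as `count_by(load_swebench_pro(), "repo")`, provided the dataset actually exposes a column with that name.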

    🚀 Set Up

    SWE-bench Pro uses Docker for reproducible evaluations.
    In addition, the evaluation script requires Modal to scale the evaluation set.

    Follow the instructions in the Docker setup guide to install Docker on your machine.
    If you’re setting up on Linux, we recommend following the post-installation steps as well.

    Run the following commands to store modal credentials:

    pip install modal
    modal setup  # and follow the prompts to generate your token and secret
    

    After running these steps, you should see a token ID and secret in ~/.modal.toml.
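To confirm the credentials were written without opening the file by hand, a small check like the following may help. It is a minimal sketch that assumes the file contains plain `token_id = ...` and `token_secret = ...` assignments. Those key names are an assumption, and both functions are hypothetical helpers rather than part of Modal.

```python
from pathlib import Path
from typing import Optional


def has_modal_credentials(config_text: str) -> bool:
    """Return True if the text assigns both token_id and token_secret.

    Assumption: credentials appear as `token_id = ...` and
    `token_secret = ...` lines; the key names are not confirmed here.
    """
    keys = set()
    for line in config_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and value.strip():
            keys.add(name.strip())
    return {"token_id", "token_secret"} <= keys


def check_modal_config(path: Optional[Path] = None) -> bool:
    """Check the config file on disk; a missing file counts as no credentials."""
    if path is None:
        path = Path.home() / ".modal.toml"
    if not path.is_file():
        return False
    return has_modal_credentials(path.read_text(encoding="utf-8"))
```

Calling `check_modal_config()` returns `False` if the file is missing or incomplete, which suggests rerunning the Modal setup step.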

    We store prebuilt Docker images for each instance. They can be found in this directory:

    https://hub.docker.com/repository/docker/jefzda/sweap-images/general

    The format of the images is as follows.

    jefzda/sweap-images:{repo_base}.{repo_name}-{repo_base}__{repo_name}-{hash}

    For example:

    jefzda/sweap-images:gravitational.teleport-gravitational__teleport-82185f232ae8974258397e121b3bc2ed0c3729ed-v626ec2a48416b10a88641359a169d99e935ff03
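The naming pattern above can be reproduced with a small formatter. This is a sketch: `image_name` is a hypothetical helper, and it assumes the `{hash}` placeholder covers everything after `{repo_name}-` in the example, including the trailing `-v...` part.

```python
IMAGE_REPOSITORY = "jefzda/sweap-images"


def image_name(repo_base: str, repo_name: str, hash_suffix: str) -> str:
    """Build an image reference following the documented tag format.

    Assumption: `hash_suffix` is the full trailing part of the tag,
    taken verbatim from the instance (it is not computed here).
    """
    tag = f"{repo_base}.{repo_name}-{repo_base}__{repo_name}-{hash_suffix}"
    return f"{IMAGE_REPOSITORY}:{tag}"


# Rebuild the example image reference shown above from its parts.
example_image = image_name(
    "gravitational",
    "teleport",
    "82185f232ae8974258397e121b3bc2ed0c3729ed-v626ec2a48416b10a88641359a169d99e935ff03",
)
```

The resulting string can be passed to `docker pull` to fetch the prebuilt image for that instance.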

    💽 Usage

    First generate patch predictions using your harness of choice.
    Evaluate patch predictions on SWE-bench Pro with the following command:

    python sweap_pro_eval_modal.py \
        --raw_sample_path=external_hf_v2.csv \
        --patch_path={OUTPUT}/gold_patches.json \
        --output_dir={OUTPUT}/ \
        --scripts_dir=run_scripts \
        --num_workers=100 \
        --dockerhub_username=your-username

    Replace gold_patches.json with your own patch JSON file, and point --raw_sample_path to the SWE-Bench Pro CSV.
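When the evaluation is launched from a script rather than a shell, the same command can be assembled as an argument list. The sketch below only uses the flags shown in the command above. `build_eval_command` is a hypothetical helper, and the example paths (`out/my_patches.json`, `out/`) are placeholders.

```python
import shlex
import sys


def build_eval_command(raw_sample_path, patch_path, output_dir,
                       dockerhub_username, scripts_dir="run_scripts",
                       num_workers=100):
    """Assemble the evaluation command as an argv list (hypothetical helper)."""
    return [
        sys.executable,  # run the script with the current Python interpreter
        "sweap_pro_eval_modal.py",
        f"--raw_sample_path={raw_sample_path}",
        f"--patch_path={patch_path}",
        f"--output_dir={output_dir}",
        f"--scripts_dir={scripts_dir}",
        f"--num_workers={num_workers}",
        f"--dockerhub_username={dockerhub_username}",
    ]


# Placeholder paths and username, for illustration only.
command = build_eval_command(
    raw_sample_path="external_hf_v2.csv",
    patch_path="out/my_patches.json",
    output_dir="out/",
    dockerhub_username="your-username",
)
printable_command = shlex.join(command)  # shell-quoted form, handy for logging
```

Passing `command` to `subprocess.run(command, check=True)` from the repository root would start the evaluation. Using a list avoids shell quoting problems with paths that contain spaces.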
