
    IAB Tech Lab pitches plan to help publishers gain control of LLM scraping


    By Jessica Davies • July 16, 2025

    The IAB Tech Lab is assembling a task force of publishers and edge compute companies to kick off its plan for a technical framework that helps publishers gain better control of, and be paid for, LLM crawling.

    So far, roughly a dozen publishers have joined the task force, which will meet for its first workshop in New York City on July 23 (next Wednesday) to discuss next steps for what the IAB Tech Lab has called its LLM Content Ingest API framework. Edge compute company Cloudflare will also attend and speak at the meeting, and the IAB Tech Lab is working to get edge compute company Fastly on board as well, according to IAB Tech Lab CEO Anthony Katsur.

    It’s early days, so next steps entail writing the specification — essentially the blueprint or technical guide that will help the different stakeholders (publishers, tech vendors, platforms) build toward the same standard. IAB Tech Lab has an internal draft specification that it’s in the early stages of reviewing with publishers, according to Katsur. Over the last six weeks, it has pitched the overview of this specification (see below) to around 40 publishers globally. 

    Katsur hopes to have a framework out in the market in the fall. 

    Naturally, there are some sticky challenges. Getting publishers on board is one thing, but roping in the AI companies to hold up their end is another. Three publishing executives Digiday has spoken to have expressed their concerns that AI companies won’t care to establish compensation or attribution models with this framework.

    Katsur is all too aware of the challenge: for the LLM Content Ingest API to work, it will need every stakeholder on board. “I’m skeptical that they’ll [AI platforms] be willing partners to this,” he said.

    However, he believes that having publishers and edge compute companies unite on the issue will create infrastructure cost efficiencies for LLM crawlers, which may entice them to take part. “We’re definitely going to be aggressive,” he said, referring to how they would pitch the final technical framework to AI companies.

    Here’s a look at the pitch deck the IAB has presented to publishers.

    How LLM Content Ingest API will work

    First, there needs to be a contract between the LLM provider and the publisher to define what content can be accessed. Only then can the publisher set the crawler terms to reflect that agreement.

    Publishers can group their content into tiers: such as basic (daily articles or videos), archival content, and premium content like investigative journalism articles or exclusive interviews.

    Then come the payment options: cost-per-crawl, all-you-can-eat unlimited access, and cost-per-query, which is IAB Tech Lab’s preferred model. “We think cost-per-query scales better than cost-per-crawl,” said Katsur. There is a misconception that bots only crawl once; they do in fact return, he stressed, but crawls are still likely to be fewer than the queries surfaced in answer engines.
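The tiers and payment options described above could be modeled roughly as follows. This is only an illustrative sketch: the IAB Tech Lab specification has not been published, so every name here (`Tier`, `PricingModel`, `TierTerms`, `monthly_charge`) and every rate is a made-up assumption, not part of the framework.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    BASIC = "basic"        # daily articles or videos
    ARCHIVAL = "archival"  # archive content
    PREMIUM = "premium"    # investigative pieces, exclusive interviews


class PricingModel(Enum):
    COST_PER_CRAWL = "cost_per_crawl"
    UNLIMITED = "unlimited"            # "all-you-can-eat" flat fee
    COST_PER_QUERY = "cost_per_query"


@dataclass(frozen=True)
class TierTerms:
    model: PricingModel
    rate_usd: float  # per crawl, per query, or flat monthly fee (hypothetical)


def monthly_charge(terms: TierTerms, crawls: int, queries: int) -> float:
    """Amount owed for one tier in one month under the agreed model."""
    if terms.model is PricingModel.COST_PER_CRAWL:
        return terms.rate_usd * crawls
    if terms.model is PricingModel.COST_PER_QUERY:
        return terms.rate_usd * queries
    return terms.rate_usd  # unlimited: flat fee regardless of volume


# Hypothetical contract: one set of terms per content tier.
contract = {
    Tier.BASIC: TierTerms(PricingModel.UNLIMITED, 500.0),
    Tier.ARCHIVAL: TierTerms(PricingModel.COST_PER_CRAWL, 0.01),
    Tier.PREMIUM: TierTerms(PricingModel.COST_PER_QUERY, 0.002),
}
```

Under these invented rates, a tier priced per query earns in proportion to how often content surfaces in answers, while a tier priced per crawl earns only when the bot returns, which is the scaling difference Katsur describes.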

    There is also a logging and reporting component, which ensures publishers can invoice the LLM provider correctly. “There can be reconciliation every month in terms of: here’s how many times you crawled me, or here’s how many times I showed up in a query,” said Katsur.  
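A monthly reconciliation of the kind Katsur describes might start from a simple event log. The sketch below assumes a hypothetical `LogEvent` record with a month, a provider identifier and an event kind; none of these fields come from the IAB Tech Lab draft.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    month: str     # e.g. "2025-07"
    provider: str  # LLM provider identifier (hypothetical)
    kind: str      # "crawl" or "query"


def reconcile(events, month):
    """Count crawls and query appearances per provider for one month."""
    totals = {}
    for event in events:
        if event.month != month:
            continue
        totals.setdefault(event.provider, Counter())[event.kind] += 1
    return totals


# Invented sample log for illustration only.
events = [
    LogEvent("2025-07", "provider-a", "crawl"),
    LogEvent("2025-07", "provider-a", "query"),
    LogEvent("2025-07", "provider-a", "query"),
    LogEvent("2025-08", "provider-a", "crawl"),  # different month, ignored
]
totals = reconcile(events, "2025-07")
```

The per-provider counts are what an invoice would be built from; in practice both the publisher and the LLM provider would presumably keep their own logs and compare them.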

    Tokenization to authenticate source – important for brands and publishers

    The last step is what IAB Tech Lab refers to as request processing, where it will tokenize the content to ensure the accuracy of the source information, and also show clearly where compensation is needed and to whom. “This is really where cost-per-query becomes feasible – the ability to tokenize content inputs into the LLM, and then every time that shows up in a user query, it’s trackable because you’ve assigned a unique identifier to that particular piece of content if it’s contributed to a query,” added Katsur. “Ostensibly, both the LLM and the publisher should be able to track that.” 
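To make the idea of a trackable identifier concrete, here is a minimal sketch. The article does not say how the IAB Tech Lab would derive tokens, so the SHA-256 scheme, the `content_token` function and the `UsageLedger` class below are assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def content_token(publisher_id: str, url: str, body: str) -> str:
    """Derive a stable identifier for one piece of content (assumed scheme)."""
    digest = hashlib.sha256()
    for part in (publisher_id, url, body):
        digest.update(part.encode("utf-8"))
        digest.update(b"\x00")  # separator keeps field boundaries unambiguous
    return digest.hexdigest()


class UsageLedger:
    """Tracks which registered tokens contributed to answered queries."""

    def __init__(self):
        self.owner = {}        # token -> publisher_id
        self.hits = Counter()  # token -> number of queries it contributed to

    def register(self, publisher_id, url, body):
        token = content_token(publisher_id, url, body)
        self.owner[token] = publisher_id
        return token

    def record_query(self, tokens):
        # Count each known token at most once per query.
        for token in set(tokens):
            if token in self.owner:
                self.hits[token] += 1


ledger = UsageLedger()
token = ledger.register("pub-1", "https://publisher.example/story", "Article text")
ledger.record_query([token, token, "unknown-token"])
```

Because the identifier is derived from the content itself, both sides can recompute it independently, which fits Katsur's point that the LLM and the publisher should each be able to track usage.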

    For Katsur, tokenizing content is especially important because it helps identify the original source within the “contextual stew” of AI-generated answers, which are typically synthesized from multiple publisher sites.

    Brands are also concerned about the likelihood of their products being misrepresented in queries, noted Katsur. CPG and auto manufacturer brands he has spoken to have noticed confusing or error-prone queries related to their products, raising concerns about missed sales opportunities or the loss of existing or new customers. 

    If AI answer engines draw on content from three different publishers to generate a response, then tokenizing the articles could help identify the contributions, making it easy to split the payment between them.
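Splitting a payment between contributing publishers could look something like the sketch below. It assumes each publisher's contribution has already been reduced to an integer weight (for example, the number of tokenized articles used), which is a simplification not taken from the article; `split_payment` is a hypothetical helper.

```python
def split_payment(total_cents: int, weights: dict[str, int]) -> dict[str, int]:
    """Split total_cents in proportion to weights, never losing a cent."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one positive weight is required")

    # Integer share for each publisher, rounded down.
    shares = {p: total_cents * w // total_weight for p, w in weights.items()}
    leftover = total_cents - sum(shares.values())

    # Hand out leftover cents by largest remainder, breaking ties by name.
    by_remainder = sorted(
        weights,
        key=lambda p: (-(total_cents * weights[p] % total_weight), p),
    )
    for publisher in by_remainder[:leftover]:
        shares[publisher] += 1
    return shares


# Three publishers contributing equally to a response worth 100 cents.
equal_split = split_payment(100, {"pub-a": 1, "pub-b": 1, "pub-c": 1})
```

Working in whole cents avoids floating-point rounding, and the leftover step guarantees the shares always add up to the amount paid.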

    Elephant in the room: enforcement 

    While publishers welcome any effort to create a more sustainable AI-driven model in which their content isn’t ripped off, there is a healthy level of skepticism over just how an API like LLM Content Ingest can truly prevent scraping. Their view: it needs to be more robust than robots.txt, which so far has been easy to ignore or to game.

    Katsur stressed that some LLM crawlers use nefarious tactics, simply switching to a different, undisclosed crawler if their original one gets listed in robots.txt. For this proposed standard to work, publishers need to take a hard line on all crawling, he added.
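The weakness is easy to see with Python's standard `urllib.robotparser` module. In this sketch the crawler names `ExampleLLMBot` and `UndisclosedBot` and the `publisher.example` URL are invented; the rules block one named crawler and allow everyone else.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks one declared crawler and allows all others.
ROBOTS_TXT = """\
User-agent: ExampleLLMBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://publisher.example/story"
declared_allowed = parser.can_fetch("ExampleLLMBot", url)
undisclosed_allowed = parser.can_fetch("UndisclosedBot", url)

print(declared_allowed)     # False: the named crawler is blocked
print(undisclosed_allowed)  # True: a renamed crawler passes the same rules
```

The file only matches on the name a crawler chooses to send, and nothing forces a crawler to consult it at all, which is why publishers want enforcement to sit somewhere harder to sidestep.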

    “To enforce this model, you have to have a very strong fence,” said Katsur. “And all it’s going to take is one weak link in the fence, of one publisher saying, ok you can keep crawling.” 

    He said publishers need to form a coalition to take a clear stance: the crawling has to stop. This is where the edge compute platforms come in. “We’re confident Cloudflare and Fastly will be part of the task force with the publishers. They’re the ones in the best position to stop the crawling, and the ones best equipped to detect crawlers that don’t obey robots.txt.”

    There is also some hope that AI companies will have to play ball once the ongoing publisher lawsuits, like those led by the New York Times and Ziff Davis, are decided (should the outcomes favor the publishers). Katsur also believes there are a couple of basic AI laws regulators should make that wouldn’t quash AI innovation: a requirement to declare your crawler, and fines if robots.txt is flouted.

    “The challenge we face is that this is happening so fast. When we talk with publishers we’re hearing traffic declines of 30%-60% [in the US] and that’s unsustainable. And this is only the tip of the iceberg in terms of LLMs and zero-click search… We have to be really aggressive as an industry in tackling it.”

    https://digiday.com/?p=583222
