    IAB Tech Lab pitches plan to help publishers gain control of LLM scraping

    By Jessica Davies  •  July 16, 2025

    The IAB Tech Lab is working to assemble a task force of publishers and edge compute companies to kick off its plan to create a technical framework that helps publishers gain better control of, and be paid for, LLM crawling.

    So far, it has roughly a dozen publishers on board for the task force, who will meet for the first workshop in New York City on July 23 (next Wednesday), to discuss next steps for what it has called its LLM Content Ingest API framework. Edge compute company Cloudflare will also attend and speak at the meeting, and the IAB Tech Lab is working to get edge compute company Fastly on board as well, according to CEO Anthony Katsur.

    It’s early days, so next steps entail writing the specification — essentially the blueprint or technical guide that will help the different stakeholders (publishers, tech vendors, platforms) build toward the same standard. IAB Tech Lab has an internal draft specification that it’s in the early stages of reviewing with publishers, according to Katsur. Over the last six weeks, it has pitched the overview of this specification (see below) to around 40 publishers globally. 

    Katsur hopes to have a framework out in the market in the fall. 

    Naturally, there are some sticky challenges. Getting publishers on board is one thing, but roping in the AI companies to hold up their end is another. Three publishing executives Digiday has spoken to have expressed their concerns that AI companies won’t care to establish compensation or attribution models with this framework.

    Katsur is all too aware of the challenges: for the LLM Content Ingest API to work, it will need all stakeholders on board. “I’m skeptical that they’ll [AI platforms] be willing partners to this,” he said.

    However, he believes that having publishers and edge compute companies unite on the issue will create infrastructure cost efficiencies for LLM crawlers, which may entice them to take part. “We’re definitely going to be aggressive,” he said, referring to how the group would pitch the final technical framework to AI companies.

    Here’s a look at the pitch deck the IAB has presented to publishers.

    How LLM Content Ingest API will work

    First, there needs to be a contract between the LLM provider and the publisher to define what content can be accessed. Only then can the publisher set the crawler terms to reflect that agreement.

    Publishers can group their content into tiers: such as basic (daily articles or videos), archival content, and premium content like investigative journalism articles or exclusive interviews.

    Then come the payment options: cost-per-crawl, all-you-can-eat unlimited access, and cost-per-query, which is IAB Tech Lab’s preferred model. “We think cost-per-query scales better than cost-per-crawl,” said Katsur. There is a misconception that bots only crawl once; they do in fact return, he stressed, but there are still likely to be fewer crawls than queries surfaced in answer engines.
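    To make the three payment options concrete, here is a minimal Python sketch. It is an illustration only: the tier names follow the article, but every rate, volume and function name is a hypothetical assumption, not a figure or interface from IAB Tech Lab's draft specification.

```python
from dataclasses import dataclass

# Hypothetical per-tier rates in US dollars; not taken from any published spec.
CRAWL_RATES = {"basic": 0.002, "archival": 0.005, "premium": 0.02}
QUERY_RATES = {"basic": 0.0005, "archival": 0.001, "premium": 0.004}


@dataclass(frozen=True)
class MonthlyUsage:
    tier: str     # content tier the publisher assigned
    crawls: int   # fetches by the LLM provider's crawler during the month
    queries: int  # answers in which the content appeared during the month


def cost_per_crawl(usage: MonthlyUsage) -> float:
    """Charge for every fetch, regardless of how often the content is used."""
    return usage.crawls * CRAWL_RATES[usage.tier]


def cost_per_query(usage: MonthlyUsage) -> float:
    """Charge each time the content contributes to an answer."""
    return usage.queries * QUERY_RATES[usage.tier]


def flat_fee(usage: MonthlyUsage, monthly_fee: float) -> float:
    """All-you-can-eat access: the same fee whatever the volume."""
    return monthly_fee


if __name__ == "__main__":
    # Assumed volumes: far more query appearances than crawls.
    usage = MonthlyUsage(tier="premium", crawls=300, queries=40_000)
    print(f"cost-per-crawl: ${cost_per_crawl(usage):,.2f}")
    print(f"cost-per-query: ${cost_per_query(usage):,.2f}")
    print(f"flat fee:       ${flat_fee(usage, 100.0):,.2f}")
```

    With these made-up numbers, revenue under cost-per-query grows with how often the content is used in answers rather than how often it is fetched, which is the scaling argument Katsur makes.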

    There is also a logging and reporting component, which ensures publishers can invoice the LLM provider correctly. “There can be reconciliation every month in terms of: here’s how many times you crawled me, or here’s how many times I showed up in a query,” said Katsur.  
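    The monthly reconciliation Katsur describes could be sketched roughly as follows. The log format, the function names and the idea of comparing a publisher-side count against a provider-side count are assumptions made for illustration; the article does not say how the framework will record or exchange these logs.

```python
from collections import Counter

# Hypothetical access log: one (event_type, content_tier) pair per record.
LOG = [
    ("crawl", "basic"),
    ("crawl", "premium"),
    ("query", "premium"),
    ("query", "premium"),
    ("query", "basic"),
]


def monthly_summary(log):
    """Count crawls and query appearances per content tier."""
    return Counter(log)


def format_statement(summary):
    """Render the counts as plain-text lines both parties can compare."""
    return [
        f"{event:<5} {tier:<8} {count}"
        for (event, tier), count in sorted(summary.items())
    ]


def discrepancies(publisher_summary, provider_summary):
    """Return the entries where the two sides' counts differ."""
    keys = set(publisher_summary) | set(provider_summary)
    return {
        key: (publisher_summary[key], provider_summary[key])
        for key in keys
        if publisher_summary[key] != provider_summary[key]
    }


if __name__ == "__main__":
    for line in format_statement(monthly_summary(LOG)):
        print(line)
```

    Any entries returned by `discrepancies` would be the line items to resolve before an invoice is issued.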

    Tokenization to authenticate source – important for brands and publishers

    The last step is what IAB Tech Lab refers to as request processing, where it will tokenize the content to ensure the accuracy of the source information, and also show clearly where compensation is needed and to whom. “This is really where cost-per-query becomes feasible – the ability to tokenize content inputs into the LLM, and then every time that shows up in a user query, it’s trackable because you’ve assigned a unique identifier to that particular piece of content if it’s contributed to a query,” added Katsur. “Ostensibly, both the LLM and the publisher should be able to track that.” 

    For Katsur, tokenizing content is especially important because it helps identify the original source within the “contextual stew” of AI-generated answers, which are typically synthesized from multiple publisher sites.

    Brands are also concerned about the likelihood of their products being misrepresented in queries, noted Katsur. CPG and auto manufacturer brands he has spoken to have noticed confusing or error-prone queries related to their products, raising concerns about missed sales opportunities or the loss of existing or new customers. 

    If AI answer engines draw on content from three different publishers to generate a response, then tokenizing the articles could help identify the contributions, making it easy to split the payment between them.
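    As a rough illustration of how content identifiers could make such a split possible, the Python sketch below assigns each article a token and divides a per-query fee across the publishers whose tokens contributed. IAB Tech Lab has not published how tokens will be generated or how fees will be shared, so the SHA-256 based `content_token`, the `registry` mapping and the proportional rule in `split_payment` are all hypothetical stand-ins.

```python
import hashlib
from collections import Counter
from fractions import Fraction


def content_token(publisher: str, url: str) -> str:
    """Derive a stable identifier for one piece of content (assumed scheme)."""
    digest = hashlib.sha256(f"{publisher}|{url}".encode("utf-8"))
    return digest.hexdigest()[:16]


def split_payment(fee_cents, tokens_used, registry):
    """Split a per-query fee across publishers in proportion to tokens used.

    registry maps each token to the publisher that owns it.
    Fractions keep the shares exact, so they always sum to the fee.
    """
    per_publisher = Counter(registry[token] for token in tokens_used)
    total = sum(per_publisher.values())
    if total == 0:
        return {}
    return {
        publisher: Fraction(fee_cents * count, total)
        for publisher, count in per_publisher.items()
    }


# Hypothetical example: one answer draws on four articles from three publishers.
a1 = content_token("Publisher A", "https://a.example/story-1")
a2 = content_token("Publisher A", "https://a.example/story-2")
b1 = content_token("Publisher B", "https://b.example/story-1")
c1 = content_token("Publisher C", "https://c.example/story-1")
registry = {a1: "Publisher A", a2: "Publisher A", b1: "Publisher B", c1: "Publisher C"}

shares = split_payment(12, [a1, a2, b1, c1], registry)
for publisher, cents in sorted(shares.items()):
    print(publisher, cents)
```

    A proportional split is only one possible rule; an equal split per publisher or a weighting by content tier would fit the same structure.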

    Elephant in the room: enforcement 

    While publishers welcome any efforts to assist with creating a more sustainable AI-driven model for publishers, where their content isn’t ripped off, there is a healthy level of skepticism over just how an API like LLM Content Ingest can truly prevent scraping. Their view: it needs to be more robust than robots.txt, which so far has been easy to ignore or to game.

    Katsur stressed that there are some nefarious tactics being used by some LLM crawlers, who will simply use a different, undisclosed crawler if their original one gets listed in robots.txt. For this proposed standard to work, publishers need to take a hard line on all crawling, he added.

    “To enforce this model, you have to have a very strong fence,” said Katsur. “And all it’s going to take is one weak link in the fence, of one publisher saying, ok you can keep crawling.” 

    He said publishers need to form a coalition to take a clear stance: the crawling has to stop. This is where the edge compute platforms come in. “We’re confident Cloudflare and Fastly will be part of the task force with the publishers. They’re the ones in the best position to stop the crawling, and the ones best equipped to detect crawlers that don’t obey robots.txt.”
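    The weakness of robots.txt described above can be seen with Python's standard-library parser. In this minimal sketch the crawler names and the URL are made up; the point is only that a rule written for one declared user agent says nothing about the same crawler under a different name, and that the file is consulted only if the crawler chooses to check it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one named crawler and nothing else.
ROBOTS_TXT = """\
User-agent: ExampleLLMBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://publisher.example/articles/investigation"

# The declared crawler is told to stay out...
declared_allowed = parser.can_fetch("ExampleLLMBot", url)
# ...but the same operator under an undisclosed name matches no rule.
renamed_allowed = parser.can_fetch("UndisclosedBot", url)

print(declared_allowed)  # False
print(renamed_allowed)   # True
```

    Nothing in the file itself stops a request from being made, which is why Katsur points to edge compute platforms as the place where blocking would actually happen.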

    There is also some hope that the AI companies will need to play ball once the outcomes of the ongoing publisher lawsuits, like those led by the New York Times and Ziff Davis, are confirmed (should they favor the publishers). Katsur also believes there are a couple of basic AI laws regulators should make that wouldn’t quash AI innovation: a requirement to declare your crawler, and fines if robots.txt is flouted.

    “The challenge we face is that this is happening so fast. When we talk with publishers we’re hearing traffic declines of 30%-60% [in the US] and that’s unsustainable. And this is only the tip of the iceberg in terms of LLMs and zero-click search… We have to be really aggressive as an industry in tackling it.”

    https://digiday.com/?p=583222
