    Nvidia Bets Big on Synthetic Data

    By TechAiVerse, March 20, 2025

    Nvidia has acquired synthetic data firm Gretel for nine figures, according to two people with direct knowledge of the deal.

    The acquisition price exceeds Gretel’s most recent valuation of $320 million, the sources say, though the exact terms of the purchase remain unknown. Gretel and its team of approximately 80 employees will be folded into Nvidia, where its technology will be deployed as part of the chip giant’s growing suite of cloud-based, generative AI services for developers.

    The acquisition comes as Nvidia has been rolling out synthetic data generation tools, so that developers can train their own AI models and fine-tune them for specific apps. In theory, synthetic data could create a near-infinite supply of AI training data and help solve the data scarcity problem that has been looming over the AI industry since ChatGPT went mainstream in 2022—although experts say using synthetic data in generative AI comes with its own risks.

    A spokesperson for Nvidia declined to comment.

    Gretel was founded in 2019 by Alex Watson, John Myers, and Ali Golshan, who also serves as CEO. The startup offers a synthetic data platform and a suite of APIs to developers who want to build generative AI models but either lack enough training data or have privacy concerns about using real people’s data. Gretel doesn’t build and license its own frontier AI models, but fine-tunes existing open source models to add differential privacy and safety features, then packages those together to sell them. The company raised more than $67 million in venture capital funding prior to the acquisition, according to PitchBook.

    A spokesperson for Gretel also declined to comment.

    Unlike human-generated or real-world data, synthetic data is computer-generated and designed to mimic real-world data. Proponents say this makes the data generation required to build AI models more scalable, less labor-intensive, and more accessible to smaller or less-resourced AI developers. Privacy protection is another key selling point of synthetic data, making it an appealing option for health care providers, banks, and government agencies.

    Nvidia has already been offering synthetic data tools for developers for years. In 2022 it launched Omniverse Replicator, which gives developers the ability to generate custom, physically accurate, synthetic 3D data to train neural networks. Last June, Nvidia began rolling out a family of open AI models that generate synthetic training data for developers to use in building or fine-tuning LLMs. Called Nemotron-4 340B, these mini-models can be used by developers to drum up synthetic data for their own LLMs across “health care, finance, manufacturing, retail, and every other industry.”

    During his keynote presentation at Nvidia’s annual developer conference this Tuesday, Nvidia cofounder and chief executive Jensen Huang spoke about the challenges the industry faces in rapidly scaling AI in a cost-effective way.

    “There are three problems that we focus on,” he said. “One, how do you solve the data problem? How and where do you create the data necessary to train the AI? Two, what’s the model architecture? And then three, what are the scaling laws?” Huang went on to describe how the company is now using synthetic data generation in its robotics platforms.

    Synthetic data can be used in at least a couple of different ways, says Ana-Maria Cretu, a postdoctoral researcher at the École Polytechnique Fédérale de Lausanne in Switzerland, who studies synthetic data privacy. It can take the form of tabular data, like demographic or medical data, which can solve a data scarcity issue or create a more diverse dataset.

    Cretu gives an example: If a hospital wants to build an AI model to track a certain type of cancer, but is working with a small data set from 1,000 patients, synthetic data can be used to fill out the data set, eliminate biases, and anonymize data from real humans. “This also offers some privacy protection, whenever you cannot disclose the real data to a stakeholder or software partner,” Cretu says.

    But in the world of large language models, Cretu adds, synthetic data has also become something of a catchall phrase for “How can we just increase the amount of data we have for LLMs over time?”

    Experts worry that, in the not-so-distant future, AI companies won’t be able to gorge as freely on human-created internet data in order to train their AI models. Last year, a report from MIT’s Data Provenance Initiative showed that restrictions around open web content were increasing.

    Synthetic data in theory could provide an easy solution. But a July 2024 article in Nature highlighted how AI language models could “collapse,” or degrade significantly in quality, when they’re fine-tuned over and over again with data generated by other models. Put another way, if you feed the machine nothing but its own machine-generated output, it theoretically begins to eat itself, spewing out detritus as a result.

    Alexandr Wang, the chief executive of Scale AI—which leans heavily on a human workforce for labeling data used to train models—shared the findings from the Nature article on X, writing, “While many researchers today view synthetic data as an AI philosopher’s stone, there is no free lunch.” Wang said later in the thread that this is why he believes firmly in a hybrid data approach.

    One of Gretel’s cofounders pushed back on the Nature paper, noting in a blog post that the “extreme scenario” of repetitive training on purely synthetic data “is not representative of real-world AI development practices.”

    Gary Marcus, a cognitive scientist and researcher who loudly criticizes AI hype, said at the time that he agrees with Wang’s “diagnosis but not his prescription.” The industry will move forward, he believes, by developing new architectures for AI models, rather than focusing on the idiosyncrasies of data sets. In an email to WIRED, Marcus observed that “systems like [OpenAI’s] o1/o3 seem to be better at domains like coding and math where you can generate—and validate—tons of synthetic data. On general purpose reasoning in open-ended domains, they have been less effective.”

    Cretu believes the scientific theory around model collapse is sound. But she notes that most researchers and computer scientists are training on a mix of synthetic and real-world data. “You might possibly be able to get around model collapse by having fresh data with every new round of training,” she says.
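
The intuition behind model collapse, and behind the fresh-data remedy Cretu describes, can be illustrated with a toy simulation. The sketch below is not the experiment from the Nature article and involves no language model. It assumes a hypothetical setup in which the "model" is just a Gaussian fitted to ten data points, each new generation is trained on samples drawn from the previous fit, and the "real world" is a standard normal distribution. The function name, parameters, and numbers are all illustrative assumptions.

```python
import random
import statistics


def simulate_generations(generations, sample_size, fresh_fraction, seed):
    """Toy loop: fit a Gaussian to data, sample from the fit, refit, repeat.

    Returns the fitted standard deviation at each generation, starting with
    the fit to the original "real" data.
    """
    rng = random.Random(seed)
    # Assumption: the "real-world" distribution is a standard normal.
    real_mu, real_sigma = 0.0, 1.0

    # Generation 0 is trained on real data only.
    data = [rng.gauss(real_mu, real_sigma) for _ in range(sample_size)]
    mu, sigma = statistics.fmean(data), statistics.pstdev(data)
    history = [sigma]

    n_fresh = round(fresh_fraction * sample_size)
    for _ in range(generations):
        # Synthetic samples come from the previous generation's fitted model.
        synthetic = [rng.gauss(mu, sigma) for _ in range(sample_size - n_fresh)]
        # Fresh samples come from the real distribution.
        fresh = [rng.gauss(real_mu, real_sigma) for _ in range(n_fresh)]
        data = synthetic + fresh
        mu, sigma = statistics.fmean(data), statistics.pstdev(data)
        history.append(sigma)
    return history


# Same seed, so both runs start from the same real data.
pure = simulate_generations(generations=200, sample_size=10, fresh_fraction=0.0, seed=0)
mixed = simulate_generations(generations=200, sample_size=10, fresh_fraction=0.5, seed=0)

print(f"synthetic only: first std={pure[0]:.3f}, last std={pure[-1]:.3g}")
print(f"half fresh:     first std={mixed[0]:.3f}, last std={mixed[-1]:.3g}")
```

In the synthetic-only run, each refit loses a little of the spread in the data, and those losses compound until the fitted distribution has narrowed to almost nothing. When half of every generation's training set is drawn from the real distribution, the spread is replenished each round. How far this carries over to large language models is exactly the point the researchers quoted here disagree on.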

    Concerns about model collapse haven’t stopped the AI industry from hopping aboard the synthetic data train, even if they’re doing so with caution. At a recent Morgan Stanley tech conference, Sam Altman reportedly touted OpenAI’s ability to use its existing AI models to create more data. Anthropic CEO Dario Amodei has said he believes it may be possible to build “an infinite data-generation engine,” one that would maintain its quality by injecting a small amount of new information during the training process (as Cretu has suggested).

    Big Tech has also been turning to synthetic data. Meta has talked about how it trained Llama 3, its state-of-the-art large language model, using synthetic data, some of which was generated from Meta’s previous model, Llama 2. Amazon’s Bedrock platform lets developers use Anthropic’s Claude to generate synthetic data. Microsoft’s Phi-3 small language model was trained partly on synthetic data, though the company has warned that “synthetic data generated by pre-trained large-language models can sometimes reduce accuracy and increase bias on down-stream tasks.” Google’s DeepMind has been using synthetic data, too, but again, has highlighted the complexities of developing a pipeline for generating—and maintaining—truly private synthetic data.

    “We know that all of the big tech companies are working on some aspect of synthetic data,” says Alex Bestall, the founder of Rightsify, a music licensing startup that also generates AI music and licenses its catalog for AI models. “But human data is often a contractual requirement in our deals. They might want a dataset that is 60 percent human-generated, and 40 percent synthetic.”