    Hugging Face: 5 ways enterprises can slash AI costs without sacrificing performance 

    By TechAiVerse, August 19, 2025

    Enterprises seem to accept it as a basic fact: AI models require a significant amount of compute; they simply have to find ways to obtain more of it. 

    But it doesn’t have to be that way, according to Sasha Luccioni, AI and climate lead at Hugging Face. What if there’s a smarter way to use AI? What if, instead of striving for more (often unnecessary) compute and ways to power it, enterprises focused on improving model performance and accuracy?

    Ultimately, model makers and enterprises are focusing on the wrong issue: They should be computing smarter, not harder or doing more, Luccioni says. 

    “There are smarter ways of doing things that we’re currently under-exploring, because we’re so blinded by: We need more FLOPS, we need more GPUs, we need more time,” she said. 


    Here are five key learnings from Hugging Face that can help enterprises of all sizes use AI more efficiently. 

    1. Right-size the model to the task

    Avoid defaulting to giant, general-purpose models for every use case. Task-specific or distilled models can match, or even surpass, larger models in terms of accuracy for targeted workloads — at a lower cost and with reduced energy consumption. 

    Luccioni, in fact, has found in testing that a task-specific model uses 20 to 30 times less energy than a general-purpose one. “Because it’s a model that can do that one task, as opposed to any task that you throw at it, which is often the case with large language models,” she said. 

    Distillation is key here; a full model could initially be trained from scratch and then refined for a specific task. DeepSeek R1, for instance, is “so huge that most organizations can’t afford to use it” because you need at least 8 GPUs, Luccioni noted. By contrast, distilled versions can be 10, 20 or even 30X smaller and run on a single GPU. 
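
    To make the size argument concrete, here is a back-of-the-envelope sketch of how many GPUs are needed just to hold a model's weights. Every number in it is an illustrative assumption rather than a measurement of DeepSeek R1 or any other real model: a hypothetical 600-billion-parameter model stored at one byte per parameter, and GPUs with 80 GB of memory. It counts weights only and ignores activations, caches and framework overhead, so real requirements would be higher.

```python
import math

GB = 1e9  # decimal gigabytes, to keep the arithmetic simple


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GB."""
    return num_params * bytes_per_param / GB


def min_gpus_for_weights(num_params: float, bytes_per_param: float,
                         gpu_memory_gb: float) -> int:
    """Smallest GPU count whose combined memory fits the weights alone."""
    needed = weight_memory_gb(num_params, bytes_per_param)
    return max(1, math.ceil(needed / gpu_memory_gb))


# Illustrative assumptions, not figures for any real model or GPU.
FULL_MODEL_PARAMS = 600e9   # hypothetical full-size model
BYTES_PER_PARAM = 1.0       # assumes 8-bit weights
GPU_MEMORY_GB = 80.0        # assumed memory per GPU

full = min_gpus_for_weights(FULL_MODEL_PARAMS, BYTES_PER_PARAM, GPU_MEMORY_GB)
print(f"full model: {full} GPUs")  # 600 GB of weights / 80 GB -> 8 GPUs

# Distilled variants 10x, 20x and 30x smaller, as in the range quoted above.
for shrink in (10, 20, 30):
    gpus = min_gpus_for_weights(FULL_MODEL_PARAMS / shrink,
                                BYTES_PER_PARAM, GPU_MEMORY_GB)
    print(f"{shrink}x smaller: {gpus} GPU(s)")  # each fits on 1 GPU here
```

    Under these assumptions the full model needs eight GPUs for its weights, while each of the smaller variants fits on one. Changing the precision or the GPU memory changes the answer, which is why the estimate is worth redoing for the hardware actually available.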

    In general, open-source models help with efficiency, she noted, as they don’t need to be trained from scratch. That’s compared to just a few years ago, when enterprises were wasting resources because they couldn’t find the model they needed; nowadays, they can start out with a base model and fine-tune and adapt it. 

    “It provides incremental shared innovation, as opposed to siloed, everyone’s training their models on their datasets and essentially wasting compute in the process,” said Luccioni. 

    It’s becoming clear that companies are quickly getting disillusioned with gen AI, as costs are not yet proportionate to the benefits. Generic use cases, such as writing emails or transcribing meeting notes, are genuinely helpful. However, task-specific models still require “a lot of work” because out-of-the-box models don’t cut it and are also more costly, said Luccioni.

    This is the next frontier of added value. “A lot of companies do want a specific task done,” Luccioni noted. “They don’t want AGI, they want specific intelligence. And that’s the gap that needs to be bridged.” 

    2. Make efficiency the default

    Adopt “nudge theory” in system design: set conservative reasoning budgets, limit always-on generative features and require opt-in for high-cost compute modes.

    In cognitive science, “nudge theory” is a behavioral change management approach designed to influence human behavior subtly. The “canonical example,” Luccioni noted, is adding cutlery to takeout: Having people decide whether they want plastic utensils, rather than automatically including them with every order, can significantly reduce waste.

    “Just getting people to opt into something versus opting out of something is actually a very powerful mechanism for changing people’s behavior,” said Luccioni. 

    Features that are switched on by default are also often unnecessary; they increase use and, therefore, costs, because models end up doing more work than they need to. For instance, with popular search engines such as Google, a gen AI summary automatically populates at the top by default. Luccioni also noted that, when she recently used OpenAI’s GPT-5, the model automatically worked in full reasoning mode on “very simple questions.”

    “For me, it should be the exception,” she said. “Like, ‘What’s the meaning of life?’ then sure, I want a gen AI summary. But with ‘What’s the weather like in Montreal?’ or ‘What are the opening hours of my local pharmacy?’ I do not need a generative AI summary, yet it’s the default. I think that the default mode should be no reasoning.”
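
    One way to picture "efficiency as the default" in code is a request planner where the cheap path is what callers get unless they explicitly ask for more. The sketch below is hypothetical throughout: the class names, the `plan_request` function and the token budgets are invented for illustration and do not correspond to any vendor's API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limits, chosen only for illustration.
DEFAULT_REASONING_BUDGET = 512   # conservative budget when reasoning is requested
HARD_BUDGET_CEILING = 2048       # upper bound no request may exceed


@dataclass(frozen=True)
class InferenceRequest:
    prompt: str
    want_reasoning: bool = False    # opt-in: off unless the caller asks
    want_ai_summary: bool = False   # opt-in: no generative summary by default


@dataclass(frozen=True)
class InferencePlan:
    mode: str                       # "direct" or "reasoning"
    reasoning_token_budget: int
    generate_summary: bool


def plan_request(request: InferenceRequest,
                 requested_budget: Optional[int] = None) -> InferencePlan:
    """Pick the cheapest plan that satisfies what the caller opted into."""
    if not request.want_reasoning:
        # Default path: no reasoning tokens are spent at all.
        return InferencePlan("direct", 0, request.want_ai_summary)
    budget = DEFAULT_REASONING_BUDGET if requested_budget is None else requested_budget
    # Clamp to the ceiling so a single request cannot ask for unbounded compute.
    budget = max(0, min(budget, HARD_BUDGET_CEILING))
    return InferencePlan("reasoning", budget, request.want_ai_summary)


simple = plan_request(InferenceRequest("What's the weather like in Montreal?"))
print(simple.mode, simple.reasoning_token_budget)  # direct 0

hard = plan_request(InferenceRequest("Plan a migration", want_reasoning=True), 10_000)
print(hard.mode, hard.reasoning_token_budget)      # reasoning 2048
```

    The design choice is that the expensive modes are reachable but never implicit: a simple question costs nothing extra, and even an opted-in request is held to a budget.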

    3. Optimize hardware utilization

    Use batching, adjust precision and fine-tune batch sizes for the specific hardware generation to minimize wasted memory and power draw.

    For instance, enterprises should ask themselves: Does the model need to be on all the time? Will people be pinging it in real time, 100 requests at once? In that case, always-on optimization is necessary, Luccioni noted. However, in many others, it’s not; the model can be run periodically to optimize memory usage, and batching can ensure optimal memory utilization. 

    “It’s kind of like an engineering challenge, but a very specific one, so it’s hard to say, ‘Just distill all the models,’ or ‘change the precision on all the models,’” said Luccioni. 

    In one of her recent studies, she found that the best batch size depends on the hardware, even down to the specific type or version. Increasing the batch size by just one can raise energy use because the model then needs more memory bars.

    “This is something that people don’t really look at, they’re just like, ‘Oh, I’m gonna maximize the batch size,’ but it really comes down to tweaking all these different things, and all of a sudden it’s super efficient, but it only works in your specific context,” Luccioni explained. 
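
    The tuning described here can be sketched as a small sweep: try several batch sizes, measure the energy each batch costs, and keep the size with the lowest energy per request. In the sketch below, `toy_batch_joules` is a made-up stand-in for a real power measurement on real hardware; its constants are assumptions chosen so that crossing a memory boundary (every 8 items in this toy) causes a jump in cost, loosely mirroring the effect Luccioni describes.

```python
import math
from typing import Callable, Iterable, List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(items: Sequence[T], batch_size: int) -> List[List[T]]:
    """Group queued requests into batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def energy_per_item(batch_size: int,
                    measure_batch_joules: Callable[[int], float]) -> float:
    """Energy cost of one request when served in a batch of this size."""
    return measure_batch_joules(batch_size) / batch_size


def pick_batch_size(candidates: Iterable[int],
                    measure_batch_joules: Callable[[int], float]) -> int:
    """Return the candidate with the lowest measured energy per request."""
    return min(candidates, key=lambda b: energy_per_item(b, measure_batch_joules))


def toy_batch_joules(batch_size: int) -> float:
    """Hypothetical cost model, NOT a real measurement.

    Assumes a fixed overhead, a small per-item cost, and a large extra cost
    each time the batch spills into another block of 8 items.
    """
    fixed, per_item, per_block = 20.0, 2.0, 100.0
    blocks = math.ceil(batch_size / 8)
    return fixed + per_item * batch_size + per_block * blocks


print(energy_per_item(8, toy_batch_joules))            # 17.0
print(energy_per_item(9, toy_batch_joules) > 17.0)     # True: one more item costs more each
print(pick_batch_size(range(1, 13), toy_batch_joules)) # 8, not the largest candidate
print(make_batches(["a", "b", "c", "d", "e"], 2))      # [['a', 'b'], ['c', 'd'], ['e']]
```

    On real systems the measurement function would wrap an actual power reading, and the winning batch size would need to be re-checked whenever the model, precision or hardware changes.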

    4. Incentivize energy transparency

    It always helps when people are incentivized; to this end, Hugging Face earlier this year launched AI Energy Score. It’s a novel way to promote more energy efficiency, utilizing a 1- to 5-star rating system, with the most efficient models earning a “five-star” status. 

    It could be considered the “Energy Star for AI,” and was inspired by the potentially-soon-to-be-defunct federal program, which set energy efficiency specifications and branded qualifying appliances with an Energy Star logo. 

    “For a couple of decades, it was really a positive motivation, people wanted that star rating, right?” said Luccioni. “Something similar with Energy Score would be great.”

    Hugging Face has a leaderboard up now, which it plans to update with new models (DeepSeek, GPT-oss) in September, and continually do so every 6 months or sooner as new models become available. The goal is that model builders will consider the rating as a “badge of honor,” Luccioni said.
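
    As a rough illustration of how a 1- to 5-star scheme could work, the sketch below ranks models by measured energy and gives the most efficient fifth five stars, the next fifth four stars, and so on. This is an assumed bucketing rule written for this example, not Hugging Face's published methodology, and the model names and energy figures are invented.

```python
from typing import Dict


def star_ratings(energy_wh_by_model: Dict[str, float]) -> Dict[str, int]:
    """Assign 1 to 5 stars by rank; lower energy earns more stars.

    Hypothetical rule: models are sorted by energy and split into five
    equal-sized rank bands. Not the official AI Energy Score method.
    """
    ranked = sorted(energy_wh_by_model, key=energy_wh_by_model.get)
    n = len(ranked)
    return {name: 5 - (rank * 5) // n for rank, name in enumerate(ranked)}


# Invented numbers for five made-up models (watt-hours per 1,000 queries).
measurements = {
    "model-a": 12.0,
    "model-b": 3.5,
    "model-c": 48.0,
    "model-d": 7.2,
    "model-e": 150.0,
}

for name, stars in sorted(star_ratings(measurements).items()):
    print(name, "*" * stars)
```

    A rank-based rule like this one is relative: a model's rating can drop when more efficient models join the leaderboard, even if its own energy use has not changed.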

    5. Rethink the “more compute is better” mindset

    Instead of chasing the largest GPU clusters, begin with the question: “What is the smartest way to achieve the result?” For many workloads, smarter architectures and better-curated data outperform brute-force scaling.

    “I think that people probably don’t need as many GPUs as they think they do,” said Luccioni. Instead of simply going for the biggest clusters, she urged enterprises to rethink the tasks GPUs will be completing and why they need them, how they performed those types of tasks before, and what adding extra GPUs will ultimately get them. 

    “It’s kind of this race to the bottom where we need a bigger cluster,” she said. “It’s thinking about what you’re using AI for, what technique do you need, what does that require?” 
