
    Model minimalism: The new AI strategy saving companies millions

    By TechAiVerse, July 1, 2025

    June 27, 2025 1:00 PM

    This article is part of VentureBeat’s special issue, “The Real Cost of AI: Performance, Efficiency and ROI at Scale.” Read more from this special issue.

    The advent of large language models (LLMs) has made it easier for enterprises to envision the kinds of projects they can undertake, leading to a surge in pilot programs now transitioning to deployment. 

    However, as these projects gained momentum, enterprises realized that the earlier LLMs they had used were unwieldy and, worse, expensive. 

    Enter small language models and distillation. Models like Google’s Gemma family, Microsoft’s Phi and Mistral’s Small 3.1 allow businesses to choose fast, accurate models that work for specific tasks. Enterprises can opt for a smaller model for particular use cases, which lowers the cost of running their AI applications and can improve return on investment.

    LinkedIn distinguished engineer Karthik Ramgopal told VentureBeat that companies opt for smaller models for a few reasons. 

    “Smaller models require less compute, memory and faster inference times, which translates directly into lower infrastructure OPEX (operational expenditures) and CAPEX (capital expenditures) given GPU costs, availability and power requirements,” Ramgopal said. “Task-specific models have a narrower scope, making their behavior more aligned and maintainable over time without complex prompt engineering.”

    Model developers price their small models accordingly. OpenAI’s o4-mini costs $1.10 per million input tokens and $4.40 per million output tokens, compared to the full o3 version at $10 per million input tokens and $40 per million output tokens.
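
    To make the price gap concrete, here is a minimal sketch that applies the per-million-token prices quoted above to a monthly workload. The workload size (500 million input tokens and 100 million output tokens) is a hypothetical assumption for illustration, not a figure from the article.

```python
# Per-million-token prices in USD, as quoted in the article.
PRICES = {
    "o4-mini": {"input": 1.10, "output": 4.40},
    "o3": {"input": 10.00, "output": 40.00},
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-million-token rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption, not from the article).
workload = {"input_tokens": 500_000_000, "output_tokens": 100_000_000}

small = monthly_cost("o4-mini", **workload)
large = monthly_cost("o3", **workload)
print(f"o4-mini: ${small:,.2f}  o3: ${large:,.2f}  ratio: {large / small:.1f}x")
```

    Because both the input and output prices differ by the same factor, the larger model costs roughly nine times as much for any mix of input and output tokens.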

    Enterprises today have a larger pool of small models, task-specific models and distilled models to choose from. These days, most flagship model families come in a range of sizes. For example, the Claude family of models from Anthropic comprises Claude Opus, the largest model, Claude Sonnet, the all-purpose model, and Claude Haiku, the smallest version. Some small open-weight models are compact enough to run on portable devices, such as laptops or mobile phones.

    The savings question

    When discussing return on investment, though, the question is always: What does ROI look like? Should it be a return on the costs incurred, or time savings that ultimately mean dollars saved down the line? Experts VentureBeat spoke to said ROI can be difficult to judge: some companies believe they have already reached ROI by cutting the time spent on a task, while others are waiting for actual dollars saved or more business brought in before saying whether their AI investments have worked.

    Normally, enterprises calculate ROI with a simple formula, as described by Cognizant chief technologist Ravi Naarla in a post: ROI = (Benefits - Costs) / Costs. But with AI programs, the benefits are not immediately apparent. He suggests enterprises identify the benefits they expect to achieve, estimate these based on historical data, be realistic about the overall cost of AI, including hiring, implementation and maintenance, and understand that they have to be in it for the long haul.
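
    The formula can be sketched in a few lines. The cost categories follow the ones named above (hiring, implementation and maintenance), but every dollar amount here is a hypothetical placeholder, and the `roi` helper is an illustrative name rather than anything from the cited post.

```python
def roi(benefits: float, costs: float) -> float:
    """ROI = (Benefits - Costs) / Costs, returned as a fraction (0.25 means 25%)."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs


# Hypothetical figures for illustration only.
costs = {"hiring": 120_000, "implementation": 60_000, "maintenance": 20_000}
expected_benefits = 250_000  # estimated from historical data, per the advice above

total_costs = sum(costs.values())
print(f"Total cost: ${total_costs:,}  ROI: {roi(expected_benefits, total_costs):.0%}")
```

    Listing the costs by category keeps hiring and maintenance from being left out of the denominator, which would overstate the return.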

    Experts argue that small models reduce implementation and maintenance costs, especially when they are fine-tuned to give them more context about the enterprise.

    Arijit Sengupta, founder and CEO of Aible, said that how people bring context to the models dictates how much cost savings they can get. Users who supply that context in the prompt, for example through lengthy and complex instructions, pay for it in higher token costs.

    “You have to give models context one way or the other; there is no free lunch. But with large models, that is usually done by putting it in the prompt,” he said. “Think of fine-tuning and post-training as an alternative way of giving models context. I might incur $100 of post-training costs, but it’s not astronomical.”

    Sengupta said they’ve seen about 100X cost reductions just from post-training alone, often dropping model use cost “from single-digit millions to something like $30,000.” He did point out that this number includes software operating expenses and the ongoing cost of the model and vector databases. 
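
    The trade-off between context in the prompt and context through post-training can be framed as a break-even count. This is a simplified sketch: the $100 post-training figure comes from the quote above, while the context size and token price are hypothetical assumptions, and it ignores any difference in per-request price between the two models.

```python
import math


def break_even_requests(post_training_cost: float,
                        context_tokens_saved: int,
                        price_per_million_input: float) -> int:
    """Requests after which one-time post-training costs less than
    resending the same context tokens in every prompt."""
    if context_tokens_saved <= 0 or price_per_million_input <= 0:
        raise ValueError("tokens saved and price must be positive")
    # Cost of the repeated context per request is
    # context_tokens_saved * price_per_million_input / 1_000_000.
    return math.ceil(
        post_training_cost * 1_000_000
        / (context_tokens_saved * price_per_million_input)
    )


# $100 is from the quote; 4,000 context tokens per prompt and
# $10 per million input tokens are assumptions for illustration.
n = break_even_requests(100.0, 4_000, 10.0)
print(f"Post-training pays for itself after about {n:,} requests")
```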

    “In terms of maintenance cost, if you do it manually with human experts, it can be expensive to maintain because small models need to be post-trained to produce results comparable to large models,” he said.

    Experiments Aible conducted showed that a task-specific, fine-tuned model can perform as well as an LLM for some use cases, making the case that deploying several use-case-specific models is more cost-effective than using one large model to do everything.

    The company compared a post-trained version of Llama-3.3-70B-Instruct to a smaller 8B-parameter option of the same model. The 70B model, post-trained for $11.30, was 84% accurate in automated evaluations and 92% in manual evaluations. Fine-tuned at a cost of $4.58, the 8B model achieved 82% accuracy in manual assessment, which would be suitable for smaller, more targeted use cases.
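
    As a rough illustration, the reported figures can be set side by side. The sketch below uses only the numbers given above and covers post-training cost alone. It says nothing about inference cost, which the article does not report for this experiment. The `Result` class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    name: str
    post_training_cost: float  # USD, as reported
    manual_accuracy: float     # fraction correct in manual evaluation


big = Result("70B", 11.30, 0.92)
small = Result("8B", 4.58, 0.82)

cost_ratio = big.post_training_cost / small.post_training_cost
accuracy_gap_points = (big.manual_accuracy - small.manual_accuracy) * 100

print(f"70B post-training cost is {cost_ratio:.1f}x the 8B cost")
print(f"Manual-evaluation accuracy gap: {accuracy_gap_points:.0f} points")
```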

    Cost factors fit for purpose

    Right-sizing models does not have to come at the cost of performance. These days, organizations understand that model choice doesn’t just mean choosing between GPT-4o and Llama-3.1; it’s knowing that some use cases, like summarization or code generation, are better served by a small model.

    Daniel Hoske, chief technology officer at contact center AI products provider Cresta, said starting development with LLMs gives a clearer picture of potential cost savings.

    “You should start with the biggest model to see if what you’re envisioning even works at all, because if it doesn’t work with the biggest model, it doesn’t mean it would with smaller models,” he said. 

    Ramgopal said LinkedIn follows a similar pattern because prototyping is the only way these issues can start to emerge.

    “Our typical approach for agentic use cases begins with general-purpose LLMs as their broad generalization ability allows us to rapidly prototype, validate hypotheses and assess product-market fit,” LinkedIn’s Ramgopal said. “As the product matures and we encounter constraints around quality, cost or latency, we transition to more customized solutions.”

    In the experimentation phase, organizations can determine what they value most from their AI applications. Figuring this out enables developers to plan better what they want to save on and select the model size that best suits their purpose and budget. 

    The experts cautioned that while it is important to build with models that work best with what they’re developing, high-parameter LLMs will always be more expensive. Large models will always require significant computing power. 

    However, overusing small and task-specific models also poses issues. Rahul Pathak, vice president of data and AI GTM at AWS, said in a blog post that cost optimization comes not just from using a model with low compute needs, but from matching a model to tasks. Smaller models may not have a sufficiently large context window to understand more complex instructions, leading to increased workload for human employees and higher costs.
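
    One way to picture matching a model to a task is a small router that picks the cheapest option able to handle both the task type and the prompt length. Everything in this sketch is hypothetical: the model names, context windows, prices and task labels are placeholders, not real products or figures from the AWS post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str
    context_window: int      # maximum prompt size in tokens
    cost_per_million: float  # USD per million input tokens
    tasks: frozenset         # task labels this model is considered good enough for


# Hypothetical catalog for illustration only.
CATALOG = [
    ModelOption("small-model", 8_000, 0.20,
                frozenset({"summarization", "classification"})),
    ModelOption("medium-model", 32_000, 1.00,
                frozenset({"summarization", "classification", "code"})),
    ModelOption("large-model", 128_000, 10.00,
                frozenset({"summarization", "classification", "code", "agentic"})),
]


def pick_model(task: str, prompt_tokens: int) -> Optional[ModelOption]:
    """Return the cheapest model that supports the task and fits the prompt,
    or None when nothing in the catalog qualifies."""
    candidates = [
        m for m in CATALOG
        if task in m.tasks and prompt_tokens <= m.context_window
    ]
    return min(candidates, key=lambda m: m.cost_per_million, default=None)


print(pick_model("summarization", 2_000).name)   # short prompt, simple task
print(pick_model("summarization", 20_000).name)  # same task, longer prompt
```

    The second call falls through to a larger model only because the prompt no longer fits the small model's context window, which is the failure mode described above.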

    Sengupta also cautioned that some distilled models could be brittle, so long-term use may not result in savings. 

    Constantly evaluate

    Regardless of model size, industry players emphasized the need for flexibility to address any potential issues or new use cases. So if they start with a large model and a smaller model later offers similar or better performance at lower cost, organizations cannot be precious about their chosen model.
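
    That switching rule can be written down as a simple check run whenever a new candidate model is evaluated. This is a sketch under stated assumptions: the `EvalResult` type, the 1-point tolerance and all the scores and prices are hypothetical, and a real evaluation would use the organization's own test set and metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    model: str
    accuracy: float          # fraction correct on a fixed evaluation set
    cost_per_request: float  # USD


def should_switch(current: EvalResult, candidate: EvalResult,
                  tolerance: float = 0.01) -> bool:
    """True when the candidate is cheaper and its accuracy is better than,
    or within `tolerance` of, the current model's accuracy."""
    similar_or_better = candidate.accuracy >= current.accuracy - tolerance
    cheaper = candidate.cost_per_request < current.cost_per_request
    return similar_or_better and cheaper


# Hypothetical evaluation results.
current = EvalResult("large-model", 0.90, 0.020)
candidate = EvalResult("small-model", 0.895, 0.004)
print(should_switch(current, candidate))
```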

    Tessa Burg, CTO and head of innovation at brand marketing company Mod Op, told VentureBeat that organizations must understand that whatever they build now will always be superseded by a better version. 

    “We started with the mindset that the tech underneath the workflows that we’re creating, the processes that we’re making more efficient, are going to change. We knew that whatever model we use will be the worst version of a model.”

    Burg said that smaller models helped save her company and its clients time in researching and developing concepts. That time saved, she said, does lead to budget savings over time. She added that it’s a good idea to break out high-cost, high-frequency use cases for lightweight models.

    Sengupta noted that vendors are now making it easier to switch between models automatically, but cautioned users to find platforms that also facilitate fine-tuning, so they don’t incur additional costs. 


    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he delivers clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.
