    Early days for small language models and AI at the edge

By Stephen Pritchard

Published: 08 May 2025

    Large language models (LLMs) use vast amounts of data and computing power to create answers to queries that look and sometimes even feel “human”. LLMs can also generate music, images or video, write code, and scan for security breaches among a host of other tasks.

    This capability has led to the rapid adoption of generative artificial intelligence (GenAI) and a new generation of digital assistants and “chatbots”. GenAI has grown faster than any other technology. ChatGPT, the best-known LLM, reached 100 million users in just two months, according to the investment bank UBS. It took the mobile phone 16 years to reach that scale.

    LLMs, however, are not the only way to run GenAI. Small language models (SLMs), usually defined as using no more than 10 to 15 billion parameters, are attracting interest, both from commercial enterprises and in the public sector.

Small, or smaller, language models should be more cost-effective to deploy than LLMs, and offer greater privacy and – potentially – security. While LLMs have become popular due to their wide range of capabilities, SLMs can outperform them, at least on specific or tightly defined tasks.

    At the same time, SLMs avoid some of the disadvantages of LLMs. These include the vast resources they demand either on-premise or in the cloud, and their associated environmental impact, the mounting costs of a “pay-as-you-go” service, and the risks associated with moving sensitive information to third-party cloud infrastructure.

    Less is more

    SLMs are also becoming more powerful and are able to rival LLMs in some use cases. This is allowing organisations to run SLMs on less powerful infrastructure – some models can even run on personal devices including phones and tablets.

    “In the small language space, we are seeing small getting smaller,” says Birgi Tamersoy, a member of the AI strategy team at Gartner. “From an application perspective, we still see the 10 to 15 billion range as small, and there is a mid-range category.

    “But at the same time, we are seeing a lot of billion parameter models and subdivisions of fewer than a billion parameters. You might not need the capability [of an LLM], and as you reduce the model size, you benefit from task specialisation.”

For reference, GPT-4 is estimated to use around 1.8 trillion parameters.

    Tamersoy is seeing smaller, specialist models emerging to handle Indic languages, reasoning, or vision and audio processing. But he also sees applications in healthcare and other areas where regulations make it harder to use a cloud-based LLM, adding: “In a hospital, it allows you to run it on a machine right there.”

    SLM advantages

    A further distinction is that LLMs are trained on publicly available information. SLMs can be trained on private, and often sensitive, data. Even where data is not confidential, using an SLM with a tailored data source avoids some of the errors, or hallucinations, which can affect even the best LLMs.

“Small language models have been designed to absorb and learn from a certain area of knowledge,” says Jith M, CTO at technology consulting firm Hexaware.

    “If someone wants an interpretation of legal norms in North America, they could go to ChatGPT, but instead of the US, it could give you information from Canada or Mexico. But if you have a foundation model that is small, and you train it very specifically, it will respond with the right data set because it doesn’t know anything else.”

    A model trained on a more limited data set is less likely to produce some of the ambiguous and occasionally embarrassing results attributed to LLMs.

    Performance and efficiency can also favour the SLM. Microsoft, for example, trained its Phi-1 transformer-based model to write Python code with a high level of accuracy – by some estimates, it was 25 times better.

Although Microsoft refers to its Phi series as large language models, Phi-1 used only 1.3bn parameters. Microsoft says its latest Phi-3 models outperform LLMs twice their size. The Chinese-developed DeepSeek model is also, by some measures, a smaller language model: researchers believe it has 70bn parameters, but DeepSeek activates only 37bn at a time.
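The practical impact of parameter count can be seen from a back-of-the-envelope memory estimate. The sketch below is illustrative only: the parameter figures are the public estimates quoted above, and the precisions (fp16, int4) are common but assumed choices.

```python
def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Rough weight-storage estimate: parameter count x bytes per parameter."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# Illustrative sizes, using parameter counts cited in the article / public estimates.
models = [("Phi-1", 1.3), ("DeepSeek (total)", 70), ("GPT-4 (estimated)", 1800)]
precisions = [("fp16", 2.0), ("int4 (quantised)", 0.5)]

for name, params in models:
    for precision, nbytes in precisions:
        gb = model_memory_gb(params, nbytes)
        print(f"{name:>18} @ {precision:>16}: ~{gb:,.1f} GB of weights")
```

Even this crude estimate (it ignores activations and overheads) shows why a 1.3bn-parameter model can fit on a phone while a trillion-parameter model cannot leave the datacentre.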

“It’s the Pareto principle, 80% of the gain for 20% of the work,” says Dominik Tomicevik, co-founder at Memgraph. “If you have public data, you can ask large, broad questions to a large language model in various different domains of life. It’s kind of a personal assistant.

    “But a lot of the interesting applications within the enterprise are really constrained in terms of domain, and the model doesn’t need to know all of Shakespeare. You can make models much more efficient if they are suited for a specific purpose.”

Another factor driving interest in small language models is their lower cost. Most LLMs operate on a pay-as-you-go, cloud-based model, and users are charged per token (a short chunk of text) sent or received. As LLM usage increases, so do the fees paid by the organisation. And if that usage is not tied into business processes, it can be hard for CIOs to determine whether it is value for money.

    With smaller language models, the option to run on local hardware brings a measure of cost control. The up-front costs are capital expenditure, development and training. But once the model is built, there should not be significant cost increases due to usage.
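The cost trade-off described above can be sketched as a simple break-even calculation. All figures here are hypothetical assumptions for illustration, not quoted prices.

```python
def monthly_api_cost(tokens_per_month: float, price_per_million_tokens: float) -> float:
    """Pay-as-you-go cost: scales linearly with token usage."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens

def months_to_break_even(hardware_cost: float,
                         monthly_api: float,
                         monthly_local_running_cost: float) -> float:
    """Months until up-front hardware spend is recouped by avoided API fees."""
    monthly_saving = monthly_api - monthly_local_running_cost
    return float("inf") if monthly_saving <= 0 else hardware_cost / monthly_saving

# Hypothetical figures: 500M tokens/month at $10 per million tokens,
# versus a $30,000 server costing $1,000/month to run locally.
api = monthly_api_cost(500_000_000, 10.0)
print(f"API: ${api:,.0f}/month")
print(f"Break-even: {months_to_break_even(30_000, api, 1_000):.1f} months")
```

The point of the sketch is the shape of the curve: API fees grow with usage, while local costs are front-loaded, which is why heavy, well-defined workloads are the natural candidates for SLMs.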

    “There is a need for cost evaluation. LLMs tend to be more costly to run than SLMs,” says Gianluca Barletta, a data and analytics expert at PA Consulting. He expects to see a mix of options, with LLMs working alongside smaller models.

    “The experimentation on SLMs is really around the computational power they require, which is much less than an LLM. So, they lend themselves to the more specific, on the edge uses. It can be on an IoT [internet of things] device, an AI-enabled TV, or a smartphone as the computational power is much less.”

    Deploying SLMs at the edge

    Tal Zarfati, lead architect at JFrog, a software supply chain supplier making use of AI, agrees. But Zarfati also draws a distinction between smaller models running in a datacentre or on private cloud infrastructure and those running on an edge device. This includes both personal devices and more specialist equipment, such as security appliances and firewalls.

“My experience from discussing small language models with enterprise clients is they differentiate by whether they can run that model internally and get a similar experience to a hosted large language model,” says Zarfati. “When we are talking about models with millions of parameters, such as the smaller Llama models, they are very small compared to ChatGPT 4.5, but still not small enough to run fully on edge devices.”

    Moore’s Law, though, is pushing SLMs to the edge, he adds: “Smaller models can be hosted internally by an organisation and the smallest will be able to run on edge devices, but the definition of ‘small’ will probably become larger as time goes by.”

Hardware suppliers are investing in “AI-ready” devices, including desktops and laptops, by adding neural processing units (NPUs) to their products. As Gartner’s Tamersoy points out, companies such as Apple hold patents on a number of specialist AI models, adding: “We are seeing some examples on the mobile side of being able to run some of these algorithms on the device itself, without going to the cloud.”

    This is driven both by regulatory needs to protect data, and a need to carry out processing as close to the data as possible, to minimise connectivity issues and latency. This approach has been adopted by SciBite, a division of Elsevier focused on life sciences data.

    “We are seeing a lot of focus on generative AI throughout the drugs discovery process. We are talking about LLMs and SLMs, as well as machine learning,” says Tamersoy.

    “In what scenario would you want to use an SLM? You’d want to know there is a specific problem you can define. If it’s a broad, more complex task where there is heavy reasoning required and a need to understand context, that is maybe where you would stick to an LLM.

    “If you have a specific problem and you have good data to train the model, you need it to be cheaper to run, where privacy is important and potentially efficiency is more important than accuracy, that is where you would be looking at an SLM.” Tamersoy is seeing smaller models being used in early stage R&D, such as molecular property prediction, right through to analysing regulatory requirements.

PA Consulting has worked with the Sellafield nuclear processing site to help it keep up to date with regulations.

“We built a small language model to help them reduce the administrative burden,” says Barletta. “There are constant regulatory changes that need to be taken into account. We created a model to reduce that work from weeks to minutes. The model determines which changes are relevant and which documents are affected, giving the engineers something to evaluate. It is a classic example of a specific use case with limited data sets.”
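The relevance-matching step Barletta describes can be illustrated with a toy retrieval sketch. This is a deliberately simplified stand-in for a trained language model, and the filenames and texts are invented for the example.

```python
def tokenize(text: str) -> set[str]:
    """Crude word-set tokenisation; a real system would use proper NLP."""
    return set(text.lower().split())

def relevance(change: str, document: str) -> float:
    """Jaccard similarity: shared words over total distinct words."""
    a, b = tokenize(change), tokenize(document)
    return len(a & b) / len(a | b)

# Hypothetical document corpus and incoming regulatory change.
documents = {
    "waste-handling-procedure.txt":
        "procedures for handling and storing radioactive waste containers",
    "site-access-policy.txt":
        "policy for contractor access badges and site entry",
}
change = "updated rules for storing radioactive waste containers"

# Rank documents by relevance to the change, most relevant first.
ranked = sorted(documents, key=lambda d: relevance(change, documents[d]), reverse=True)
print(ranked[0])  # the waste-handling procedure scores highest
```

A production system would replace the word-overlap score with a trained model, but the pipeline shape – score each document against a change, surface the top matches for an engineer to evaluate – is the same.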

    As devices grow in power and SLMs become more efficient, the trend is to push more powerful models ever closer to the end user.

    “It’s an evolving space,” says Hexaware’s Jith M. “I wouldn’t have believed two years ago that I could run a 70 billion parameter model on a footprint that was just the size of my palm…personal devices will have NPUs to accelerate AI. Chips will allow us to run local models very fast. You will be able to take decisions at wire speed.”

