    How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining

    By TechAiVerse, August 30, 2025

    August 29, 2025 5:14 PM

    Image credit: VentureBeat with ChatGPT

    A new evolutionary technique from Japan-based AI lab Sakana AI enables developers to augment the capabilities of AI models without costly training and fine-tuning processes. The technique, called Model Merging of Natural Niches (M2N2), overcomes the limitations of other model merging methods and can even evolve new models entirely from scratch.

    M2N2 can be applied to different types of machine learning models, including large language models (LLMs) and text-to-image generators. For enterprises looking to build custom AI solutions, the approach offers a powerful and efficient way to create specialized models by combining the strengths of existing open-source variants.

    What is model merging?

    Model merging is a technique for integrating the knowledge of multiple specialized AI models into a single, more capable model. Instead of fine-tuning, which refines a single pre-trained model using new data, merging combines the parameters of several models simultaneously. This process can consolidate a wealth of knowledge into one asset without requiring expensive, gradient-based training or access to the original training data.
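
    As a rough illustration of what "combining the parameters" can mean, the sketch below blends two models with a weighted average. It is a minimal example, not the method from the paper: the `Params` type, the `merge_linear` function, and the tiny weight lists are all hypothetical stand-ins for real model weights.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # hypothetical stand-in for a model's weights


def merge_linear(models: List[Params], coeffs: List[float]) -> Params:
    """Blend same-shaped parameter sets with a weighted average."""
    if len(models) != len(coeffs):
        raise ValueError("need one coefficient per model")
    total = sum(coeffs)
    if total == 0:
        raise ValueError("coefficients must not sum to zero")
    weights = [c / total for c in coeffs]  # normalise so the weights sum to 1
    merged: Params = {}
    for name in models[0]:
        # Walk the same parameter position across every model.
        columns = zip(*(m[name] for m in models))
        merged[name] = [
            sum(w * v for w, v in zip(weights, col)) for col in columns
        ]
    return merged


# Toy "specialists" with two layers of two weights each (made-up values).
math_model = {"layer0": [1.0, 2.0], "layer1": [0.0, 4.0]}
agent_model = {"layer0": [3.0, 0.0], "layer1": [2.0, 0.0]}

merged = merge_linear([math_model, agent_model], [0.5, 0.5])
print(merged)
```

    No gradients or training data appear anywhere in this step; only the weights themselves are needed, which is the property the article highlights.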

    For enterprise teams, this offers several practical advantages over traditional fine-tuning. In comments to VentureBeat, the paper’s authors said model merging is a gradient-free process that only requires forward passes, making it computationally cheaper than fine-tuning, which involves costly gradient updates. Merging also sidesteps the need for carefully balanced training data and mitigates the risk of “catastrophic forgetting,” where a model loses its original capabilities after learning a new task. The technique is especially powerful when the training data for specialist models isn’t available, as merging only requires the model weights themselves.



    Early approaches to model merging required significant manual effort, as developers adjusted coefficients through trial and error to find the optimal blend. More recently, evolutionary algorithms have helped automate this process by searching for the optimal combination of parameters. However, a significant manual step remains: developers must define in advance fixed sets of mergeable parameters, such as layers. This restriction limits the search space and can prevent the discovery of more powerful combinations.
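
    To make the idea of automating the coefficient search concrete, here is a minimal (1+1)-style evolutionary sketch: mutate the blend coefficient, evaluate the merged model, and keep the change only if the score improves. Everything in it is an assumption for illustration, including the `evolve_coefficient` name, the toy two-weight "models", and `toy_score`, which stands in for a benchmark evaluation.

```python
import random
from typing import Callable, List

Vector = List[float]  # hypothetical: a model flattened to one parameter list


def blend(a: Vector, b: Vector, w: float) -> Vector:
    """Linear blend: weight w on model A, 1 - w on model B."""
    return [w * x + (1.0 - w) * y for x, y in zip(a, b)]


def evolve_coefficient(
    a: Vector,
    b: Vector,
    score: Callable[[Vector], float],
    steps: int = 400,
    sigma: float = 0.1,
    seed: int = 0,
) -> float:
    """Mutate the coefficient and keep a mutation only when the score improves."""
    rng = random.Random(seed)
    best_w = 0.5
    best_score = score(blend(a, b, best_w))
    for _ in range(steps):
        # Gaussian mutation, clamped to the valid range [0, 1].
        candidate = min(1.0, max(0.0, best_w + rng.gauss(0.0, sigma)))
        candidate_score = score(blend(a, b, candidate))  # evaluation only, no gradients
        if candidate_score > best_score:
            best_w, best_score = candidate, candidate_score
    return best_w


# Toy setup: the "best" merged model is 30% of A and 70% of B.
model_a = [1.0, 0.0]
model_b = [0.0, 1.0]
target = [0.3, 0.7]


def toy_score(params: Vector) -> float:
    # Higher is better; stands in for accuracy on a held-out benchmark.
    return -sum((p - t) ** 2 for p, t in zip(params, target))


best_w = evolve_coefficient(model_a, model_b, toy_score)
print(round(best_w, 2))
```

    The search only ever calls `score`, which mirrors the point that merging needs evaluations rather than gradient updates. The single coefficient per model pair also shows the limitation described above: the search space is fixed before the search begins.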

    How M2N2 works

    M2N2 addresses these limitations by drawing inspiration from evolutionary principles in nature. The algorithm has three key features that allow it to explore a wider range of possibilities and discover more effective model combinations.

    Model Merging of Natural Niches Source: arXiv

    First, M2N2 eliminates fixed merging boundaries, such as blocks or layers. Instead of grouping parameters by pre-defined layers, it uses flexible “split points” and “mixing ratios” to divide and combine models. This means that, for example, the algorithm might merge 30% of the parameters in one layer from Model A with 70% of the parameters from the same layer in Model B. The process starts with an “archive” of seed models. At each step, M2N2 selects two models from the archive, determines a mixing ratio and a split point, and merges them. If the resulting model performs well, it is added back to the archive, replacing a weaker one. This allows the algorithm to explore increasingly complex combinations over time. As the researchers note, “This gradual introduction of complexity ensures a wider range of possibilities while maintaining computational tractability.”
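
    The loop described above can be sketched in a few lines. This is one plausible reading rather than the authors' implementation: the use of linear interpolation, the complementary weighting on either side of the split, the rule of replacing the single weakest archive member, and all names (`merge_at_split`, `archive_step`, `toy_score`) are assumptions made for the example.

```python
import random
from typing import Callable, List

Vector = List[float]  # hypothetical: a model flattened to one parameter list


def merge_at_split(a: Vector, b: Vector, mix: float, split: int) -> Vector:
    """Weight A by `mix` before the split point and by `1 - mix` after it."""
    head = [mix * x + (1.0 - mix) * y for x, y in zip(a[:split], b[:split])]
    tail = [(1.0 - mix) * x + mix * y for x, y in zip(a[split:], b[split:])]
    return head + tail


def archive_step(
    archive: List[Vector],
    score: Callable[[Vector], float],
    rng: random.Random,
) -> None:
    """Sample two parents, merge them, and replace the weakest member if beaten."""
    parent_a, parent_b = rng.sample(archive, 2)
    mix = rng.random()
    # The split may fall anywhere in the parameter list, not only on layer edges.
    split = rng.randrange(len(parent_a) + 1)
    child = merge_at_split(parent_a, parent_b, mix, split)
    weakest = min(range(len(archive)), key=lambda i: score(archive[i]))
    if score(child) > score(archive[weakest]):
        archive[weakest] = child


# Toy problem: the score is the negative squared distance to a made-up target.
target = [0.2, 0.8, 0.5, 0.1]


def toy_score(params: Vector) -> float:
    return -sum((p - t) ** 2 for p, t in zip(params, target))


rng = random.Random(0)
archive = [[rng.random() for _ in range(4)] for _ in range(5)]  # random seed models
worst_before = min(toy_score(m) for m in archive)
for _ in range(200):
    archive_step(archive, toy_score, rng)
worst_after = min(toy_score(m) for m in archive)
print(worst_before, worst_after)
```

    Because a child only enters the archive by displacing a weaker model, the weakest score in the archive can never get worse from one step to the next, while the archive size stays fixed.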

    Second, M2N2 manages the diversity of its model population through competition. To understand why diversity is crucial, the researchers offer a simple analogy: “Imagine merging two answer sheets for an exam… If both sheets have exactly the same answers, combining them does not make any improvement. But if each sheet has correct answers for different questions, merging them gives a much stronger result.” Model merging works the same way. The challenge, however, is defining what kind of diversity is valuable. Instead of relying on hand-crafted metrics, M2N2 simulates competition for limited resources. This nature-inspired approach naturally rewards models with unique skills, as they can “tap into uncontested resources” and solve problems others can’t. These niche specialists, the authors note, are the most valuable for merging.
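
    One common way to simulate competition for limited resources is fitness sharing, where each data point offers a fixed reward that is divided among the models that solve it. The sketch below assumes that formulation; the exact formula in the paper may differ, and the `shared_fitness` function, the `capacity` parameter, and the score table are hypothetical.

```python
from typing import List


def shared_fitness(scores: List[List[float]], capacity: float = 1.0) -> List[float]:
    """Split each data point's fixed reward among the models that score on it.

    scores[i][j] is model i's score on data point j (higher is better).
    """
    n_points = len(scores[0])
    # Total demand on each data point across the whole population.
    demand = [sum(model[j] for model in scores) for j in range(n_points)]
    fitness = []
    for model in scores:
        total = 0.0
        for j in range(n_points):
            if demand[j] > 0:  # skip points that no model solves
                total += capacity * model[j] / demand[j]
        fitness.append(total)
    return fitness


# Every model solves two of three data points, so raw totals are tied at 2.
scores = [
    [1.0, 1.0, 0.0],  # generalist A
    [1.0, 1.0, 0.0],  # generalist B, redundant with A
    [1.0, 0.0, 1.0],  # specialist C, the only one to solve the third point
]
fitness = shared_fitness(scores)
print([round(f, 3) for f in fitness])
```

    In this toy table the specialist ends up with the highest shared fitness even though all three raw totals are equal, because it keeps the whole reward for the data point nobody else solves.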

    Third, M2N2 uses a heuristic called “attraction” to pair models for merging. Rather than simply combining the top-performing models as in other merging algorithms, it pairs them based on their complementary strengths. An “attraction score” identifies pairs where one model performs well on data points that the other finds challenging. This improves both the efficiency of the search and the quality of the final merged model.
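
    A simple way to express this kind of complementarity is to sum, over all data points, how much the candidate partner outperforms the first model. The sketch below uses that simplified, unweighted score and picks the single most attractive partner; both choices are assumptions for illustration, and the paper's actual score and selection rule may differ. The names `attraction` and `pick_partner` are hypothetical.

```python
from typing import List


def attraction(scores_a: List[float], scores_b: List[float]) -> float:
    """Sum of B's advantage over A, counted only where B actually does better."""
    return sum(max(b - a, 0.0) for a, b in zip(scores_a, scores_b))


def pick_partner(index: int, scores: List[List[float]]) -> int:
    """Return the index of the model that best covers model `index`'s weak points."""
    candidates = [i for i in range(len(scores)) if i != index]
    return max(candidates, key=lambda i: attraction(scores[index], scores[i]))


# Per-data-point scores for three models (made-up values).
scores = [
    [1.0, 1.0, 0.0, 0.0],  # model 0: strong on the first two points
    [1.0, 1.0, 0.5, 0.0],  # model 1: best total score, but overlaps with model 0
    [0.0, 0.0, 1.0, 1.0],  # model 2: lower total, but complementary to model 0
]
partner = pick_partner(0, scores)
print(partner)
```

    Here model 1 has the highest overall score, yet model 2 is chosen as the partner for model 0 because it is strong exactly where model 0 is weak.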

    M2N2 in action

    The researchers tested M2N2 across three different domains, demonstrating its versatility and effectiveness.

    The first was a small-scale experiment evolving neural network–based image classifiers from scratch on the MNIST dataset. M2N2 achieved the highest test accuracy by a substantial margin compared to other methods. The results showed that its diversity-preservation mechanism was key, allowing it to maintain an archive of models with complementary strengths that facilitated effective merging while systematically discarding weaker solutions.

    Next, they applied M2N2 to LLMs, combining a math specialist model (WizardMath-7B) with an agentic specialist (AgentEvol-7B), both of which are based on the Llama 2 architecture. The goal was to create a single agent that excelled at both math problems (GSM8K dataset) and web-based tasks (WebShop dataset). The resulting model achieved strong performance on both benchmarks, showcasing M2N2’s ability to create powerful, multi-skilled models.

    A model merge with M2N2 combines the best of both seed models Source: arXiv

    Finally, the team merged diffusion-based image generation models. They combined a model trained on Japanese prompts (JSDXL) with three Stable Diffusion models primarily trained on English prompts. The objective was to create a model that combined the best image generation capabilities of each seed model while retaining the ability to understand Japanese. The merged model not only produced more photorealistic images with better semantic understanding but also developed an emergent bilingual ability. It could generate high-quality images from both English and Japanese prompts, even though it was optimized exclusively using Japanese captions.

    For enterprises that have already developed specialist models, the business case for merging is compelling. The authors point to new, hybrid capabilities that would be difficult to achieve otherwise. For example, merging an LLM fine-tuned for persuasive sales pitches with a vision model trained to interpret customer reactions could create a single agent that adapts its pitch in real time based on live video feedback. This unlocks the combined intelligence of multiple models with the cost and latency of running just one.

    Looking ahead, the researchers see techniques like M2N2 as part of a broader trend toward “model fusion.” They envision a future where organizations maintain entire ecosystems of AI models that are continuously evolving and merging to adapt to new challenges.

    “Think of it like an evolving ecosystem where capabilities are combined as needed, rather than building one giant monolith from scratch,” the authors suggest.

    The researchers have released the code for M2N2 on GitHub.

    The biggest hurdle to this dynamic, self-improving AI ecosystem, the authors believe, is not technical but organizational. “In a world with a large ‘merged model’ made up of open-source, commercial, and custom components, ensuring privacy, security, and compliance will be a critical problem.” For businesses, the challenge will be figuring out which models can be safely and effectively absorbed into their evolving AI stack.
