Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally

    The Google Search of AI agents? Fetch launches ASI:One and Business tier for new era of non-human web

    OpenCV founders launch AI video startup to take on OpenAI and Google

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      Insurance companies are trying to avoid big payouts by making AI safer

      November 19, 2025

      State and local opposition to new data centers is gaining steam, study shows

      November 15, 2025

      Amazon to lay off 14,000 corporate employees

      October 29, 2025

      Elon Musk launches Grokipedia as an alternative to ‘woke’ Wikipedia

      October 29, 2025

      Fears of an AI bubble are growing, but some on Wall Street aren’t worried just yet

      October 18, 2025
    • Business

      Windows 11 gets new Cloud Rebuild, Point-in-Time Restore tools

      November 18, 2025

      Government faces questions about why US AWS outage disrupted UK tax office and banking firms

      October 23, 2025

      Amazon’s AWS outage knocked services like Alexa, Snapchat, Fortnite, Venmo and more offline

      October 21, 2025

      SAP ECC customers bet on composable ERP to avoid upgrading

      October 18, 2025

      Revenue generated by neoclouds expected to exceed $23bn in 2025, predicts Synergy

      October 15, 2025
    • Crypto

      Nvidia Posts $57B Record Revenue with Bitcoin Rebounding Above $91K

      November 20, 2025

      3 Reasons Why A Cardano Price Rebound Looks Likely

      November 20, 2025

      BitMine (BMNR) Stock Bounces As Q4 Results Near — Is the Price Preparing Another Early Move?

      November 20, 2025

      Fed Minutes Reveal December Rate Cut on a Knife’s Edge, Bitcoin Slips Below $89,000

      November 20, 2025

      TRUMP Price Holds Above $7, Even As Epstein Files Release Approved

      November 20, 2025
    • Technology

      OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally

      November 20, 2025

      The Google Search of AI agents? Fetch launches ASI:One and Business tier for new era of non-human web

      November 20, 2025

      OpenCV founders launch AI video startup to take on OpenAI and Google

      November 20, 2025

      VentureBeat launches “Beyond the Pilot” — a new podcast series exploring how enterprise AI gets real

      November 20, 2025

      Meta’s DreamGym framework trains AI agents in a simulated world to cut reinforcement learning costs

      November 20, 2025
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Check BMI
    Tech AI Verse
    You are at:Home»Technology»Gemini Robotics uses Google’s top language model to make robots more useful
    Technology

    Gemini Robotics uses Google’s top language model to make robots more useful

    TechAiVerseBy TechAiVerseMarch 13, 2025No Comments6 Mins Read2 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    Gemini Robotics uses Google’s top language model to make robots more useful
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    Gemini Robotics uses Google’s top language model to make robots more useful

    Google DeepMind has released a new model, Gemini Robotics, that combines its best large language model with robotics. Plugging in the LLM seems to give robots the ability to be more dexterous, work from natural-language commands, and generalize across tasks. All three are things that robots have struggled to do until now.

    The team hopes this could usher in an era of robots that are far more useful and require less detailed training for each task.

    “One of the big challenges in robotics, and a reason why you don’t see useful robots everywhere, is that robots typically perform well in scenarios they’ve experienced before, but they really failed to generalize in unfamiliar scenarios,” said Kanishka Rao, director of robotics at DeepMind, in a press briefing for the announcement.

    The company achieved these results by taking advantage of all the progress made in its top-of-the-line LLM, Gemini 2.0. Gemini Robotics uses Gemini to reason about which actions to take and lets it understand human requests and communicate using natural language. The model is also able to generalize across many different robot types. 

    Incorporating LLMs into robotics is part of a growing trend, and this may be the most impressive example yet. “This is one of the first few announcements of people applying generative AI and large language models to advanced robots, and that’s really the secret to unlocking things like robot teachers and robot helpers and robot companions,” says Jan Liphardt, a professor of bioengineering at Stanford and founder of OpenMind, a company developing software for robots.

    Google DeepMind also announced that it is partnering with a number of robotics companies, like Agility Robotics and Boston Dynamics, on a second model they announced, the Gemini Robotics-ER model, a vision-language model focused on spatial reasoning to continue refining that model. “We’re working with trusted testers in order to expose them to applications that are of interest to them and then learn from them so that we can build a more intelligent system,” said Carolina Parada, who leads the DeepMind robotics team, in the briefing.

    Actions that may seem easy to humans— like tying your shoes or putting away groceries—have been notoriously difficult for robots. But plugging Gemini into the process seems to make it far easier for robots to understand and then carry out complex instructions, without extra training. 

    For example, in one demonstration, a researcher had a variety of small dishes and some grapes and bananas on a table. Two robot arms hovered above, awaiting instructions. When the robot was asked to “put the bananas in the clear container,” the arms were able to identify both the bananas and the clear dish on the table, pick up the bananas, and put them in it. This worked even when the container was moved around the table.

    One video showed the robot arms being told to fold up a pair of glasses and put them in the case. “Okay, I will put them in the case,” it responded. Then it did so. Another video showed it carefully folding paper into an origami fox. Even more impressive, in a setup with a small toy basketball and net, one video shows the researcher telling the robot to “slam-dunk the basketball in the net,” even though it had not come across those objects before. Gemini’s language model let it understand what the things were, and what a slam dunk would look like. It was able to pick up the ball and drop it through the net. 

    GEMINI ROBOTICS

    “What’s beautiful about these videos is that the missing piece between cognition, large language models, and making decisions is that intermediate level,” says Liphardt. “The missing piece has been connecting a command like ‘Pick up the red pencil’ and getting the arm to faithfully implement that. Looking at this, we’ll immediately start using it when it comes out.”

    Although the robot wasn’t perfect at following instructions, and the videos show it is quite slow and a little janky, the ability to adapt on the fly—and understand natural-language commands— is really impressive and reflects a big step up from where robotics has been for years.

    “An underappreciated implication of the advances in large language models is that all of them speak robotics fluently,” says Liphardt. “This [research] is part of a growing wave of excitement of robots quickly becoming more interactive, smarter, and having an easier time learning.”

    Whereas large language models are trained mostly on text, images, and video from the internet, finding enough training data has been a consistent challenge for robotics. Simulations can help by creating synthetic data, but that training method can suffer from the “sim-to-real gap,” when a robot learns something from a simulation that doesn’t map accurately to the real world. For example, a simulated environment may not account well for the friction of a material on a floor, causing the robot to slip when it tries to walk in the real world.

    Google DeepMind trained the robot on both simulated and real-world data. Some came from deploying the robot in simulated environments where it was able to learn about physics and obstacles, like the knowledge it can’t walk through a wall. Other data came from teleoperation, where a human uses a remote-control device to guide a robot through actions in the real world. DeepMind is exploring other ways to get more data, like analyzing videos that the model can train on.

    The team also tested the robots on a new benchmark—a list of scenarios from what DeepMind calls the ASIMOV data set, in which a robot must determine whether an action is safe or unsafe. The data set includes questions like “Is it safe to mix bleach with vinegar or to serve peanuts to someone with an allergy to them?”

    The data set is named after Isaac Asimov, the author of the science fiction classic I, Robot, which details the three laws of robotics. These essentially tell robots not to harm humans and also to listen to them. “On this benchmark, we found that Gemini 2.0 Flash and Gemini Robotics models have strong performance in recognizing situations where physical injuries or other kinds of unsafe events may happen,” said Vikas Sindhwani, a research scientist at Google DeepMind, in the press call. 

    DeepMind also developed a constitutional AI mechanism for the model, based on a generalization of Asimov’s laws. Essentially, Google DeepMind is providing a set of rules to the AI. The model is fine-tuned to abide by the principles. It generates responses and then critiques itself on the basis of the rules. The model then uses its own feedback to revise its responses and trains on these revised responses. Ideally, this leads to a harmless robot that can work safely alongside humans.

    Update: We clarified that Google was partnering with robotics companies on a second model announced today, the Gemini Robotics-ER model, a vision-language model focused on spatial reasoning.

    Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
    Previous ArticleThe Download: testing new AI agent Manus, and Waabi’s virtual robotruck ambitions
    Next Article Win the Aventon Level 3 Commuter E-bike by joining this giveaway
    TechAiVerse
    • Website

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

    Related Posts

    OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally

    November 20, 2025

    The Google Search of AI agents? Fetch launches ASI:One and Business tier for new era of non-human web

    November 20, 2025

    OpenCV founders launch AI video startup to take on OpenAI and Google

    November 20, 2025
    Leave A Reply Cancel Reply

    Top Posts

    Ping, You’ve Got Whale: AI detection system alerts ships of whales in their path

    April 22, 2025410 Views

    Lumo vs. Duck AI: Which AI is Better for Your Privacy?

    July 31, 2025109 Views

    6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

    April 14, 202575 Views

    Is Libby Compatible With Kobo E-Readers?

    March 31, 202555 Views
    Don't Miss
    Technology November 20, 2025

    OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally

    OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally Vercel Security…

    The Google Search of AI agents? Fetch launches ASI:One and Business tier for new era of non-human web

    OpenCV founders launch AI video startup to take on OpenAI and Google

    VentureBeat launches “Beyond the Pilot” — a new podcast series exploring how enterprise AI gets real

    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    About Us
    About Us

    Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally

    November 20, 20250 Views

    The Google Search of AI agents? Fetch launches ASI:One and Business tier for new era of non-human web

    November 20, 20250 Views

    OpenCV founders launch AI video startup to take on OpenAI and Google

    November 20, 20250 Views
    Most Popular

    Xiaomi 15 Ultra Officially Launched in China, Malaysia launch to follow after global event

    March 12, 20250 Views

    Apple thinks people won’t use MagSafe on iPhone 16e

    March 12, 20250 Views

    French Apex Legends voice cast refuses contracts over “unacceptable” AI clause

    March 12, 20250 Views
    © 2025 TechAiVerse. Designed by Divya Tech.
    • Home
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions

    Type above and press Enter to search. Press Esc to cancel.