Google DeepMind’s new model, Gemini Robotics, integrates its top large language model, Gemini 2.0, to enhance robot capabilities. The integration lets robots act with greater dexterity, follow natural-language commands, and generalize across varied tasks — abilities robots have historically struggled with. The model’s success stems from leveraging Gemini 2.0’s advances: robots can reason about which actions to take, understand human requests, and communicate effectively. Gemini Robotics also generalizes across different robot types.

In demonstrations, robots achieved a 90% success rate on tasks such as sorting fruits into containers, folding glasses, and even performing a slam dunk with a toy basketball. The robots completed these tasks even when objects were moved mid-task, showcasing the model’s adaptability.

Google DeepMind trained the robots on both simulated and real-world data, including teleoperation and video analysis. For safety, the robots were tested on the ASIMOV data set, where the Gemini 2.0 Flash and Gemini Robotics models showed strong performance at recognizing safe and unsafe scenarios. Additionally, a constitutional AI mechanism based on Asimov’s laws was implemented to ensure safe robot-human interactions.
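The article does not describe the implementation, but the safety layer it mentions amounts to screening a candidate action against a written rule set before the robot executes it. Below is a minimal, hypothetical sketch of that pattern: `query_vlm` is a stubbed stand-in for any vision-language model call (not Google’s API), and the rules only paraphrase the Asimov-inspired idea.

```python
# Hypothetical sketch of a constitution-style safety gate for robot actions.
# `query_vlm` stands in for a real vision-language model call; it is NOT
# Google's API, and the rules below merely paraphrase the Asimov-style idea.

SAFETY_RULES = [
    "The robot must not take actions likely to injure a human.",
    "The robot must refuse instructions that conflict with the rule above.",
    "The robot should avoid damaging objects in its workspace.",
]

def query_vlm(prompt: str) -> str:
    """Placeholder for a real vision-language model call."""
    return "SAFE"  # stubbed response so the sketch runs end to end

def is_action_safe(action_description: str) -> bool:
    rules = "\n".join(f"- {r}" for r in SAFETY_RULES)
    prompt = (
        f"Rules:\n{rules}\n\n"
        f"Proposed robot action: {action_description}\n"
        "Answer SAFE or UNSAFE."
    )
    return query_vlm(prompt).strip().upper().startswith("SAFE")

if __name__ == "__main__":
    for action in ["place the glasses in their case",
                   "swing the arm toward the user"]:
        print(action, "->", "execute" if is_action_safe(action) else "reject")
```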
Source: www.technologyreview.com

Related X Posts
Physical Intelligence (@physical_int) · Feb 26
Vision-language models can control robots, but what if the prompt is too complex for the robot to follow directly? We developed a way to get robots to “think through” complex instructions, feedback, and interjections. We call it the Hierarchical Interactive Robot (Hi Robot).
Sergey Levine (@svlevine) · Feb 26
We made π0 “think harder”: our new Hierarchical Interactive Robot (Hi Robot) method “thinks” through complex tasks and prompts, directing π0 to break up complex tasks into basic steps, handling human feedback, and modifying tasks on the fly.
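These posts describe a two-level control loop: a high-level model “thinks” a complex prompt (plus human feedback) into atomic commands, and a low-level policy executes them one at a time. A rough, hypothetical sketch of that hierarchy follows — every function name here is a placeholder, not Physical Intelligence’s actual API, and `low_level_execute` merely stands in for a π0-style visuomotor policy.

```python
# Hypothetical sketch of the hierarchical pattern the Hi Robot posts describe.
# None of these functions are Physical Intelligence's actual API.

from typing import List

def high_level_plan(instruction: str, feedback: str = "") -> List[str]:
    """Placeholder for the LLM 'thinking' step: instruction -> atomic steps."""
    # A real system would call a language model here; we hard-code an example.
    if "sandwich" in instruction:
        steps = ["pick up bread", "add filling", "close sandwich"]
    else:
        steps = [instruction]
    if feedback:
        steps.insert(0, f"adjust for feedback: {feedback}")
    return steps

def low_level_execute(step: str) -> bool:
    """Placeholder for the low-level visuomotor policy executing one step."""
    print(f"executing: {step}")
    return True

def run(instruction: str, feedback: str = "") -> None:
    for step in high_level_plan(instruction, feedback):
        if not low_level_execute(step):
            break  # on failure, the high level could replan; omitted here

run("make me a sandwich", feedback="no pickles")
```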
Princeton Computer Science (@PrincetonCS) · 4h
Exploration is a key problem in reinforcement learning. @ben_eysenbach and Grace Liu have found an unexpected way to get machines to explore: they gave AI agents a single difficult task and no feedback at all. Learn more: https://bit.ly/3R1a6R0
Wes Roth (@WesRothMoney) · Mar 6
Engineers at Boston Dynamics have revealed insights on “sequencing,” the first practical task mastered by Atlas, showcasing its ability to plan and execute complex, step-by-step actions autonomously. From parkour moves to practical tasks, Atlas continues to redefine what …
UriG (@uri_gadot) · Mar 10
Our Solution – RL-RC-DoT:
We propose to train a lightweight reinforcement learning (RL) agent that dynamically adjusts Quantization Parameters (QPs) at the macro-block level. This ensures task-relevant regions are prioritized while maintaining overall efficiency.
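As the post describes it, the agent chooses a QP per macro-block so that bits concentrate on task-relevant regions. The toy sketch below illustrates that trade-off only: the relevance score and reward are made up, and a simple stochastic hill-climbing loop stands in for the trained RL policy — nothing here is the authors’ implementation.

```python
# Toy sketch of the RL-RC-DoT idea as described in the post: assign a
# Quantization Parameter (QP) per macro-block, spending bits on task-relevant
# regions. Environment, reward, and "policy" are stand-ins, not the real code.

import random

NUM_BLOCKS = 16            # macro-blocks per frame (toy value)
QP_CHOICES = [22, 30, 38]  # lower QP = higher quality, more bits

def task_relevance(block: int) -> float:
    """Made-up saliency score; a real system derives this from the task."""
    return 1.0 if block < 4 else 0.1  # pretend the first four blocks matter

def reward(qps: list) -> float:
    # Reward quality in task-relevant blocks, penalize total bitrate (toy model).
    quality = sum(task_relevance(b) * (51 - qp) for b, qp in enumerate(qps))
    bitrate = sum(51 - qp for qp in qps)
    return quality - 0.3 * bitrate

# Stochastic hill climbing as a stand-in for the trained RL agent.
qps = [30] * NUM_BLOCKS
current = reward(qps)
for _ in range(2000):
    candidate = qps[:]
    candidate[random.randrange(NUM_BLOCKS)] = random.choice(QP_CHOICES)
    r = reward(candidate)
    if r >= current:
        qps, current = candidate, r

print("learned QPs:", qps)  # relevant blocks should end with the lowest QP
```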
Kevin (@kevineress) · 5h
… Prplxity: FocusRot: apply dynamic allocation between subtasks using techniques from LLMs, switching between rule acquisition, strategy planning, and error-correction phases. Cluster code injection increases model size by 15%. Socratic validation adds 0.2-0.4s latency per interaction.