Google DeepMind Unveils Advanced AI Models for Robotics
Google DeepMind has announced the development of two groundbreaking AI models designed to revolutionize robotic capabilities in real-world applications. The first model, Gemini Robotics, integrates vision, language, and action to enable robots to understand and respond to new situations effectively. Built upon the foundation of Gemini 2.0, this model expands Google’s AI prowess by incorporating physical actions into its multimodal understanding.
Gemini Robotics represents a significant leap forward in generality, interactivity, and dexterity. The model demonstrates an impressive ability to generalize across new scenarios, enhancing interaction with both people and environments. It excels at precise tasks, such as folding paper or removing bottle caps. Carolina Parada, who leads robotics at Google DeepMind, emphasized the model’s potential to improve robot responsiveness and robustness in various settings.
Complementing Gemini Robotics is Gemini Robotics-ER (Embodied Reasoning), a model focused on advanced visual and spatial understanding. This AI is designed for complex reasoning tasks, such as figuring out how to pack items into a lunchbox. By connecting with existing low-level controllers, Gemini Robotics-ER enables robotic capabilities that were previously challenging to achieve.
Safety remains a top priority in the development of these AI models. Google DeepMind has implemented a layered approach to ensure action safety and has released new benchmarks and frameworks to advance AI safety research. The company has also introduced a “Robot Constitution,” which guides robot behavior with principles inspired by Asimov’s laws of robotics.
In a move to accelerate progress in the field, Google DeepMind has partnered with Apptronik to develop next-generation humanoid robots. Additionally, trusted testers, including Agile Robots and Boston Dynamics, have been granted access to Gemini Robotics-ER for further testing and development.
Parada underscored the ultimate goal of these advancements: to create AI systems capable of understanding and acting in the physical world across a wide range of applications. As these models continue to evolve, they promise to unlock new possibilities in robotics and AI-driven automation.