Google DeepMind has released Gemini Robotics-ER 1.6, an upgraded embodied reasoning model for robotics applications. Key improvements include enhanced spatial reasoning (pointing, counting, object detection), better multi-view understanding across multiple camera streams, and a new instrument-reading capability, developed with Boston Dynamics, that lets robots accurately read analog gauges and sight glasses in industrial facilities. The model uses agentic vision, combining visual reasoning with code execution: it can zoom into image regions, point at objects, and run code to produce precise readings. Safety is also improved, with better refusal of adversarial spatial requests and stronger adherence to physical constraints. The model is available via the Gemini API and Google AI Studio, and a Colab notebook is provided for developers to get started.
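As a minimal sketch of how the pointing output might be consumed: Gemini's pointing responses are conventionally JSON objects with `[y, x]` coordinates normalized to a 0-1000 grid, which a client converts to pixel coordinates for its own camera resolution. The sample response, the JSON shape, and the commented-out model name are assumptions for illustration, not confirmed details of this release.

```python
import json

# Hypothetical pointing response: Gemini pointing outputs are typically JSON
# with [y, x] coordinates normalized to a 0-1000 grid (an assumption here).
SAMPLE_RESPONSE = json.dumps([
    {"point": [400, 250], "label": "pressure gauge"},
    {"point": [720, 880], "label": "sight glass"},
])

# A live call would look roughly like this (requires an API key; the model
# name below is a guess and should be checked against the docs):
#
#   from google import genai
#   client = genai.Client()
#   resp = client.models.generate_content(
#       model="gemini-robotics-er-...",
#       contents=[camera_image, "Point to every gauge, as JSON [{'point': [y, x], 'label': ...}]"],
#   )
#   response_text = resp.text

def points_to_pixels(response_text, width, height):
    """Convert normalized [y, x] points (0-1000 grid) to (x, y) pixel coords."""
    points = json.loads(response_text)
    return [
        {
            "label": p["label"],
            "x": round(p["point"][1] / 1000 * width),
            "y": round(p["point"][0] / 1000 * height),
        }
        for p in points
    ]

pixels = points_to_pixels(SAMPLE_RESPONSE, width=1920, height=1080)
print(pixels[0])  # → {'label': 'pressure gauge', 'x': 480, 'y': 432}
```

Downstream, these pixel coordinates could seed a crop-and-zoom step so the model can re-read a gauge face at higher resolution, which is the kind of loop the agentic-vision description above implies.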