Gemini Commands Real Robots + OpenAI's New Agent Arsenal ⚙️

Robots from Google and new agent capabilities from OpenAI…
In this edition we’ll be covering…
How Google is starting to bring AI into the physical world
A breakdown of OpenAI’s new agent tools
A tutorial on how to speed up your data work in Google Colab
5 trending AI signals
3 more AI tools to explore
And much more…
The Latest in AI
Google’s Next Big Play? Humanoid Robots…
Google is taking AI beyond the screen and into the physical world with Gemini Robotics, a new AI system designed to control robots with natural language commands.
Announced this week, Gemini Robotics and Gemini Robotics-ER (Extended Reasoning) are built on Gemini 2.0 and allow robots to understand their surroundings, adapt on the fly, and manipulate objects with human-like dexterity.
The breakthroughs that set Gemini Robotics apart…
Robots can learn new tasks on the spot, even ones they weren’t trained for.
They understand natural language commands, respond in real time, and adjust their actions if objects move or tasks change.
They can perform precise, multi-step movements, like zipping a bag or folding origami.
They work across different robot types, including Apptronik’s Apollo humanoid robot.
So What?
Google’s move into robotics puts it in direct competition with Figure and Tesla, both of which are working on AI-powered humanoid robots.
While still in development, Gemini Robotics has already doubled performance on AI-driven robotics benchmarks and is being tested by companies like Boston Dynamics and Agility Robotics.
If AI-powered robots become mainstream, Google might just be the company leading the way…
Together with Superhuman AI
Find out why 1M+ professionals read Superhuman AI daily.
In 2 years you will be working for AI
Or an AI will be working for you
Here's how you can future-proof yourself:
Join the Superhuman AI newsletter – read by 1M+ people at top companies
Master AI tools, tutorials, and news in just 3 minutes a day
Become 10X more productive using AI
Join 1,000,000+ pros at companies like Google, Meta, and Amazon who are using AI to get ahead.
Innovation Showcase
OpenAI’s Agent Awakening
OpenAI just released a new set of AI agent tools, giving developers and enterprises the power to build autonomous AI systems with its models.
At the core of this launch is the Responses API, which replaces the Assistants API and allows AI agents to search the web, scan company files, and even navigate apps and websites. OpenAI is betting that these tools will push AI agents from flashy demos to real-world utility.
What’s New?
Responses API – A new foundation for AI agents, combining Chat Completions with built-in tool use.
Web Search Integration – Uses GPT-4o search and GPT-4o mini search to provide real-time, cited answers.
File Search – Lets AI quickly scan through internal company databases for relevant information.
Computer-Using Agent (CUA) – The same AI model that powers Operator, capable of automating tasks like data entry, website navigation, and app workflows.
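To make the list above concrete, here is a minimal sketch of how a developer might call the Responses API with the built-in web search tool. The tool type `web_search_preview` and the model choice are assumptions based on the launch announcement; check OpenAI's official documentation before relying on them.

```python
# Sketch of a Responses API call with built-in web search.
# Tool/model names are assumptions from the announcement, not verified.

def build_agent_request(question: str, use_web_search: bool = True) -> dict:
    """Assemble the keyword arguments for client.responses.create()."""
    tools = []
    if use_web_search:
        # Built-in web search tool (type name assumed from launch docs)
        tools.append({"type": "web_search_preview"})
    return {
        "model": "gpt-4o",  # assumed model choice
        "input": question,
        "tools": tools,
    }

# With an API key set, the actual call would look like
# (requires `pip install openai`):
#
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.responses.create(**build_agent_request("Latest AI news?"))
#   print(response.output_text)

req = build_agent_request("What did Google announce this week?")
print(req["tools"])
```

The point of the design is that tool use is declared in the request rather than wired up by hand, so swapping web search for file search is a one-line change to the `tools` list.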
So What?
OpenAI (just like everyone else) is making a bold bet: 2025 will be the year AI agents become truly useful.
While the tech still has accuracy issues, OpenAI’s approach—giving developers the tools to build their own agents with its models—could be the key to making AI assistants actually worth using.
Tool Spotlight
Data Science Without the Drudgery

Data science agent goes to work…
Long gone are the days of performing EDA and training models by hand in a notebook.
Google recently released the Data Science Agent in Google Colab, powered by Gemini. It removes tedious setup tasks like importing libraries, loading data, and writing boilerplate code — potentially freeing you from the tyranny of typing `import pandas as pd` forever…
Here’s how you can use it for data analysis:
Navigate to Google Colab and open up a new notebook.
Hit the Gemini button on the top right.
Upload data and define your goals (e.g., "Visualize trends," "Build and optimize prediction model")
Sip your coffee and let the agent get to work!
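For a sense of what the agent is writing on your behalf, here is the kind of boilerplate it automates: load data, run quick EDA, and fit a baseline. The tiny inline dataset and column names are made up for illustration.

```python
# The kind of setup work the Data Science Agent automates: load a
# dataset, inspect it, and fit a quick baseline. Uses a tiny inline
# dataset so it runs anywhere; the columns are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "ad_spend": [10, 20, 30, 40, 50],
    "sales":    [25, 44, 61, 83, 102],
})

# Quick EDA: shape and summary statistics
print(df.shape)
print(df.describe())

# Baseline trend: least-squares slope of sales vs. ad_spend
slope = df["ad_spend"].cov(df["sales"]) / df["ad_spend"].var()
print(round(slope, 2))  # → 1.93
```

In Colab, the agent generates cells like these from your plain-English goal, so you review and run the code rather than type it.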
Quick Bites
Stay updated with our favorite highlights, dive in for a full flavor of the coverage!
Introducing YouTube video 🎥 link support in Google AI Studio and the Gemini API. You can now pass in a YouTube video directly and the model will use its native video understanding capabilities on it, with just a link! 🚢
— Logan Kilpatrick (@OfficialLoganK)
8:03 PM • Mar 12, 2025
Google has introduced support for YouTube video links in its AI Studio and Gemini API.
Google unveiled Gemma 3, a versatile AI model supporting over 140 languages and multimodal inputs.
Meta has begun testing its first internally developed AI training chip.
Mistral launched a new OCR API that converts PDFs into AI-ready Markdown files, streamlining document processing workflows.
Dario Amodei, CEO of Anthropic, shared insights on the evolving role of coding in the age of AI.
Trending Tools
🦉 OWL - Another open-source alternative to the viral Manus Agent.
💸 Strama - Sales outreach done for you with AI.
🦜 LangSmith - All-in-one developer platform for every step of the LLM-powered application lifecycle.
The Neural Network
Quite a Google-heavy newsletter today, so I need to cap it off with this.
This is what AI advancement is really for…

Until we Type Again…
Thank you for reading yet another edition of Digestible AI!
How did we do? Your feedback helps us create better newsletters!
If you have any suggestions or specific feedback, simply reply to this email. Additionally, if you found this insightful, don't hesitate to engage with us on our socials and forward this over to your friends!