Google Cloud Next 2024 — Minority Report

Sam Lin
8 min read · Apr 15, 2024
Google 2000-1900

It’s been a busy week for Cloud; you can check out all 218 things announced at Google Cloud Next ’24. Sure, the current AI boom is a bubble within another bubble, as Jeremy Grantham, co-founder of GMO LLC, called it. Nevertheless, quite a few great tech inventions were founded during one bubble or another. So instead of worrying too much about bubbles, let’s focus on making the best of it by riding the tidal wave. Psst! One from the “old Google playbook”: more Internet/AI usage is always better for the ads/Cloud business. In other words, create value with AI innovation stacks, then capture that value on the Cloud.

The AI opportunity: Sequoia Capital’s AI Ascent 2024 opening remarks — Why now?

To B (Businesses), or not to B (Consumers)

“Use the force, Luke” : Old ideas + AI = new ventures

It has been less than two years since ChatGPT kicked off a renaissance for chat bots. Sure, chat bots are great for consumers (B2C). But the question is: how many chat bots do consumers really need? Psst! I barely use any search engine besides Google’s, though I do use Amazon search when shopping in the app. For most new ventures, the real gold mine will be purpose-built AI apps.

The AI opportunity: Sequoia Capital’s AI Ascent 2024 opening remarks — Real user scale

Agentic Workflow — Beyond Few Shots

Currently, we ask LLMs to “just type the essay from start to finish without ever using backspace” — What’s next for AI agentic workflows, Andrew Ng of AI Fund

AI agents are the current rush to make Gen AI more useful in real life. Creative work is never a linear, one-shot journey: we scribble and scrap to build our pieces, and the more we iterate, the better the pieces become. Andrew Ng shared great Agentic Reasoning Design Patterns: Reflection, Tool use, Planning, and Multi-agent collaboration/conversations. Welcome to the dawn of the AI agent. Just count how many times “agent” is mentioned in the keynote.
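To make the Reflection pattern concrete, here is a minimal sketch of a draft, critique, revise loop. It is only an illustration, not Andrew Ng's implementation: the function name `reflect_and_revise`, its parameters, and the toy stand-ins are all hypothetical, and in practice each callable would wrap a real LLM call.

```python
from typing import Callable


def reflect_and_revise(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Draft once, then alternate critique and revision until the critic has no notes."""
    draft = generate(task)
    for round_no in range(max_rounds):
        notes = critique(task, draft)
        if not notes:  # empty feedback means the critic is satisfied
            return draft, round_no
        draft = revise(task, draft, notes)
    return draft, max_rounds


# Toy stand-ins (assumptions) so the sketch runs without any model.
def toy_generate(task: str) -> str:
    return f"draft about {task}"


def toy_critique(task: str, draft: str) -> str:
    return "" if draft.endswith("!") else "end with an exclamation mark"


def toy_revise(task: str, draft: str, notes: str) -> str:
    return draft + "!"


final, rounds = reflect_and_revise("agents", toy_generate, toy_critique, toy_revise)
print(final, rounds)
```

The `max_rounds` cap is the design choice that matters: it keeps the loop from running forever when the critic is never satisfied, which is also a natural place to hand the draft back to a human.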

But chat bots will not cut it for pros. It’ll be easier to meet the pros where they are in their workflow. Furthermore, I want a human in the loop, end to end and always. So a better opportunity will be AI-augmented tools, aka an AI-first Studio for [X]. For example, Google Vids, a new AI-powered video creation app for work, is coming to Google Workspace soon.

So, what tool might you purpose-build with AI agents? Psst! The new Vertex AI Agent Builder may help you build and deploy gen AI experiences quickly.

What’s next for AI agentic workflows: Agentic Reasoning Design Patterns

Image Creative Agents

There are already many popular AI image generators, such as DALL·E 3, Midjourney & Stable Diffusion. They are perfect for consumers. At the other end, there are more professional tools, like Adobe Firefly or Pixlr. These no-code options are gaining traction and getting better as their usage scales up. I can’t wait to see what they can become with Agentic Reasoning capabilities.

But it’s no fun if chat bots are the only way forward, because I bet you can find some pain points or tasks unique to your users. Or the image step is just one part of your bigger workflow. What if it could be integrated seamlessly into your workflow? Psst! Google Cloud just updated Imagen to deliver more advanced generative media capabilities, just for that. Imagen 2.0 stands out with a few unique innovations built in.

  1. Identifying AI-generated content with Google DeepMind’s SynthID, which is a big deal for future content platforms. For example, scammy AI-generated book rewrites are flooding Amazon. As a buyer, how do you feel when you are scammed? How long will it take for buyers, and even writers, to have enough of the scams and switch to a better platform?
  2. Adding and removing elements, proper text rendering, fluid style conditioning, outpainting, etc. are useful for crafting images that realize your imagination.
  3. Logo generation can be useful for small businesses & communities. And animated GIF generation makes Harry Potter’s moving portraits a reality.
SynthID: A new tool for watermarking and identifying AI-generated content

Take 1, 2, 3

“we also leverage GPT to turn short user prompts into longer detailed captions that are sent to the video model. This enables Sora to generate high quality videos that accurately follow user prompts.” — Open AI Sora technical report

Image Creative Agents can do much more with more APIs at their disposal, and do better under human guidance. Here is just a concierge prototype to illustrate the idea.

  1. Ask top chat bots to generate images.
  2. Prompt AI image generators with the ChatGPT-generated Stand Appearance description.
  3. Ask ChatGPT to generate prompts from the description, and then prompt AI image generators with them.
  4. Next: how might you take it forward from here?
JoJo’s Bizarre Adventure STONE OCEAN | Official Trailer | Netflix
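The takes listed above can be sketched as a tiny pipeline. This is a rough sketch under stated assumptions, not a real integration: `TextModel`, `ImageModel`, `split_candidates`, `take_2`, `take_3` and the fake models are hypothetical names standing in for whichever chat bot and image generator you wire up. Only the wording of `REFINE_TEMPLATE` comes from the prompt used in Take 3.1 of this post. Take 1 is simply a single call to a chat bot that can draw, so it is left out.

```python
from typing import Callable

TextModel = Callable[[str], str]     # hypothetical wrapper: prompt in, chat reply out
ImageModel = Callable[[str], bytes]  # hypothetical wrapper: prompt in, image bytes out

# Wording taken from the Take 3.1 prompt in this post.
REFINE_TEMPLATE = "Create prompts for an AI image generator to generate images of: {description}"


def split_candidates(reply: str) -> list[str]:
    """Treat every non-empty line of a chat reply as one candidate prompt."""
    lines = (line.strip().strip('“”"') for line in reply.splitlines())
    return [line for line in lines if line]


def take_2(description: str, image_model: ImageModel) -> bytes:
    """Take 2: send the chat bot's appearance description straight to the generator."""
    return image_model(description)


def take_3(
    description: str,
    chat: TextModel,
    image_model: ImageModel,
    choose: Callable[[list[str]], str],
) -> bytes:
    """Take 3: a chat bot refines the prompt, a human chooses one, then generate."""
    reply = chat(REFINE_TEMPLATE.format(description=description))
    prompt = choose(split_candidates(reply))  # human-in-the-loop step
    return image_model(prompt)


# Fake models (assumptions) so the sketch runs offline.
def fake_chat(prompt: str) -> str:
    return "Here are some prompts you can use:\n\n“A Renaissance painter stand”"


def fake_image(prompt: str) -> bytes:
    return prompt.encode("utf-8")  # pretend the prompt text is the image


image = take_3("a humanoid stand", fake_chat, fake_image, choose=lambda options: options[-1])
```

The `choose` callback is where the human stays in the loop: the chat reply also contains preamble lines, so a person (or a smarter filter) decides which candidate is actually sent to the image generator.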

Take 1: Chat bots as image generators

  • Prompts: Generate a picture of an JoJo’s Bizarre Adventure style artist stand, which can transform its host’s imagination to a high quality painting. Whoever sees the painting will be shouting take my money.
  • Results: Gemini generates 2 images, but I will not use either of them as is. ChatGPT is somewhat useful: it generates the stand card of Masterpiece Maker. Claude cannot generate images yet.
Google Gemini to generate “the Masterpiece Maker stand”

ChatGPT 3.5 generated [Masterpiece Maker] stand card

Stand Name: [Masterpiece Maker]

Stand Ability: [Masterpiece Maker has the ability to turn the imagination of its host into high-quality, lifelike paintings. These paintings are incredibly detailed and realistic, capturing the essence of the imagined scene or object with stunning accuracy. Anyone who sees these paintings is overcome with an intense desire to possess them, often leading them to shout phrases like “Take my money!” or “I must have it!”]

Stand Appearance: [Masterpiece Maker appears as a humanoid stand with a body resembling a classical artist, wearing elegant, Renaissance-style clothing. It has a paintbrush for one hand and a palette in the other. Its eyes glow with creativity, and its aura exudes an air of artistic genius.]

Stand Stats:

Power: B

Speed: C

Range: A

Durability: B

Precision: A

Potential: A
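Since Take 2 reuses only the Stand Appearance field, it can help to pull the card apart programmatically. The following is a small sketch, assuming the card arrives as plain text with one `Key: [value]` or `Key: value` field per line, as in the ChatGPT output above. The names `FIELD`, `parse_card` and `CARD` are made up for illustration, and the card text is shortened to an excerpt.

```python
import re

# One field per line: "Key: [value]" or "Key: value"; the brackets are optional.
FIELD = re.compile(r"^(?P<key>[^:\n]+):[ \t]*\[?(?P<value>.*?)\]?[ \t]*$", re.MULTILINE)


def parse_card(text: str) -> dict[str, str]:
    """Return the non-empty fields of a stand card as a dictionary."""
    return {m["key"].strip(): m["value"] for m in FIELD.finditer(text) if m["value"]}


# Excerpt of the card above (the appearance is cut to its first sentence).
CARD = """Stand Name: [Masterpiece Maker]
Stand Appearance: [Masterpiece Maker appears as a humanoid stand with a body resembling a classical artist, wearing elegant, Renaissance-style clothing.]
Stand Stats:

Power: B

Speed: C
"""

card = parse_card(CARD)
appearance_prompt = card["Stand Appearance"]  # text to hand to an image generator
```

Header-only lines such as `Stand Stats:` carry no value and are skipped, so the dictionary holds just the fields that can be reused downstream.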

Take 2: Using the Stand Appearance from ChatGPT

  • Prompts: Masterpiece Maker appears as a humanoid stand with a body resembling a classical artist, wearing elegant, Renaissance-style clothing. It has a paintbrush for one hand and a palette in the other. Its eyes glow with creativity, and its aura exudes an air of artistic genius.
  • Results: ImageFX, Pixlr & Adobe Firefly generated images that mostly matched the prompt, but not really what I’d use directly. Gemini’s ability to generate images of people was still a work in progress as of 14 Apr. 2024.
Take 2: ImageFX screenshot
Take 2: Pixlr screenshot
Take 2: Adobe Firefly screenshot
Take 2: Gemini screenshot

Take 3.1: Chat bots as prompt refiners

  • Prompts: Create prompts for an AI image generator to generate images of: Masterpiece Maker appears as a humanoid stand with a body resembling a classical artist, wearing elegant, Renaissance-style clothing. It has a paintbrush for one hand and a palette in the other. Its eyes glow with creativity, and its aura exudes an air of artistic genius.
  • ChatGPT:

Here are some prompts you can use:

“Generate an image of a humanoid stand resembling a classical artist from the Renaissance era. The stand is wearing elegant clothing and holds a paintbrush in one hand and a palette in the other. Its eyes are glowing with creativity, and its aura exudes an air of artistic genius.”

“Create an image of a stand that looks like a Renaissance artist, with a human-like body dressed in elegant, old-fashioned clothing. The stand’s right hand is holding a paintbrush, and its left hand is holding a palette. Its eyes are glowing with creativity, and it emanates an aura of artistic brilliance.”

“Generate an image of a humanoid stand that embodies the essence of a classical artist. The stand is dressed in Renaissance-style clothing and holds a paintbrush and palette. Its eyes emit a radiant glow of creativity, and its aura reflects its artistic genius.”

  • Gemini:

Here are some prompts for an AI image generator to create an image of Masterpiece Maker:

Text Prompt:

A majestic humanoid Stand named Masterpiece Maker. It has the physique of a classical artist, clad in flowing Renaissance-style garments. One hand grasps a delicate paintbrush, while the other holds a vibrant artist’s palette. Its eyes shimmer with boundless creativity, and its entire form radiates an aura of artistic mastery.

Additional details you can include (optional):

The Stand is bathed in a soft, ethereal light that emphasizes its artistic aura.

The paintbrush bristles are made of pure energy, swirling with a kaleidoscope of colors.

The artist’s palette is overflowing with an impossible spectrum of colors, hinting at the limitless potential of Masterpiece Maker’s ability.

The background can be a grand art gallery filled with masterpieces, or a breathtaking landscape waiting to be captured on canvas.

Take 3.2:

  • Prompts: Generate an image of a humanoid stand resembling a classical artist from the Renaissance era. The stand is wearing elegant clothing and holds a paintbrush in one hand and a palette in the other. Its eyes are glowing with creativity, and its aura exudes an air of artistic genius.
  • Results: ImageFX, Pixlr & Adobe Firefly generate images. But they are still not really what I want.
Take 3: ImageFX screenshot
Take 3: Pixlr screenshot
Take 3: Adobe Firefly screenshot

Sam Lin

A Taiwanese living in Silicon Valley since 2014, with my own random opinions to share. They are my own, not those of the companies I work for.