Replace Stuff in Images Using Only Text
Text-based inpainting lets you replace objects in existing images simply by describing what you want to change. It's as straightforward as it sounds.
The concept of AI-driven inpainting—replacing specific parts of an image with something else—isn’t new.
DALL-E and Stable Diffusion have had this functionality for months now:
Inpainting isn’t very complicated. You just:
Upload an image
Erase (mask) the parts you want replaced to mark the inpainting area
Describe to the AI model what you want to see there
Profit?
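If you like to tinker, here's roughly what that same workflow looks like in code. This is only a minimal sketch using the open-source diffusers library with a Stable Diffusion inpainting checkpoint (the exact checkpoint is one possible choice, the file names are placeholders, and the mask is one you've already drawn, white where the replacement should go):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# Load a Stable Diffusion checkpoint fine-tuned for inpainting.
# Assumes a CUDA GPU is available; drop torch_dtype/.to("cuda") to run on CPU (slowly).
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder file names: the photo plus a black-and-white mask
# where white marks the erased area you want regenerated.
image = Image.open("easter_basket.jpg").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("RGB").resize((512, 512))

# Describe what should appear in the masked region.
result = pipe(prompt="a tennis ball", image=image, mask_image=mask).images[0]
result.save("output.png")
```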
But a new tool has taken this simplicity to the next level.
Let’s take a quick look.
Text-based inpainting demo
The tool is currently a standalone demo, but my guess is that it won’t be long before we see this feature incorporated into AI generators by default.
Here’s how it works.
1. Go here: Interactive demo: Text-based inpainting
2. Upload an image. I’ve used this Pixabay photo of an Easter egg basket:
3. Specify the object you want replaced (and with what):
4. Get your output image:
Super cool, right?
What’s impressive is that the tool can zero in on an object based on additional descriptors. I deliberately picked a basket of eggs to see if I could pinpoint just the yellow one for replacement.
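Under the hood, this kind of text-driven selection is typically done with a text-prompted segmentation model that turns your description into a mask, which is then handed to a regular inpainting pipeline. I don’t know exactly what the demo runs on, but here’s a rough sketch of the idea using the publicly available CLIPSeg model (the file name and the 0.3 threshold are purely illustrative):

```python
import torch
import numpy as np
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

# Placeholder file name; any RGB photo works.
image = Image.open("easter_basket.jpg").convert("RGB")

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

# Ask the segmentation model to find the object described in plain text.
inputs = processor(text=["the yellow egg"], images=[image], return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # low-res heatmap of where the prompt matches

# Turn the heatmap into a binary mask (white = region to replace), resized to the photo.
heatmap = torch.sigmoid(logits.squeeze()).numpy()
mask = Image.fromarray((heatmap > 0.3).astype(np.uint8) * 255).resize(image.size)
mask.save("mask.png")
```

The resulting mask can then be fed straight into an inpainting pipeline like the sketch earlier, with the prompt set to whatever you want in the yellow egg’s place.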
A few caveats
Nothing’s perfect, so here are a few things to keep in mind:
The object you want replaced should ideally exist in a single version. It’d be harder to tell the AI which of the two yellow eggs you want replaced.
It should be easy to identify and isolate. Selecting a specific star in a photo of a starry sky is tricky using just text.
The replacement object should have roughly the same size and shape. I tried replacing the egg with a banana and the AI had to get, uh, creative:
You can expect imperfections and artifacts. Have you noticed what happened to the blue egg next to our tennis ball?
Still, it’s quite extraordinary that we can now edit images using nothing but words. I hope to see this functionality perfected and implemented in other tools soon!
Over to you...
Have you played around with inpainting in DALL-E, Stable Diffusion, or other products? What’s your impression of this text-based functionality?
If you know of products with similar features, I’d love to hear about them. Drop me an email or leave a comment.