Strikingloo

Text to Image Art: Experiments and Prompt Guide for DALL-E Mini and Other AI Art Models

Ever since DALL-E came out in January 2021, or even before that with PixelRNN, I’ve found generative models, especially for images, amazing. What interested me most, though, was how to write better prompts that yield the most beautiful images.

The idea of text-to-image generators built on models like CLIP or GLIDE is even more astounding, and I love playing with them and trying to get a glimpse of the way they “perceive” textual and image inputs. I think interpretability is a fascinating field of study, and understanding a model’s representations may yield ideas for better models in the future (though usually the simplest way to make a better model is training a bigger one with more compute).

Giving a model a prompt to generate an image from becomes sort of addictive, and I’ve spent longer than I am proud to admit fiddling around with these. Here are some of the images I made with EleutherAI’s .imagine model (which I think is a VQVAE like DALL-E) and some I made using everyone’s favorite viral text-to-image generator, DALL-E Mini (a small open source version of DALL-E 2, a guided diffusion model with extra steps).

What follows is a small guide on prompt engineering for DALL-E Mini and other text-to-image models, then a showcase of the prompts I used for generating images and the images I obtained. These were cherry-picked, as they were the ones I liked the most.

The first batch was all made with EleutherAI’s .imagine model, and the second one was made with DALL-E Mini, as you can tell from the UI visible in the screenshots.

I found the comparisons where I used the same prompt in both models particularly interesting, as they showcase how much better DALL-E Mini is at composition, even though the VQVAE model, being bigger and taking longer, generates images at a higher resolution and thus renders textures and objects more believably.

I also hope seeing these prompts can give you a small hint of how to make better prompts for your own ideas (usually, just appending ‘digital painting’ or ‘oil painting’ and ‘artstation’ at the end will do half the trick).

My general impression is that DALL-E Mini can generate satisfying results when the prompt is well constructed and asks for inanimate objects, landscapes, buildings and so on. However, illustrations containing animals, people, humanoids or anything that moves, or prompts that ask for a specific action or use verbs, usually get poor results. Bigger models usually deal better with humanoid or animal shapes, and construct scenes better when they include action.

Feel free to steal any of these images and use them for anything, or share them on social media. I mostly tried prompts that dealt with Biblical or mythological themes, because for some reason I found most people didn’t do those sorts of prompts as much (or maybe because I’m a big fantasy/D&D geek).

Update: Besides Craiyon, I’ve found DALL-E Flow, a Colab notebook that uses Jina AI, to be the best tool for generating beautiful DALL-E images, and I recommend everyone give it a spin. It’s free and open source, and I’m loving it.

I also wrote separate articles after experimenting with OpenAI’s DALL-E 2 and its open source competitor, Stable Diffusion, both of which blew my mind and made me think of the future differently.

How to write prompts for DALL-E / StableDiffusion

Usually, what I do is write what I want (adjectives + nouns usually get better results than verbs or complex scenes), then append

Using this simple framework often gets me results close to what I want. If you have any tips on how to do better, tell me on Twitter.

For example, this is a prompt that gave me great results in Craiyon.

‘Cluttered house in the woods | anime oil painting high resolution cottagecore ghibli inspired 4k’

As you can see, just appending ‘| oil painting high resolution 4k’ will improve most of your results. You can then also add a style cue like ‘Ghibli inspired’, ‘Giger’ or ‘Salvador Dali’.
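The framework described above can be sketched as a tiny helper. This is only an illustration, not part of Craiyon or any other tool: the function name `build_prompt` and its parameters are hypothetical, and the choice of a ` | ` separator followed by space-separated modifiers is an assumption taken from the example prompt.

```python
def build_prompt(subject, style_cues=(), quality_tags=("oil painting", "high resolution", "4k")):
    """Join a subject and its modifiers into a single prompt string.

    Assumption: modifiers are space-separated and placed after a ' | '
    separator, mirroring the Craiyon example prompt in the text.
    """
    modifiers = [*style_cues, *quality_tags]
    if not modifiers:
        # Nothing to append, so the subject is the whole prompt.
        return subject
    return f"{subject} | {' '.join(modifiers)}"


# Subject first (adjectives + nouns), then style cues, then quality tags.
prompt = build_prompt(
    "Cluttered house in the woods",
    style_cues=["anime", "cottagecore", "ghibli inspired"],
)
print(prompt)
```

The modifiers come out in a slightly different order than in the hand-written prompt. I have not tested whether the order matters to the model.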

Other DALL-E/Craiyon prompt templates for future use:

That last one has worked well for me for almost any prompt that describes a simple static scene (like houses, cities, landscapes or interiors) or a single humanoid/animal/plant. You can append styles to these at the end too (Eldritch, disco, lo-fi, etc.).
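Appending styles at the end makes it easy to try several variations of one scene. Here is a minimal sketch, assuming styles are added as a comma-separated suffix the way the showcase prompts further down end in ‘cottagecore, cozy’. The helper name `style_variants` is hypothetical.

```python
def style_variants(base_prompt, styles):
    """Return one prompt per style, each with the style appended at the end."""
    return [f"{base_prompt}, {style}" for style in styles]


# Base prompt taken from the showcase below, minus its trailing style words.
base = (
    "A glade under the stars | Gorgeous digital painting with sober colours "
    "amazing art mesmerizing, captivating, artstation 3"
)

for variant in style_variants(base, ["Eldritch", "disco", "lo-fi"]):
    print(variant)
```

Each printed line can then be pasted into the generator of your choice, which makes it easier to compare how a single style word changes the same scene.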

VQVAE (EleutherAI)

Steampunk inventor’s library | Gorgeous digital painting with sober colours amazing art mesmerizing, captivating, artstation 3, cozy

mechanical clockwork flying machine renaissance | Gorgeous digital painting with sober colours amazing art mesmerizing, captivating, artstation 3, realistic, render materials

A beautiful painting of waves crashing on a cliff by Thomas Cole

A glade under the stars | Gorgeous digital painting with sober colours amazing art mesmerizing, captivating, artstation 3, cottagecore, cozy

A disco coral reef underwater | Gorgeous digital painting with aggressive colours amazing art mesmerizing, captivating, artstation 3, cozy, lo-fi

The sunken city of R’lyeh lies dormant | Breath-taking digital painting with dark colours amazing art mesmerizing, captivating, artstation 3, Lovecraftian, eldritch

The white fox in the Arcadian praerie | Breath-taking digital painting with vivid colours amazing art mesmerizing, captivating, artstation 3, japanese style

The green idyllic Arcadian praerie with sheep | Breath-taking digital painting with placid colours amazing art mesmerizing, captivating, artstation 3, cottagecore

A beautiful painting of the Garden of Eden by Thomas Cole

Biblical References

A beautiful painting of the Tower of Babel by Thomas Cole

the Tower of Babel by beeple gurney richter | 3D Depth Shader;special effects;production values;movie FX;VFX;sci-fi;4K resolution;high dynamic range;Dolby Vision;hdr10;atmos;3 dimensional;vray; ray tracing;hyperrealistic;

the Garden of Eden by beeple gurney richter | 3D Depth Shader;special effects;production values;movie FX;VFX;sci-fi;4K resolution;high dynamic range;Dolby Vision;hdr10;atmos;3 dimensional;vray; ray tracing;hyperrealistic;matte painting

Fantasy Prompts

A breath-taking painting of an eldritch beetle god that should not be by Thomas Cole | old masters, artstation 3

A breath-taking painting of Valkyries riding pegasi on the clouds over cliff by Thomas Cole | old masters, artstation 3

golden fae stands on flower at night | Gorgeous digital painting with rich colours amazing art mesmerizing, captivating, artstation 3, cottagecore aesthetic

Fae Blessing | Breath-taking digital painting with placid colours amazing art mesmerizing, captivating, artstation 3

An etching of a Troll with a grin by Gustave Doré | artstation 3

A wood engraving of Goblins by Gustave Doré

Refreshing composition, thick texture. Grotesque Centaur by Zdzisław Beksiński and Geiger inspired

Grotesque Centaur by Zdzisław Beksiński and Geiger inspired

A centaur in a glade under the stars | Gorgeous digital painting with sober colours amazing art mesmerizing, captivating, artstation 3, cottagecore, cozy

fantasy tavern | Breath-taking digital painting with warm colours amazing art mesmerizing, captivating, artstation 3, cottagecore

fantasy tavern interior | Breath-taking digital painting with warm colours amazing art mesmerizing, captivating, artstation 3

fantasy tavern interior | Breath-taking digital illustration with warm colours amazing art mesmerizing, captivating, artstation 3, D&D Style

purple Fungi from Yuggoth | Gorgeous digital painting with intense colours amazing art mesmerizing, captivating, artstation 3, thought-provoking, dark

3D hyperrealistic materials and soft warm lighting | 3d render amazing graphics shaders 4k UHD| An amazing digital illustration depicting Archangel Michael spread enormous wings at night | artstation 3, hyperrealist, realistic

DALL-E Mini

11 Jul 2022 - importance: 7