AI Image Generators: Midjourney v6 vs DALL-E v3
In the past few months I’ve been testing AI image generators and have begun using them to create visuals for both fun and marketing. I love this new ability to envision a custom image that complements the topic I’m writing about. And with the recent release of Midjourney’s version 6, AI image generation has taken yet another leap forward. After enjoying a holiday “break” during which I dove headlong into this new update to Midjourney, I thought I’d share my take on where AI image generation now stands, and compare Midjourney to its closest competitor, DALL-E 3.
To set the table, the primary field of players includes Stable Diffusion, Adobe Firefly v2, DALL-E v3, and the new Midjourney v6.
But before I dive into the models, allow me to start with a bit of context. AI image generation blew up this year, and we’re now getting models that are capable of rivaling custom photography. As a photographer, yes, this is disheartening, but as a marketer who creates content, it is very exciting. This is especially good for those who need images of minorities, and especially of people with disabilities, for whom stock photography is quite limited.
Each model combines machine learning and creative expression to create images from written prompts. Each is far from perfect, and requires patience… a lot of patience to fine-tune results through trial and error until you get the visual you want. For anyone just starting out, this may come as a surprise, but I will say that it builds your skills in envisioning a desired image and converting that vision into words.
Midjourney
Midjourney Inc. is a private company based in San Francisco and founded by David Holz. The first version of the model launched in beta in March 2022 on Discord, a separate platform for communication and communities. Before version 6, the most recent release was version 5.2.
Midjourney 6 Alpha
Version 6 continues to be accessible via Discord as of this writing, but access through a more standard website is expected soon. And I can’t wait. There’s nothing fun about using Discord and I look forward to using an effective UI.
Where MJ6 stands out is in rendering amazingly realistic images. It seems to have a much better grasp of composition and of the relationships between objects in an image.
Comparing MJ6 to MJ5.2
Compared with 5.2, Midjourney 6 creates images that are more realistic, crisp, detailed and accurate. The nuances and depth have increased, especially for things like backlighting, reflections, and the natural world overall.
Midjourney v6
Midjourney v5.2
While the 5.2 image does have fine detail, in version 6, Chewie looks like he’s really chillin’.
Experimenting with Midjourney v6
Over the holiday, I was in a Star Wars mood and had fun hanging with Chewie. It took much patience and trial and error to get these. It often took dozens of prompts to get an image that I really liked.
While the detail of each is impressive, what I think is most interesting is the depth, the tone, the style and composition. These are nuanced images which tell stories.
prompt: photo of tiny Pikachu and huge Chewbacca on a couch. Chewbacca trying to take game controller from Pikachu. Dark room
prompt: Little Annie character smoking a fat lit cigar from side of mouth
prompt: a black 2019 Volkwagen Golf GTI with snow groomer cat tracks instead of wheels. Plowing through snow heading up mountain
prompt: a photo of cyberpunk post apocolyptic roadwarrior woman with shaved head ordering a drink at a dimly lit rough bar. she has abundant tatoos and peircings. leather vest, leather boots
prompt: a photo of Pikachu meeting the Queen of England
Midjourney 6 Pros
- For generating photo-real images, it’s the best. Images not only have more detail, but also more natural nuance. It gets nature and seems to lean toward taking its own poetic license.
- Artistic and nuanced
- Strong community via Discord
Midjourney 6 Cons
- Does not follow instructions well.
- Not iterative: you can’t respond to it and ask it to make adjustments
- Not good at graphic design such as logo creation
- Text generation has improved but still has a journey to take
- Requires Discord - for now
- User controls not as advanced as Firefly
ProTip: Add --style raw to the end of your prompt to get the most realistic images
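Midjourney parameters go at the very end of the prompt and start with a double hyphen, as in --style raw. If you keep a list of prompts and want to tack the same parameters onto each one before pasting them into Discord, a tiny helper can do it. This is only a sketch: build_prompt is a made-up name of mine, not part of any Midjourney tool or API, and it simply assembles text. I'm assuming you also want --v 6 to select version 6.

```python
def build_prompt(description: str, raw: bool = True, version: str = "6") -> str:
    """Append Midjourney-style parameters to a plain-text description.

    Hypothetical helper: it only builds the text you would paste
    after /imagine in Discord. It does not talk to Midjourney.
    """
    parts = [description.strip()]
    if raw:
        # --style raw asks for a less stylized, more photographic look
        parts.append("--style raw")
    # --v selects the model version (assumed here to be version 6)
    parts.append(f"--v {version}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt("a photo of Pikachu meeting the Queen of England"))
```

Parameters always come after the description, which is why the helper appends them rather than mixing them into the text.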
OpenAI's DALL-E 3
DALL-E, developed by OpenAI, may not deliver the same level of natural realism, but where it shines is in creating graphic design and illustrations. It can be accessed for free using Bing Image Creator, or through ChatGPT with GPT-4. One thing I really like as a paying ChatGPT user is the convenience of having the two combined. And not only are they in the same space: if you enter a vague prompt, ChatGPT may take the opportunity to juice it up (see the following example), adding details to get a better image back from DALL-E. I also really like the iterative process, which allows you to make adjustments in subsequent prompts, like adding or removing objects, increasing the styling, or anything else you dream up. This is hopefully the direction that Midjourney will take with its new website.
DALL-E enhanced prompt with followup
DALL-E’s usage restrictions may also be too much for some. For instance, I wanted it to create an ironic image of Orphan Annie smoking a cigar. DALL-E said no and suggested swapping the cigar for a lollipop. Chewie in a hot tub was also a no-go.
DALL-E alternative rendering of Orphan Annie smoking a cigar
DALL-E alternative rendering of Chewbacca in a hot tub
DALL-E Pros
- It follows instructions better
- Better at graphic design
- Better for highly saturated eye-catching social media images
- Infusion of GPT allows it to enhance the prompt you write to give DALL-E better instructions
- Iterative in nature, which lets the user interact with the image
DALL-E Cons
- Simplistic, immature design style
- Not quite capable of creating photo-realistic images
- Displays two images per prompt, versus Midjourney’s four
- Also has issues with text generation
- Restrictive usage rules
Output Comparisons: Midjourney 6 vs DALL-E 3
The proof is in the pudding, so let’s see some comparisons, starting with the user interface. Note that we’re expecting Midjourney’s new website UI soon, but I do want to show the difference between the two, and also show Adobe Firefly’s interface, which is the best.
User Interface Comparisons for Prompting
I'm first throwing in Adobe's Firefly UI, because it is just so good and easy to use. I hope to see Midjourney emulate this.
DALL-E offers nothing more than an open prompt field, and outputs two versions:
Midjourney via Discord... augh. Soon to be updated:
Comparing Renderings from DALL-E and Midjourney
First though, I want to include just one rendering from Adobe Firefly to start us off. The rest compare DALL-E and Midjourney, and I trust you'll see how much better Midjourney is at creating realistic visual spaces, and how DALL-E shines in graphic design.
Comparing Adobe Firefly vs DALL-E 3 vs Midjourney 6
Prompt: realistic photo of a brook in a lush forest in New England. Morning light streaking through the trees. A deer approaching the brook. Ferns.
< Firefly >
< DALL-E 3 >
< Midjourney 6 >
Comparing DALL-E 3 vs Midjourney 6
Prompt: realistic closeup photo of an elderly man in an urban setting, leaning against a door opening with his hand to his face smoking a cigar. No emotion. Distant look in eyes. Moody misty.
< DALL-E >
< Midjourney >
Comparing DALL-E 3 vs Midjourney 6
Prompt: a graphic designed logo for a company called Buzz Coffee that sells very strong coffee
< DALL-E >
< Midjourney >
Comparing DALL-E 3 vs Midjourney 6
Prompt: a creative graphic design square poster for an electronic music band from the late 1990s. Flyer art style.
< DALL-E >
< Midjourney >
Final Thoughts
I'm more of a photographer than a graphic designer, so I'm just blown away by the natural realism that Midjourney delivers. It gets mood. It gets color, depth, nuance and composition. It can create visual poetry.
Considering these capabilities were developed in just a year, just imagine what 2024 will bring for AI image creation… and video… and music…
Stay tuned...
and may The Force be with you.
- Dave