What the FLUX?

A new alternative to Midjourney and DALL·E 3 from the original team behind Stable Diffusion

Heather Cooper

Aug 02, 2024

Total reading time around 4 minutes.

Welcome to Visually AI!

🎧IntelliVerse Podcast

I’m thrilled to announce my podcast launch!

I wanted to have a forum to hear from people building, shaping, and utilizing generative AI.

Listen on Spotify

Listen on PodBean

In my first episode, I explore the world of generative AI and visual storytelling with my business partner, Thomas Haynes. Thomas shares his personal journey from initial AI skeptic to active user, and discusses the complexities, benefits, and risks of AI technologies.

Watch the video on YouTube:

Intelliverse #1 | An AI Skeptic's Journey of Discovery w. Thomas Haynes

I’ll be uploading these interviews soon:

Michael Lingelbach, Co-founder of Hedra

Victor Perez, Co-founder of Krea

Araminta K, Founder of Promptcrafted

🔮AI News This Week

FLUX.1: Rivals Midjourney, DALL·E 3 & SDXL

Black Forest Labs is a new venture formed by the O.G. team behind Stable Diffusion, focused on advancing generative AI technology.

They introduced FLUX.1, a cutting-edge AI model designed for high-quality image generation, offering creators a blend of creative freedom and precision. The company emphasizes open research and innovation, providing access to their models and tools through their website.

FLUX.1 models are built on multimodal and parallel diffusion transformer technology. According to Black Forest Labs, they surpass other popular models in visual quality, prompt following, and output diversity.

The FLUX.1 [schnell] variant is particularly notable for its speed and efficiency, outperforming many of its competitors. Here's a summary of the variants:

FLUX.1 [pro] / [dev]:
- High visual quality
- Strong prompt following
- Great output diversity

FLUX.1 [schnell]:
- Fast and efficient
- Maintains high-quality output

Comparison:

“FLUX.1 [pro] and [dev] surpass popular models like Midjourney v6.0, DALL·E 3 (HD) and SD3-Ultra in each of the following aspects: Visual Quality, Prompt Following, Size/Aspect Variability, Typography and Output Diversity.
FLUX.1 [schnell] is the most advanced few-step model to date, outperforming not only its in-class competitors but also strong non-distilled models like Midjourney v6.0 and DALL·E 3 (HD).”

I tested FLUX on Fal’s AI Playground and was impressed by the crisp details, prompt adherence, and accurate typography rendering:

The FLUX.1 models are available on several platforms, and the code is open source, so you can run it on your own machine.

This is a short list of places to get started:

📸AI Snapshots

Google made their experimental version of Gemini 1.5 Pro available for anyone to use on Google AI Studio.

Leonardo AI and Canva have partnered to enhance creative design capabilities by integrating advanced AI features into Canva's platform.

Midjourney's latest update, Version 6.1, improves image coherence, quality, and detail, offers faster processing, and includes new upscalers and a personalization model for enhanced accuracy and beauty.

RunwayML added Image to Video for Gen-3 and is beginning to roll out a Gen-3 Turbo model that is 7x faster and available to free users.

GitHub Models is launching as a beta to empower over 100 million developers to become AI engineers by offering easy access to various AI models, from experimenting in a playground to deploying in production via Codespaces and Azure.

Stability AI's Stable Fast 3D is a new, fast, and efficient model for generating 3D assets, accessible via their API, with the code available on GitHub.

Meta's Segment Anything Model 2 (SAM 2) is an advanced AI model for image and video segmentation that offers highly accurate and efficient object detection and masking.

I tested the SAM 2 demo. You can see how it works below, and try it yourself here.

🛠️ This Week’s AI Tools

Cuebric: Generate a 3D mesh from a text prompt and export it as .usd, .fbx, or .obj. (link)

Export Comments: Drop in a URL to export all comments from your social media posts to an Excel file. (link)

Opus Clips: AI-powered video repurposing tool that turns long videos into high-quality, viral short clips for social media platforms like TikTok, YouTube Shorts, and Reels. (link)

Canva Music: Add music or sound effects, or upload your own audio, for your Canva designs. (link)

💻Visually AI on YouTube

In this episode of 'Under the Hood,' Miguel (Angry Penguin) and I discuss the capabilities of LivePortrait, a tool that animates static images using AI-powered facial expressions and movements.

Under the Hood | Episode 4: Animate characters easily with LivePortrait

🖼️ Image Prompts

Prompt: Coloring page of an intricate mandala with geometric patterns, thick black outlines, white background, no shading, complex design for adults

Prompt: A black and white cityscape in negative colors, a single yellow building alone

🎞️ Video Prompt

I am a pharmacist and have always found microbiology fascinating, so I loved this prompt:

Macro view of bioluminescent microorganisms. Zoom in to reveal the individual creatures then pull back to showcase the grand scale of their combined effect.

I used an image with this prompt in Runway’s Gen-3 video generator, but you can also use it as a text-only prompt:

Thanks for reading, and have a creative week!