GPT-4 Vision: Multi-Modal AI Capability

Oct 16, 2023

Total reading time around 4 minutes.

Welcome to Visually AI!

This week, I was honored to present on AI Art at the Bar Association's 2023 IP Fall Institute CLE.

I teamed up again with Joe Gratz from Morrison Foerster to present a CLE program:

"AI & IP: IP Protection for AI-Generated Works"

The audience members were attorneys primarily focused on copyright and trademark issues. I appreciate their willingness to listen to my perspective and understand how most people use AI Art and our legal concerns.

You can watch my presentation and Joe’s excellent overview of how “Fair Use” is determined for intellectual property, including AI Art and ongoing cases, here.

🔮AI News This Week

News for generative AI is arriving in waves of incredible advancements across multiple categories: text-to-image, image analysis, chatbots like ChatGPT, and multi-modal applications.

There is too much to fit in this newsletter, but I’ll run through the topics I’ve been following and had a chance to explore this week.

GPT-4 Vision Available in ChatGPT

OpenAI rolled out GPT-4 Vision to most ChatGPT Plus users, including me.

Vision allows you to upload images for analysis and further questions.

It’s available in the ChatGPT web app and mobile app:

Vision feature available in ChatGPT web and mobile apps

GPT-4 Vision doesn’t have a separate tab on the dropdown menu of ChatGPT because it’s part of the “Default” GPT-4 feature.

Vision and DALL·E 3 are both available in ChatGPT for paid Plus subscribers, but they can’t be used together yet. Combining them would be ideal, but we’ll have to wait for that.

You can use Vision to generate image text prompts based on an image you upload or ask questions about that image. It works surprisingly well, even with a quick picture of a sandwich ad from a magazine (my first test because I forgot all of the amazing things I wanted to do with Vision):

I used Vision to generate image text prompts for Halloween costumes based on an image I uploaded from a fashion photographer - not Halloween-related. I generated images in Midjourney with the prompts shown below.

Read it in my post on 𝕏.

Halloween costumes generated with prompts from GPT-4 Vision in ChatGPT

I’ll write more about the use cases for Vision and DALL·E 3 in ChatGPT in the coming weeks.

Adobe Firefly Image 2 Model (beta)

Adobe released a new Firefly model, only available in the Firefly web app. Read about it here.

Firefly comes with greatly improved photorealistic capabilities.

Updates include:

Improved photographic quality
Improved colors & dynamic range
Recognition of more cultural symbols and landmarks
Better generation of people, including hands & bodies

Here are a few stunning examples:

Several photos of an eye with the universe reflected, a woman at a parade, tattoos and hands holding a flower — Images generated on Adobe Firefly 2

Meta’s Surreal Celebrity AI Characters and More

Meta announced new AI experiences across its family of apps, including:

Customizable AI stickers
Image editing with AI
Meta AI assistant available on WhatsApp, Messenger, and Instagram
A universe of celebrity-themed AI characters you can interact with through WhatsApp, Messenger, and Instagram

Many people are concerned about that last item on the list - it seems pretty wild to think about speaking to an AI character who looks and sounds exactly like the real celebrity…

I’m an Ambassador for Hive3, the first competitive generative AI league!

Hive3 is a community of creators developing skills by competing in challenges for cash prizes. The challenges range from creating images that fit a brand’s style and culture to making short videos.

Check out the current competition called “Have a Wicked Halloween” here.

Join me Thursday, October 19 for a Q&A session on Hive3’s Discord server here.

You could have your AI service, tool, or event seen by Visually AI’s community of over 6,200 subscribers:

Advertise with me

🚀 This Week’s AI Tools

ElevenLabs AI Dubbing: Get automatic voice translation from uploaded audio or video files. See the menu below. (link)

Cereal Box Maker: This fun tool is free. Create your own cereal box and download it; it appears to be the correct size for a small cereal box! (link)

Blaze: This AI marketing tool quickly creates content using your brand voice and repurposes it for multiple platforms. (link)

AutoDraw: This free tool lets you turn scribbles into drawings quickly with color, text, and templates. (link)

Wirestock AI Image Generator: Showcase and monetize your AI art on Wirestock. Reach big marketplaces like Imago and Freepik easily. (link)

Mirage: Find the assets you need for your project, including images, videos, audio, and 3D. Generate a new asset on the app, if necessary. (link)

🎁 Get it free: The AI Visual Creator’s Toolkit

Boost your content with my all-in-one, free visual AI toolkit!

Access AI-powered tools for AI-generated images, image editing, and more:

Get your toolkit

Thanks for reading and have a creative week!