Total reading time around 5 minutes.
Welcome to Visually AI!
🔮 AI News this Week
My favorite AI tools I actually use
I use a lot of different tools for various purposes, but there are some that I use daily or several times a week.
Organization / Notes / Information
Mem: I use Mem to drop in notes, post drafts, forwarded emails, and research because its AI organizes information automatically, with no labeling, tagging, or folders.
Notion: I use Notion to store research, post drafts, databases, course resources, and visual material. I like being able to see a visual layout and access lists of prompts, camera angles, keywords, etc.
mymind: mymind is another self-organizing app that uses beautiful cards to store your notes, bookmarks, images, articles, and more. It has a browser extension I can use on my desktop or mobile to capture things I want to save.
Screen recording / Screenshots
Screen Studio: Screen Studio is a screen recorder for Macs with an amazing editor, an automatic zoom feature, and the option to record yourself along with the video. You can trim the video and adjust its speed and zoom strength. It creates a sleek video result that looks fantastic.
CleanShot X: CleanShot X is an excellent screenshot and screen recording tool for Macs. I love the built-in annotation tool for screenshots and the OCR reader to capture text from a screenshot.
Descript: Descript added a lot of new AI features with the Underlord tool. I use it to record tutorials and for editing podcast interviews or any videos with more than one camera recording. It has helpful video and sound editing, plus Underlord will generate YouTube descriptions, social media posts, and find clips from longer recordings.
Video chat / Notetaker
Zoom: I’ve been using the professional version of Zoom for the past 2 years, because it works better for me than Google Meet (I always lose the signal…).
Read AI: I found Read AI through my Zoom subscription, and I use it as much as possible to record meetings. It prepares a thorough report with the video, a live transcript, a highlight reel, and a summary.
📸 AI Snapshots
OpenAI launched a new "Work with Apps" feature for ChatGPT on macOS, allowing Plus and Team users to integrate the AI with coding applications like VS Code and Xcode. The feature lets ChatGPT read content from editor panes or terminal lines in compatible apps so it can provide more contextual and accurate responses. Enterprise and Education users will get access in the coming weeks.
Freepik introduced Freepik Tunes, a new platform combining royalty-free music and AI voice generation. The tool offers multiple AI voices in various languages with a daily limit of 5,000 characters for free users, while also providing access to an AI-generated music library curated by musicians, allowing users to download up to 50 tracks daily.
The newly released Qwen2.5-Coder series offers state-of-the-art performance among open-source models in code generation, repair, and reasoning, rivaling GPT-4o. It comes in six model sizes, from 0.5B to 32B, to suit different developer needs, and it performs well across more than 40 programming languages.
Anthropic’s Console now offers a prompt improver and example management tools to refine AI prompts for better accuracy and consistency. Developers and other users can have Claude enhance prompts with techniques like chain-of-thought reasoning, example standardization, and output format enforcement.
This example improved my prompt for creative proposal ideas for marketing videos:
⚙️ Workflow
How to use lip sync tools
There are several lip sync tools that work on characters in video input. I compared Kling AI, Sync Labs, and Runway.
Generate a voice
Freepik launched Freepik Tunes with curated AI-generated music and a voice generator. I found some unique voices to choose from:
Video input - Lip sync with Kling
I generated the video in Kling because its lip sync feature doesn't work with uploaded videos. Kling is unique in that it doesn't trim the video to match the length of the audio file, which is useful.
I animated this character, which I generated with a custom FLUX LoRA model I trained on Fal:
Lip Sync - Sync
Sync has a few different models to choose from, and you can compare their results in the Playground:
Lip Sync - Runway
You can upload videos with Runway's lip sync feature. It identifies the face or faces in your video or image before animating the character:
Results
I was pleased with each of the results, and these models are definitely improving:
🛠️ This Week’s AI Tools
RMBG-2.0: State-of-the-art open background removal model by BRIA AI for non-commercial use. Try the free demo on Hugging Face.
FacePoke: Open-source demo on Hugging Face that enables real-time manipulation of facial features, expressions, and head positions in portrait images. (link)
Bolt.new: Full-stack web development app that lets you build, edit, and deploy from text prompts. (link)
Freepik Tunes: AI voice generator and curated AI royalty-free music for content creators. (link)
SeedEdit: ByteDance demo on Hugging Face of an image editor that uses text prompts to change backgrounds, facial expressions, and poses, or to edit text. (link)
Get Rizzed: Hugging Face Space by Play HT. Upload an image, click 'Rizz Image', and the AI will roast it. (link)
🖼️ Image Prompts
Prompt: a teddy bear as a mosaic of colorful pixelated images, highly detailed, white background, 8k
Prompt: Minimalistic rainbow riverside house, seamless spectrum gradient, photorealistic, UHD
🎬 Video Prompt
This prompt is from my friend LudovicCreator, who shared it in an excellent thread of 16 video prompts on 𝕏.
Prompt: A wide shot of a mystical lake at night, with a full moon reflecting on its calm surface. The IMAX camera glides slowly over the water, capturing the hyper-detailed reflections of surrounding trees and fireflies dancing above the lake. Soft moonlight and mist create a tranquil, otherworldly atmosphere.
Thank you for reading, and have a creative week!
You create fantastic work, Heather. Thanks for sharing your knowledge and enthusiasm.