AI Art Weekly #94

Hello there, my fellow dreamers, and welcome to issue #94 of AI Art Weekly! 👋

Just a heads-up: if you don’t hear from me starting next week, it’s because Mini-Tulpa has joined our realm. Yep, wifey is due any day now. Super excited and nervous at the same time. I’m gonna try to keep the newsletter going, but I might miss a week or two. I’m sure you understand 😅

On another note, I was busy exploring Midjourney SREF codes with Midjourney v6.1 this week. I’m already up to 40+ high-quality SREF codes on PROMPTCACHE. If you want to get lifetime access at the current price, now is the time to get in; I’m gonna raise the price starting next week.

Some of you are already in and have created some amazing art with it. Check out this beautiful piece by Sherpa, created with the Chromatic Lineage code. Please feel free to share your creations with me; I’m gonna try to figure out a way to showcase them!

In this issue:

  • 3D: MeshAnything V2, AvatarPose, UniTalker, An Object is Worth 64x64 Pixels, Head360, RayGauss, TexGen
  • Image: Fast Sprite Decomposition, IPAdapter-Instruct, Lumina-mGPT, VAR-CLIP, Smoothed Energy Guidance, ProCreate, Don't Reproduce!, TurboEdit
  • Video: ReSyncer
  • and more!

Cover Challenge 🎨

Theme: hidden truths
31 submissions by 17 artists
🏆 1st: @NomadsVagabonds
🥈 2nd: @risugawa
🥉 3rd: @EternalSunrise7
🥉 3rd: @pactalom

News & Papers

3D

MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization

MeshAnything V2 can generate 3D meshes from basically anything! Text, images, point clouds, NeRFs, Gaussian Splats, you name it.

MeshAnything V2 examples

AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos

AvatarPose can estimate the 3D poses of people interacting closely from sparse multi-view videos! It fits a personalized avatar to each person and uses it to guide the pose estimation, making the results more accurate and realistic.

AvatarPose example

UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model

UniTalker can create 3D facial animations from speech input! Its unified model produces more accurate lip movements than previous methods and holds up well on data it hasn’t seen before.

UniTalker example

An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion

An Object is Worth 64x64 Pixels can generate 3D objects by representing each object as a 64x64-pixel image and synthesizing it with an image diffusion model! The results have plausible geometry and colors, on par with more complex 3D generation methods.

An Object is Worth 64x64 Pixels examples

Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°

Head360 can generate a parametric 3D full-head model you can view from any angle! It works from just one picture, letting you change expressions and hairstyles quickly.

Head360 examples

RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View Synthesis

RayGauss can create realistic new views of 3D scenes using Gaussian-based ray casting! It produces high-quality images at around 25 frames per second and avoids common rendering artifacts of older methods.

RayGauss example

TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling

TexGen can create high-quality 3D textures for objects from text descriptions! It combines a multi-view sampling and resampling strategy with a pre-trained text-to-image diffusion model, producing more detailed and view-consistent textures than other methods.

TexGen examples

Image

Fast Sprite Decomposition from Animated Graphics

Sprite-Decompose can break animated graphics down into individual sprites, taking a video and bounding boxes for the objects as input.

Sprite-Decompose example

IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts

IPAdapter-Instruct can efficiently combine natural-image conditioning with “Instruct” prompts! It enables users to switch between various interpretations of the same image, such as style transfer and object extraction.

IPAdapter-Instruct example

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Lumina-mGPT can create photorealistic images from text and handle a range of visual and language tasks! Built on a multimodal generative pretrained transformer, it supports controllable image generation, segmentation, depth estimation, and visual question answering over multiple turns.

Lumina-mGPT examples

VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling

VAR-CLIP creates detailed fantasy images that match text descriptions closely by combining Visual Auto-Regressive techniques with CLIP! It uses text embeddings to guide image creation, ensuring strong results by training on a large image-text dataset.

VAR-CLIP examples

Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

SEG improves image generation for SDXL by smoothing the self-attention energy landscape! It boosts quality without a guidance-scale parameter: the guidance strength is controlled by Gaussian-blurring the attention queries, which smooths the attention weights and leads to better results with fewer side effects.

Smoothed Energy Guidance examples
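
For the curious, here is a minimal PyTorch sketch of the query-blurring idea described above. The helper names, the 1D blur, and the way the two predictions are combined are my assumptions for illustration; the paper's exact formulation differs in the details.

```python
import torch
import torch.nn.functional as F

def gaussian_blur_queries(q, sigma):
    """Blur attention queries along the token axis with a 1D Gaussian kernel.
    q: (batch, heads, tokens, dim). A 1D blur is a simplification of this sketch;
    for image tokens the blur would normally act on the 2D spatial grid."""
    radius = max(1, int(3 * sigma))
    t = torch.arange(-radius, radius + 1, dtype=q.dtype, device=q.device)
    kernel = torch.exp(-0.5 * (t / sigma) ** 2)
    kernel = (kernel / kernel.sum()).view(1, 1, -1)
    b, h, n, d = q.shape
    x = q.permute(0, 1, 3, 2).reshape(b * h * d, 1, n)   # one channel per feature
    x = F.conv1d(F.pad(x, (radius, radius), mode="replicate"), kernel)
    return x.reshape(b, h, d, n).permute(0, 1, 3, 2)

def self_attention(q, k, v, blur_sigma=None):
    """Scaled dot-product self-attention; optionally smooths the energy landscape
    by blurring the queries before the softmax (the SEG perturbation)."""
    if blur_sigma is not None:
        q = gaussian_blur_queries(q, blur_sigma)
    attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v

def seg_combine(pred_sharp, pred_smooth, scale=3.0):
    """Generic guidance extrapolation (an assumption of this sketch): move away
    from the smoothed-attention prediction toward the sharp one. In SEG the
    effective strength is mainly controlled by the blur sigma."""
    return pred_smooth + scale * (pred_sharp - pred_smooth)
```

The intuition: blurring the queries before the softmax flattens the attention distribution, and extrapolating from that smoothed prediction toward the sharp one acts as guidance without touching the classifier-free guidance scale.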

ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation

ProCreate boosts the diversity and creativity of diffusion-based image generation while avoiding the replication of training data. By pushing generated image embeddings away from reference images, it improves the quality of samples and lowers the risk of copying copyrighted content.

ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation example
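
Here is a rough sketch of that repulsion idea, assuming a generic differentiable image encoder (hypothetical here); ProCreate's actual energy term and where it enters the sampler may differ.

```python
import torch

def repel_from_references(x0_estimate, ref_embeddings, encoder, strength=0.1):
    """Nudge the current denoised estimate away from reference-image embeddings.
    `encoder` is any differentiable image encoder returning (batch, dim) features;
    it is an assumption of this sketch, not ProCreate's actual model."""
    x = x0_estimate.detach().requires_grad_(True)
    emb = encoder(x)
    emb = emb / emb.norm(dim=-1, keepdim=True)
    refs = ref_embeddings / ref_embeddings.norm(dim=-1, keepdim=True)
    energy = (emb @ refs.T).sum()              # total cosine similarity to references
    grad = torch.autograd.grad(energy, x)[0]   # direction that increases similarity
    return x0_estimate - strength * grad       # step that decreases it instead
```

In a sampler, the adjusted estimate would replace the plain x0 prediction at each step, so the trajectory is steered away from anything that looks too much like the reference set.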

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

TurboEdit enables fast text-based image editing in just 3-4 diffusion steps! It improves edit quality and preserves the original image by using a shifted noise schedule and a pseudo-guidance approach, tackling issues like visual artifacts and weak edits.

TurboEdit examples
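
The "pseudo-guidance" part can be pictured roughly like the sketch below: the denoiser is run once with the source prompt and once with the edit prompt, and the edit is strengthened by extrapolating between the two. The function names, the `w` parameter, and the exact formula are assumptions for illustration, not the paper's definition.

```python
import torch

def pseudo_guided_prediction(denoiser, x_t, t, src_prompt_emb, edit_prompt_emb, w=1.5):
    """Strengthen a text edit by extrapolating from the source-prompt prediction
    toward the edit-prompt prediction (a sketch of the idea, not the exact method)."""
    with torch.no_grad():
        pred_src = denoiser(x_t, t, src_prompt_emb)    # tends to reconstruct the input
        pred_edit = denoiser(x_t, t, edit_prompt_emb)  # steers toward the edit text
    # w = 1.0 recovers the plain edit prediction; w > 1.0 amplifies the edit
    return pred_src + w * (pred_edit - pred_src)
```

Applied at each of the few diffusion steps, something like this can amplify an otherwise weak edit while the source prediction keeps the rest of the image anchored.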

Video

ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer

ReSyncer can create high-quality lip-synced videos from audio and allows for quick personalized adjustments and video-driven lip-syncing. It can transfer speaking styles and swap faces, making it ideal for virtual presenters and performers.

ReSyncer example

Also interesting

“Seasons II” by me.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it; putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying my Midjourney prompt collection on PROMPTCACHE 🚀

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
