Process

Consistent AI Character Generation With Different Poses in Stable Diffusion

One of the biggest drawbacks of image-generating AIs is that it is hard to get a consistent character. In this article I’ll explain how I used a set of tools to overcome that impediment and build an AI character generator.

Getting a set of six views of the same character

Below is a set of images I generated using Stable Diffusion 1.5, CharTurner V2, and ControlNet with OpenPose. These images have not been edited, and were generated during a single run. I’m using runpod.io servers, as it is much easier than trying to run Stable Diffusion locally.

  • Asian woman in green dress made with an ai character generator
  • statues
  • cat astronaut
  • cat bronzes
  • man in suit made with an ai character generator
  • 6 views of the queen
  • knights
  • red robot made with an ai character generator
  • statues
  • statues
  • asian man with tattoos

For these images I used the model dreamlike-photoreal-2.0. There are still issues, and I tried to push the flow to its limits. The character doesn’t need to be truly human, but it helps. The tattoos don’t match perfectly, and details tend to swap from the left side to the right side as the characters turn. Still, an impressive number of details do match, such as the necklace and earrings on the queen character. You can also see that the feet easily get turned around, which comes down to the OpenPose image I’m using not being as well laid out as it could be. OpenPose doesn’t give joint angles, so when legs are straight they can easily flip around.

Settings for a Good AI Character Generator

The important settings here are:

  • Hires. fix, upscaled by 2
  • Width : 1024
  • Height : 512
  • Prompt includes (charturnerv2) and the description of the character
  • CFG Scale : 14
  • Enable Control Net with Open Pose
  • Grab an image of some characters standing in clear, non-overlapping poses; I used the ones from the CharTurner V2 examples
  • Tiling checked

I found that using Hires. fix smoothed out many of the differences between the views, and the upscaling helped with the consistency of fine details. Tiling also makes the backgrounds more likely to be neutral, which makes the characters easier to cut out and eliminates noise in later steps. Without tiling, the left and right characters tended to differ somewhat from the central ones.
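If you drive the webui through its API rather than the interface, the settings above map onto a txt2img request body. This is a minimal sketch assuming the AUTOMATIC1111 webui’s `/sdapi/v1/txt2img` endpoint; the ControlNet unit is configured through its own extension, whose fields vary by version, so it is left out here, and the prompt text is only an example.

```python
# Sketch of the settings above as an AUTOMATIC1111-style txt2img payload.
# Assumes the /sdapi/v1/txt2img endpoint; the ControlNet/OpenPose unit
# is set up separately via the ControlNet extension and omitted here.
payload = {
    "prompt": "(charturnerv2), multiple views of the same character, "
              "full body, a red robot",   # example character description
    "width": 1024,
    "height": 512,
    "cfg_scale": 14,
    "enable_hr": True,   # Hires. fix
    "hr_scale": 2,       # upscaled by 2
    "tiling": True,      # keeps backgrounds neutral and easy to cut out
}
```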

Once you have your set of poses as an OpenPose image, you can reuse it instead of generating it fresh each time. To get it, just look at the second image generated on any run. It will look like this:

six openpose poses
The six-pose OpenPose image I use.

Just swap the preprocessor to ‘none’ instead of openpose when you are using a pose image directly.

Now we drop in a new pose

The six poses of the same character are a great starting point, but what if we want that character in a different pose? We can take the initial picture and use it for inpainting.

We first choose four of the six characters and place them along the sides of a 1024×512 image.

a black square with two cat astronauts on the left and right. setting up for use as an ai character generator
The image with characters on the side and the canvas for inpainting in the middle.

And we need a corresponding mask for the inpainting.

the mask

We use (charturnerv2) in the prompt again, with the inpainting masked content set to latent noise and the denoising strength set to 1. We also want to use an inpainting model; I’m using sd-v1-5-inpainting.
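These inpainting settings also map onto an API request. This is a hedged sketch of an AUTOMATIC1111-style `/sdapi/v1/img2img` payload, where `inpainting_fill = 2` corresponds to the “latent noise” masked-content option in the webui; the base64 placeholders stand in for the composite image and mask described above, and the prompt is only an example.

```python
# Sketch of the inpainting step as an AUTOMATIC1111-style img2img payload.
# Images are sent base64-encoded; the placeholder strings below stand in
# for the real composite and mask.
payload = {
    "init_images": ["<base64 of the composite image>"],
    "mask": "<base64 of the mask>",
    "prompt": "(charturnerv2), an anthropomorphic cat astronaut",  # example
    "denoising_strength": 1.0,
    "inpainting_fill": 2,   # 2 = "latent noise" for masked content
    "width": 1024,
    "height": 512,
}
```

Remember to select the inpainting model (here, sd-v1-5-inpainting) as the active checkpoint before sending the request.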

Now we need a pose for our character. The pose I’m using is in the wrong aspect ratio, but it still works as long as ControlNet is set to Scale to Fit (Inner Fit). If you want exact control, make a pose in the correct aspect ratio (1024×512). Here is the pose I used.

a colorful representation of a human from openpose
A pose from openpose tool
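Scale to Fit (Inner Fit) resizes the pose image so it fits entirely inside the generation canvas while preserving its aspect ratio. A quick sketch of that arithmetic, assuming inner-fit behaviour; the function name is mine, not ControlNet’s:

```python
def inner_fit(pose_w, pose_h, canvas_w, canvas_h):
    """Scale (pose_w, pose_h) to fit inside (canvas_w, canvas_h),
    preserving aspect ratio (an "inner fit")."""
    scale = min(canvas_w / pose_w, canvas_h / pose_h)
    return round(pose_w * scale), round(pose_h * scale)

# A portrait 512x768 pose dropped onto the 1024x512 canvas:
inner_fit(512, 768, 1024, 512)  # -> (341, 512)
```

This is why a wrong-aspect-ratio pose still works: it simply lands smaller, centred in the canvas, rather than being stretched.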

And the output with these settings looks like this. CharTurner V2 hates empty space and will fill it with something that resembles your character. You can also tell that the model I used to generate the initial image doesn’t match the inpainting model I used; with a matching inpainting model the result will be much better. These tools also expect humanoid figures, yet even with the cat-headed astronaut it still gets close to the matching view.

six cat astronauts made with an ai character generator
A new image of our anthropomorphic cat astronaut

Something this opens up is creating a number of images of a character in different poses, then training a custom embedding or model on that character. You could create a whole cast from scratch. Hopefully you can now set up your own AI character generator. We’d love to know what you make with it.

Other Guides on Stable Diffusion

How To Set Up ControlNet Models in Stable Diffusion
How to Train a Custom Embedding in Stable Diffusion Tutorial
Stable Diffusion Tutorial: How to In Paint
Stable Diffusion Denoising Strength Explained
What are Sampling Steps and How to Reduce Them in Stable Diffusion