Saturday, September 17, 2022

stable diffusion video processing experiment 3


All of these experiments use the same source video as the last 2 posts.  Again, note how well the overhead lights are tracked in the generative image synthesis.
 
These video processing experiments are all based on the set of style prompts used in the art walk posts earlier this month.  The style changes every 10 frames.  The examples above and below show how the results change as you move from a strength of .3 down to .1.
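For reference, here is a minimal sketch of this kind of per-frame img2img loop using the diffusers library. It is not the exact pipeline used for these experiments; the model id, frame paths, and style prompts are placeholders, and the only parts that matter for the comparison are the strength value and the 10-frame style rotation.

```python
# Minimal sketch of a per-frame img2img loop with the style prompt rotating
# every 10 frames. Model id, frame paths, and prompts are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

style_prompts = ["style prompt 1", "style prompt 2", "style prompt 3"]  # placeholders
frames = [Image.open(f"frames/frame_{i:04d}.png").convert("RGB").resize((512, 512))
          for i in range(300)]

strength = 0.3  # 0.3 pushes harder toward the prompt; 0.1 stays much closer to the source frame

for i, frame in enumerate(frames):
    prompt = style_prompts[(i // 10) % len(style_prompts)]  # switch style every 10 frames
    generator = torch.Generator("cuda").manual_seed(1234)   # fixed seed, re-seeded every frame
    out = pipe(prompt=prompt, image=frame, strength=strength,
               guidance_scale=7.5, generator=generator).images[0]
    out.save(f"out/frame_{i:04d}.png")
```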

Both of the examples above use a fixed random seed.  The one below uses exactly the same settings as the .1 strength run above, except with a variable random seed for each frame.  Note the difference.
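In the sketch above the generator is re-seeded with the same value every frame, which is the fixed-seed case. The variable-seed version just swaps in a fresh seed per frame, roughly like this (the base seed and helper name are made up for illustration):

```python
# Fixed seed: identical noise every frame, which keeps the stylization temporally stable.
# Variable seed: fresh noise each frame, which tends to flicker a lot more.
import torch

def frame_generator(frame_index: int, fixed: bool, base_seed: int = 1234) -> torch.Generator:
    seed = base_seed if fixed else base_seed + frame_index  # new seed per frame when not fixed
    return torch.Generator("cuda").manual_seed(seed)

# inside the frame loop above:
#   generator = frame_generator(i, fixed=False)  # variable random seed for each frame
```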

I messed with a lot of different text prompt additions to try to pull things closer to what I was going for, a stylized representation of the source video.  The best example of those attempts is below.

You can still see an issue with style prompts based on artists' names in this example, and even more in the very first one, where the generative synthesis ends up producing an image of the artist rather than stylizing the video content.  This is much less of an issue with generic SD still image generation (where no image or video frame is fed into the U-Net's latent input), but it seems to be a big issue with video processing.  Maybe textual inversion is a better approach for this kind of thing?
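If I do try textual inversion, the idea would be to reference a learned style embedding's placeholder token in the prompt instead of an artist's name. A rough sketch with the current diffusers API is below; the embedding repo and the <my-style> token are placeholders, not something I've actually trained.

```python
# Hypothetical textual-inversion variant: the style comes from a learned embedding
# token rather than an artist's name. The repo name and token are placeholders.
import torch
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_textual_inversion("sd-concepts-library/example-style", token="<my-style>")

# the img2img prompt then points at the learned style token instead of a real artist,
# which should bias toward the style without generating a portrait of the artist
prompt = "city street at night, in the style of <my-style>"
```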
