The video footage above was used as the source for stable diffusion generative image synthesis. The different text prompts below were applied to the same source footage with otherwise identical synthesis settings.
This kind of processing scenario works if you treat the source video as a spatial anchor for the generative animation to latch onto. If you instead try to reproduce the original source more faithfully in a given style, you run into a lot of funny or troubling issues: the apparent ethnicity of a person can change every few frames, the stable diffusion synthesis model may add cleavage or breast size enhancement to the female singer, temporal discontinuities appear, and so on. Note that the white ceiling lights in the source video are actually what is tracked the best.
I have not been able to get it to successfully do the things Studio Artist is really great at, like taking a source video and rendering it in various visual art styles where the source footage is closely tracked and reproduced in that style, or things like natural wet paint mixing or other kinds of dispersion interactions that evolve over time. You really need to view and approach stable diffusion video processing as spatially guided generative animation. The outcomes of the two approaches are very different in nature.
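To make the spatial-anchor idea concrete, here is a minimal sketch of a per-frame processing loop. The `stylize` function is a hypothetical stand-in for a stable diffusion img2img step (a real pipeline would take a text prompt and a denoising strength); the point of the sketch is the anchoring itself, where each init frame blends the source (for spatial structure) with the previous output (to damp temporal discontinuities).

```python
import numpy as np

def stylize(frame: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for a stable diffusion img2img call.
    # Identity here so the sketch is runnable without a model.
    return frame

def process_video(frames, anchor_weight=0.6):
    """Process frames so the source acts as a spatial anchor.

    anchor_weight controls how strongly each init frame leans on
    the source (spatial structure) versus the previous generated
    frame (temporal continuity).
    """
    prev = None
    out = []
    for src in frames:
        if prev is None:
            init = src
        else:
            init = anchor_weight * src + (1.0 - anchor_weight) * prev
        prev = stylize(init)
        out.append(prev)
    return out

# Toy grayscale "frames" standing in for video footage.
frames = [np.full((2, 2), float(i)) for i in range(3)]
result = process_video(frames)
```

Raising `anchor_weight` tracks the source more tightly (closer to the frame-by-frame reproduction that produces the flicker described above); lowering it lets the generative animation drift more freely.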