Part 1 of a 2-part post showcasing the strengths and weaknesses of using Stable Diffusion for video processing versus generative animation.
If you use the system as a straightforward video effect processor, you get the kind of thing you see above. Fascinating, but questionable as a video processing effect. You could try to elaborately keyframe parameters like the denoising strength or the number of diffusion steps, build in additional attention masking or prompting that tries to focus what is happening, or bring in Ted Adelson style layered video modeling to augment the system (a sketch of the keyframing idea follows below), but maybe it's just a case of trying to push a large square peg into a round hole.
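To make the keyframing idea concrete, here is a minimal sketch using the Hugging Face diffusers img2img pipeline. The model checkpoint, frame file paths, frame count, and the specific keyframe values are all assumptions for illustration; the post does not specify an implementation.

```python
# Minimal sketch: keyframing img2img denoising strength across video frames.
# Assumptions (not from the post): diffusers' StableDiffusionImg2ImgPipeline,
# the runwayml/stable-diffusion-v1-5 checkpoint, and frames pre-extracted to PNGs.
import numpy as np
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a static text prompt, held constant for every frame"  # as in the post
num_frames = 120  # hypothetical clip length

# Hypothetical keyframes: (frame index, denoising strength). Lower strength
# stays closer to the source frame; higher strength diffuses more aggressively.
keyframes = [(0, 0.3), (60, 0.7), (119, 0.4)]
frame_idx, strengths = zip(*keyframes)
strength_curve = np.interp(np.arange(num_frames), frame_idx, strengths)

for i in range(num_frames):
    init = Image.open(f"frames/frame_{i:04d}.png").convert("RGB")
    out = pipe(
        prompt=prompt,
        image=init,
        strength=float(strength_curve[i]),
        num_inference_steps=50,
        generator=torch.Generator("cuda").manual_seed(42),  # fixed seed for coherence
    ).images[0]
    out.save(f"out/frame_{i:04d}.png")
```

Even with a fixed seed and a smooth strength curve, each frame is still diffused independently, which is why the temporal flicker the post describes tends to persist.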
A better way to view the system, in my opinion, is as a pure generative image synthesizer. Part 2 of this post will show some examples of what happens if you work the system that way. Everything in this series of posts uses the same static text prompt.
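One common way to run img2img generatively rather than as a per-frame effect is a feedback loop, where each output becomes the init image for the next frame and the source video only seeds the process. This is my reading of "pure generative" here, not a method the post spells out; the sketch below reuses `pipe`, `prompt`, and `num_frames` from the previous example.

```python
# Minimal sketch of a generative feedback loop (an assumption about what
# "pure generative" means here, not a method stated in the post): each output
# frame becomes the init image for the next, so the imagery evolves on its own
# under the static prompt. Reuses pipe, prompt, num_frames from the sketch above.
image = Image.open("frames/frame_0000.png").convert("RGB")  # seed frame only

for i in range(num_frames):
    image = pipe(
        prompt=prompt,
        image=image,
        strength=0.5,  # hypothetical fixed strength for the feedback run
        num_inference_steps=50,
    ).images[0]
    image.save(f"gen/frame_{i:04d}.png")
```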
The source video used as input to the image-to-image algorithms is shown below.