Above is a straight recursive feedback of the previous generative output into the U-Net latent to build the animation. This uses the PyTorch CompVis model on Colab with the exact same static text prompting as the previous 'grin' post, which may give you some more insight into why I keep wondering why that other implementation has such different visual properties in its generative output.
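Here's a minimal sketch of that recursive feedback loop. Note that it's written against the Hugging Face diffusers img2img pipeline rather than the raw CompVis Colab scripts I actually ran, and the checkpoint id, strength, frame count, and placeholder prompt are illustrative assumptions, not my exact settings.

```python
# Sketch only: each output frame is fed back as the init image for the
# next denoising pass, so the animation drifts from frame to frame.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model_id = "CompVis/stable-diffusion-v1-4"  # assumed checkpoint
txt2img = StableDiffusionPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16).to("cuda")
img2img = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16).to("cuda")

prompt = "..."  # stand-in for the static text prompt from the 'grin' post

# Frame 0: plain text-to-image generation.
frame = txt2img(prompt).images[0]
frames = [frame]

# Recursive feedback: the previous output becomes the next init image.
# strength controls how far each pass re-noises the fed-back frame.
for i in range(1, 120):  # 120 frames is an arbitrary choice
    frame = img2img(prompt=prompt, image=frame, strength=0.5).images[0]
    frames.append(frame)

for i, f in enumerate(frames):
    f.save(f"frame_{i:04d}.png")
```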
I tried a different approach below, where I used the new generalized latent diffusion framework in a 2-step process to build a generative animation, which I then use to drive the U-Net latent input with a static random seed to build the animations below. So it's one way to use the generalized architecture resynthesis approach to build somewhat coherent animations (after my 'woman dancing part 3' post yesterday, wondering whether that was worthwhile pursuing). I'll dive into the details of how I did that in a part 2 post, but a rough sketch of the idea follows.
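Pending that part 2 write-up, here is a speculative sketch of the second step: a pre-built driving animation supplies the init images, while the random seed is re-fixed on every frame so the injected noise stays identical from pass to pass. The file names, seed, and strength below are my guesses at the structure, not the actual implementation.

```python
# Sketch only: resynthesize a driving animation frame-by-frame with a
# static random seed, so only the driving frame varies between passes.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16).to("cuda")

prompt = "..."  # static text prompt
seed = 12345    # illustrative static seed

# Step 1 output: the generative animation used as the driving sequence.
driving_frames = [
    Image.open(f"driving_{i:04d}.png").convert("RGB") for i in range(120)
]

# Step 2: re-seeding with the same value each frame keeps the added noise
# constant, which is what nudges the output toward temporal coherence.
for i, drive in enumerate(driving_frames):
    generator = torch.Generator("cuda").manual_seed(seed)
    out = pipe(prompt=prompt, image=drive, strength=0.5,
               generator=generator).images[0]
    out.save(f"resynth_{i:04d}.png")
```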