Thursday, November 17, 2022

Alternate Realities - Margaritaville cover art


One-shot image resynthesis experiments using the cover art from the WMF CD "Margaritaville" as the input image (no text prompting). The image above was generated with the new Versatile Diffusion model, an integrated multi-modal model that is indeed quite versatile, running here in image-to-image mode.

Below is the same model running in an alternate mode that goes from image to text embedding to image. I tried to negate the cars and trucks in the intermediate text embedding space, which was a big fail. However, it did loosen the tight hold the image resynthesis has on the input, in an interesting way. So it is a useful knob in some respects, and an interesting artistic tool.
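For anyone curious what "negating" a concept in an embedding space can look like, here is a minimal Python sketch. It is not the code used for these images and does not call the Versatile Diffusion model. It only shows one common approach (subtracting the component of an embedding that lies along a concept direction) on toy 3-dimensional vectors. The function name `negate_concept`, the `strength` parameter, and the example vectors are all hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def negate_concept(embedding, concept, strength=1.0):
    """Subtract the component of `embedding` that lies along `concept`.

    strength=1.0 removes that component exactly; larger values push the
    result past zero into the opposite direction. Hypothetical helper.
    """
    norm_sq = dot(concept, concept)
    if norm_sq == 0.0:
        # A zero concept vector defines no direction, so change nothing.
        return list(embedding)
    scale = strength * dot(embedding, concept) / norm_sq
    return [e - scale * c for e, c in zip(embedding, concept)]


# Toy stand-ins (assumptions): a real embedding has hundreds of dimensions.
image_embedding = [0.8, 0.3, 0.5]
cars_direction = [1.0, 0.0, 0.0]

edited = negate_concept(image_embedding, cars_direction)
pushed = negate_concept(image_embedding, cars_direction, strength=2.0)
```

In a real model a concept like "cars" is rarely confined to one clean direction, which may be part of why this kind of edit changes the whole image rather than removing a single object.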

Below is the output of the retrained image-to-image Stable Diffusion model used in previous Alternate Realities posts. It gives a more abstracted, cartoony look, with a looser interpretation of the content. None of the neural net models ever understand the visual joke present in any of the WMF CD covers, so while the 'understanding' these models show is very impressive on some level, there is still a long way to go towards complete human understanding of the content they are manipulating.

The original WMF "Margaritaville" CD cover art is shown below for reference.
