Monday, April 25, 2022

maui hana highway traffic pileup

 

This one is hilarious if you live here.  It also tells you oh so much about what is going on in the synthesis algorithm.

Also makes me wonder about the landscape renditions they love so much in the paper.  The stitching works for this kind of thing.
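For what it's worth, the usual way stitching like this gets done is by generating overlapping tiles and cross-fading them in the overlap region. This is just my guess at the general idea, not the paper's actual pipeline; here's a toy 1-D sketch with a hypothetical `blend_overlap` helper:

```python
def blend_overlap(left, right, overlap):
    """Cross-fade two 1-D strips of pixel values that share `overlap` samples.

    Weights ramp linearly from the `left` strip to the `right` strip across
    the overlap, so seams get smeared out instead of showing a hard edge.
    """
    out = list(left[:len(left) - overlap])
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # 0 -> 1 ramp across the overlap
        out.append(left[len(left) - overlap + i] * (1 - w) + right[i] * w)
    out.extend(right[overlap:])
    return out
```

Flat landscapes hide this kind of blending well; anything with hard geometry (like street scenes) makes the seams obvious, which may be part of why the larger renditions fall apart.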

It does not work so well for this 512 x 512 rendition of Hotel Street in Chinatown in Honolulu.
The 256 x 256 output seems much more coherent, but once you go above that resolution it reminds me of the artifacts you see in VQGAN.

Same thing with this shot at rendering Honolulu harbor.  The VQGAN model I've been working with does a much better job at this kind of thing.  The LAION-400M dataset and its associated CLIP seem to not be as good for Hawaii-specific localisms.  Both VQGAN and the VQVAE model I tried seemed better at catching the local style idiosyncrasies I like to mess with.

The other thing about this model is that a hallucinated stock photo watermark footer gets rendered at the bottom of the output in roughly one of every 10 images.  They should really clean that stuff out of the database before training on it (I think).
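A crude pre-training filter for this isn't hard to imagine: watermark footers are overlaid text, so the bottom strip of a watermarked image tends to have unusually high local contrast. This is purely a hypothetical heuristic of mine (the names `footer_contrast` and `looks_watermarked` are made up, and real pipelines use trained watermark classifiers), but the idea sketches out like this:

```python
def footer_contrast(pixels, strip_frac=0.1):
    """Mean absolute difference between horizontal neighbors in the bottom
    strip of a grayscale image, given as a list of rows of 0-255 values."""
    h = len(pixels)
    strip = pixels[int(h * (1 - strip_frac)):]
    diffs = [abs(a - b) for row in strip for a, b in zip(row, row[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0

def looks_watermarked(pixels, threshold=30.0):
    # Overlaid watermark text produces sharp edges, hence high local
    # contrast, concentrated in the footer; flag those images for removal.
    return footer_contrast(pixels) > threshold
```

Run over a scrape like LAION-400M, even something this dumb would presumably knock out a chunk of the stock-photo footers before the model learns to hallucinate them.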


