this took wayyy too long because other projects came up but here it finally is ヽ(・∀・)ﾉ
the base is SD 1.5 and it’s finetuned on 2875 HAND CAPTIONED (you heard that rightttt) images. resolution is 512×512. i initially wanted to finetune it at 768×768 like Luna Diffusion, but since this dataset is a lot bigger, training is much slower and i kept running into some errors. might still do a 768×768 version if i find the space & time for it, but the main priority is improving this one.
examples / comparisons
- examples use Euler a, 20 steps, CFG 7, basic Hires fix, and vae-ft-mse-840000-ema-pruned.ckpt as the VAE
- portraits can get a bit crispy. if so, adding “artefacts, glitch, blemishes, dirt, noise, holes” (one or all of those) to the negative prompt can help
- kinda bad with swords. one of the things i’ll try to improve with the next version. because swords are kinda important (;;;*_*)
- fortunately great with quokkas!
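if you want the example settings above in a script-friendly form, here is a rough sketch (not an official recipe). it only builds a request body as a plain dict and sends nothing. the field names follow what the AUTOMATIC1111 web UI txt2img API is commonly called with, which is an assumption about your setup. `build_payload`, `CRISPY_NEGATIVES` and the quokka prompt are made-up names for illustration, and the Hires fix scale/upscaler are left out because they aren't specified here.

```python
import json

# Negative-prompt terms suggested above for "crispy" portraits.
CRISPY_NEGATIVES = ["artefacts", "glitch", "blemishes", "dirt", "noise", "holes"]


def build_payload(prompt, fix_crispy=False, extra_negatives=()):
    """Assemble a txt2img-style request body from the example settings.

    Hypothetical helper: the key names are assumed to match the
    AUTOMATIC1111 web UI API; adjust them for whatever frontend you use.
    """
    negatives = list(extra_negatives)
    if fix_crispy:
        # Use all of the suggested terms; a subset may be enough.
        negatives += CRISPY_NEGATIVES
    return {
        "prompt": prompt,
        "negative_prompt": ", ".join(negatives),
        "sampler_name": "Euler a",
        "steps": 20,
        "cfg_scale": 7,
        "width": 512,   # the model was finetuned at 512x512
        "height": 512,
        "enable_hr": True,  # Hires fix on; scale/upscaler left at defaults
        "override_settings": {"sd_vae": "vae-ft-mse-840000-ema-pruned.ckpt"},
    }


if __name__ == "__main__":
    # Example prompt is an assumption, picked because quokkas work well.
    payload = build_payload("portrait of a quokka", fix_crispy=True)
    print(json.dumps(payload, indent=2))
```

pass `fix_crispy=True` only when a portrait actually comes out crispy, since the extra negatives aren't needed otherwise.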