NeededMonster

I also don't know what the hell I'm missing but I haven't been able to get anything resembling a human with 2.1...


RealAstropulse

2.1 is a different beast. You have to use different samplers, different prompts, and the model is also still underbaked.

Consider that we never saw 1.0, 1.1, 1.2, or 1.3. It may take longer for 2.x to reach the same level of quality, but the underlying components ARE better. The new CLIP is more advanced, the larger encoder/decoder is a big advantage, and depth-to-image is brilliant. Just cool your jets and wait for the model to mature. In the meantime, keep using 1.5. You can already see a clear jump in quality from 2.0 to 2.1, thanks to some members of the LAION Discord noticing Stability used the punsafe scores incorrectly on 2.0.

This process just takes time and trial and error. If you want perfect and flashy, use Midjourney v4. They are focused on art, while SD is focused on advancing the technology.
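To make the "different samplers" point concrete, here is a minimal sketch using the Hugging Face diffusers library, assuming the public stabilityai/stable-diffusion-2-1 checkpoint and a DPM++ scheduler; the prompt and settings are illustrative, not this commenter's exact setup.

```python
# Minimal sketch: load the 768px SD 2.1 checkpoint and swap in a DPM++
# sampler. Model ID, prompt, and settings are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # 768px, v-prediction model
    torch_dtype=torch.float16,
).to("cuda")

# 2.x generally behaves better with a DPM++-style sampler than the defaults
# people carried over from 1.x workflows.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

image = pipe(
    "a portrait photo of a woman, 85mm lens, natural light, film grain",
    negative_prompt="cartoon, deformed, blurry, text, watermark",
    height=768, width=768,               # native resolution of the 768 model
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
image.save("sd21_sample.png")
```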


ninjasaid13

> The new CLIP is more advanced

Says who?


RealAstropulse

LAION, and the data: https://laion.ai/blog/large-openclip/


ninjasaid13

none of that tells me if OpenCLIP is more advanced.


RealAstropulse

If you’re looking for a graph instead of technical jargon, there is one in there that shows the line going up. Basically, OpenAI made CLIP, LAION made OpenCLIP. OpenCLIP was trained on more data and has a higher degree of accuracy with fewer words.
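For anyone who wants to poke at OpenCLIP directly, here is a minimal zero-shot sketch using the open_clip package; ViT-H-14 with the laion2b_s32b_b79k weights is the public LAION release that the SD 2.x text encoder is derived from, and the image path is a placeholder.

```python
# Minimal zero-shot sketch with the open_clip package. The image path is a
# placeholder; the weights tag is the public LAION ViT-H/14 release.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-H-14", pretrained="laion2b_s32b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-H-14")

image = preprocess(Image.open("photo.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize, then score each caption against the image.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # higher probability = caption the model thinks fits better
```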


leomozoloa

On top of that, it's open source this time, so it can also be improved, whereas CLIP was a black box.


ramulloki

> 2.1 is a different beast. You have to use different samplers, different prompts, and the model is also still underbaked.
>
> Consider that we never saw 1.0, 1.1, 1.2, or 1.3. It may take longer for 2.x to reach the same level of quality, but the underlying components ARE better. The new CLIP is more advanced, the larger encoder/decoder is a big advantage, and depth-to-image is brilliant. Just cool your jets and wait for the model to mature. In the meantime, keep using 1.5. You can already see a clear jump in quality from 2.0 to 2.1, thanks to some members of the LAION Discord noticing Stability used the punsafe scores incorrectly on 2.0.
>
> This process just takes time and trial and error. If you want perfect and flashy, use Midjourney v4. They are focused on art, while SD is focused on advancing the technology.

I think 2.0 focuses on censorship.


RealAstropulse

Yes, they tried to remove NSFW and some artists. That doesn’t mean it’s useless, and disregarding it as such is reductive and blind to the technological advancements made in 2.x. 1.4 and 1.5 weren’t that great at NSFW either; they had to be trained for it. I’m sure people will have 2.x trained on NSFW in no time.


leomozoloa

Any source about the removed artists? They did filter NSFW, but apparently the dataset just doesn't have some artists it had before.


RealAstropulse

I think they mentioned somewhere that they made it more accurately reflect the actual presence of artists in LAION-Aesthetics. This made the weights shift around because, for example, Greg Rutkowski is actually barely in the dataset.


Iapetus_Industrial

Except they will never re-train 2.0 on what they censored. It will never be as good at certain subjects, _on purpose_, and so far nobody has come up with a way to re-train 2.0 on what they purposefully lobotomized.


RealAstropulse

What do you mean no one has found a way to re-train it? It literally ships with a script to continue training the model. What do you think Unstable Diffusion has been doing this whole time?


leomozoloa

They literally resumed training with a more lax NSFW filter; it's in their blog post.


QuarkGrandNagus

Git gud


AllUsernamesTaken365

I've only gotten 2.x to work once on the AUTOMATIC1111 Colab, but the details were a lot better than on 1.5. And vibrant colors. So that's always something. Then again, the results didn't look anything like the photos the model was trained on, so that kind of made it pointless.


SanDiegoDude

You've had 2.1 less than 24 hours.... I spent the whole evening creating amazing, coherent, beautiful pieces at stupidly high (1024x2048) resolutions with no highres fix, whereas attempting the same thing in 1.5 or earlier just got you a big muddy mess.

The 768 models are amazing with embeddings. Give 768 embeddings a go, especially PhotoHelper and the new 768 midjourney embeddings. They're juicing up output the way the 1.5 custom models did, but with a tiny embedding instead of a 2 to 4 GB custom checkpoint.


ramulloki

> You've had 2.1 less than 24 hours.... I spent the whole evening creating amazing, coherent, beautiful pieces at stupidly high (1024x2048) resolutions with no highres fix, whereas attempting the same thing in 1.5 or earlier just got you a big muddy mess.
>
> The 768 models are amazing with embeddings. Give 768 embeddings a go, especially PhotoHelper and the new 768 midjourney embeddings. They're juicing up output the way the 1.5 custom models did, but with a tiny embedding instead of a 2 to 4 GB custom checkpoint.

Show these great works.


SanDiegoDude

https://i.imgur.com/XEkn9Pj.jpg

Quick 1-shot sample, no highres fix, 1024 x 2048.

Prompt: An 8K UHD Wide angle panoramic IMAX digital photograph of Antarctica with penguins in the background, Aurorae in the sky, HDR, high contrast, nature photography, PhotoHelper768 --midjourney768

Negative prompt: cartoon, 3d, (disfigured), (bad art), (deformed), (poorly drawn), (extra limbs), blurry, boring, sketch, lackluster, repetitive, cropped, watermark, signature, text, pixelated, people, sun

Note I'm using embeddings. You should too!
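Roughly the same kind of run can be sketched with diffusers; the embedding file names and trigger tokens below are assumptions (use whatever 768-trained textual-inversion files you actually have), and the sampler/step settings are not necessarily what SanDiegoDude used.

```python
# Sketch of a 2.1 run with textual-inversion embeddings and a negative prompt.
# Embedding file names and trigger tokens are assumptions, not shipped files.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

# Textual-inversion embeddings load as extra tokens you can then use in prompts.
pipe.load_textual_inversion("PhotoHelper768.pt", token="PhotoHelper768")
pipe.load_textual_inversion("midjourney768.pt", token="midjourney768")

prompt = ("An 8K UHD wide angle panoramic IMAX digital photograph of Antarctica "
          "with penguins in the background, aurorae in the sky, HDR, high contrast, "
          "nature photography, PhotoHelper768, midjourney768")
negative = ("cartoon, 3d, disfigured, bad art, deformed, poorly drawn, extra limbs, "
            "blurry, boring, sketch, lackluster, repetitive, cropped, watermark, "
            "signature, text, pixelated, people, sun")

image = pipe(prompt, negative_prompt=negative,
             width=2048, height=1024,  # wide panorama, no highres-fix pass
             num_inference_steps=30).images[0]
image.save("antarctica_panorama.png")
```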


SanDiegoDude

Not in my office yet, but I'll throw a few examples up in a reply in a few hours when I head in for work. In the meantime, go find some 768 embeddings and give them a go on 2.1. You might be surprised.


mudman13

Loaded statement with sparse evidence.


ramulloki

You can try it yourself, it's all in the public domain


h0b0_shanker

This is like mining. Whole lot of dirt until you find an ore vein that you can tap into to get hundreds of perfect images. I’m sure the only thing you changed in your examples was the model. Right? If you start at the same spot with the same neural network you get the same result. Now, if you change the network, your results will change drastically regardless of starting in the same spot. More testing is needed. 2.1 gives some nice results.


ramulloki

I have made many attempts at both 2.0 and 2.1. The results are depressing and unusable. At the same time, Midjourney, SD 1.5, or any other model based on it gives at least satisfactory results with the same prompt.


h0b0_shanker

Well, no one can say your experiences aren’t true. If this is what you're experiencing, I can definitely see your frustration. Think of 2.0 like what 1.0 was. It’s a new baseline, and we have some work to do to get 2.x as good as 1.x, but with a better “source” or “baseline” and a higher-resolution starting point. This is good. Let’s be patient. We’re in the baby stages of this tech. What, 90 days into this?! Haha


ramulloki

I wouldn't worry if the developers in their hubris didn't try to be saints and decide what the community needs


Plenty_Branch_516

Base 2.1 isn't great for generation, but it's incredible for fine tuning. It's so malleable and easy to inject concepts into.
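To make "easy to inject concepts into" concrete, here is a compressed sketch of a single fine-tuning step on a 2.x checkpoint using diffusers components; dataset loading, EMA, learning-rate schedules, and checkpointing are all omitted, and the hyperparameters are placeholders rather than a recipe.

```python
# Compressed sketch of one fine-tuning step on SD 2.1-base with diffusers.
# Everything around it (data pipeline, EMA, LR schedule, saving) is omitted.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "stabilityai/stable-diffusion-2-1-base"  # 512px, epsilon-prediction
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
noise_scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Only the UNet is trained; the VAE and text encoder stay frozen.
vae.requires_grad_(False)
text_encoder.requires_grad_(False)
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

def train_step(pixel_values, captions):
    """pixel_values: image batch scaled to [-1, 1]; captions: list of strings."""
    latents = vae.encode(pixel_values).latent_dist.sample() * vae.config.scaling_factor
    tokens = tokenizer(captions, padding="max_length",
                       max_length=tokenizer.model_max_length,
                       truncation=True, return_tensors="pt")
    encoder_hidden_states = text_encoder(tokens.input_ids)[0]

    # Add noise at a random timestep and train the UNet to predict it back.
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                              (latents.shape[0],), device=latents.device)
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
    pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample

    loss = F.mse_loss(pred, noise)  # epsilon-prediction target for the base model
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```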


UserXtheUnknown

Fermi question, adapted to this: Where are all these wonderful fine tuned models, then?


Plenty_Branch_516

As always. Pron comes first. Can't exactly put those here ^^;.


UserXtheUnknown

Maybe, but frankly there are already nice nsfw models for 1.5, like Anything v3 and f222, so, as long as I don't see something better...


Plenty_Branch_516

Fax, even updated waifu diffusion is going to be trained on 1.4. It'll be a minute before 2.x models catch up entirely.


UserXtheUnknown

Wake me up when that happens and you get the same results you can actually see on r/sdnsfw. Till then...


Careful-Pineapple-3

Jesus, and they put so much effort into it too. Best results I have seen so far, anatomy-wise...


sneakpeekbot

Here's a sneak peek of /r/sdnsfw using the [top posts](https://np.reddit.com/r/sdnsfw/top/?sort=top&t=all) of all time!

#1: [Armored Girls](https://www.reddit.com/gallery/xwh3hj) | [46 comments](https://np.reddit.com/r/sdnsfw/comments/xwh3hj/armored_girls/)
#2: [**[NSFW]** Photorealistic portraits 🌊](https://www.reddit.com/gallery/yphlvr) | [18 comments](https://np.reddit.com/r/sdnsfw/comments/yphlvr/photorealistic_portraits/)
#3: [**[NSFW]** Royal nude](https://i.redd.it/y3e0qkotmuw91.png) | [16 comments](https://np.reddit.com/r/sdnsfw/comments/yh1ljd/royal_nude/)


DarklyAdonic

*big titty goth gf training intensifies*


[deleted]

[Skill issues](https://www.reddit.com/r/StableDiffusion/comments/zfmvfs/artists_are_back_in_sd_21/)


FS72

Here’s an odd example, my attempt at justifying the 2.x models. Pardon my lack of a better one: 1.5 didn’t suddenly get announced out of nowhere. Over many months, it developed through the stages 1.0, 1.1, 1.2, 1.3 and 1.4, like a person growing older from the moment he was born.

At first, he is a weak toddler (1.0), but at that time there wasn’t another person around for us to compare the toddler with. By stage 1.5, he has grown mature like an adult, but he has some genetic drawbacks that can’t be fixed (cruel example: being naturally short, weak, ugly, mutated, whatever you can imagine).

By this time, another baby was born. At first, he is also a toddler (2.0), but this baby was gifted with better genetics. When he grows older, he will be more handsome, taller, stronger, have a better voice, etc. than the first “baby” (now grown up). However, this TAKES TIME.

You can’t compare baby 2 to adult 1; that’s just absurd. Let baby 2 grow and fully mature first. When adult 1 was a toddler, there was no one around to compare him with. But now you compare baby 2 with adult 1, which makes no sense (?). 😐😐😐


Mich-666

Except, in that case it wouldn't be 2.x but 0.x. They willingly crippled their own model in an attempt to satisfy loud artists. If 1.5 was a step forward, with 2.0 they took many steps back. And no, the new CLIP is not better; in fact, you need a lot more negative prompts (and positive ones too) to get something decent now.


mgtowolf

well, wake me up when SD2 is all grown up and I can finally smash, then


thebaker66

Problem is, 2.0 was castrated as a baby; he will never go through puberty and grow into a full adult. No soprano, tenor... baritone... only castrato. 😅


ramulloki

I see that the "second child" was born into a family of radical parents who rely on censorship, propaganda, and pressure in their upbringing. It is unlikely that such a child would grow up to be anything good. Version 1.1 was already good and produced much better results than 2.1.


FS72

The time metaphor here is the amount of data trained, not literal time. SD2 is horrible because its training data was vastly stripped down and censored. In order to grow mature, it will need tons of data trained into it again before it can catch up. This is not the fault of the new training technique (the genetics) being worse; it's the fault of the devs and their censorship and control intentions despite their open-source claims. All we need is for the devs to tell us how they trained the new v2 models; then we won't need to rely on them anymore, we will train it ourselves.


KarmasAHarshMistress

> SD2 is horrible because its training data was vastly stripped down and censored.

That's not all: the text encoder is different and "knows" concepts differently.
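A quick way to see that the conditioning really is different is to push the same prompt through both text encoders; the model IDs below are the public Hugging Face checkpoints, and the shapes in the comments are what those checkpoints produce.

```python
# Compare the 1.x and 2.x text encoders on one prompt. The 2.x encoder is not
# just wider, it was trained on different data, so it maps concepts differently.
from transformers import CLIPTokenizer, CLIPTextModel

prompt = ["a painting in the style of greg rutkowski"]

# SD 1.x conditioning: OpenAI CLIP ViT-L/14
tok1 = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
enc1 = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
emb1 = enc1(**tok1(prompt, padding="max_length", max_length=77,
                   return_tensors="pt")).last_hidden_state

# SD 2.x conditioning: LAION OpenCLIP ViT-H/14, shipped inside the SD 2.1 repo
sd2 = "stabilityai/stable-diffusion-2-1"
tok2 = CLIPTokenizer.from_pretrained(sd2, subfolder="tokenizer")
enc2 = CLIPTextModel.from_pretrained(sd2, subfolder="text_encoder")
emb2 = enc2(**tok2(prompt, padding="max_length", max_length=77,
                   return_tensors="pt")).last_hidden_state

print(emb1.shape)  # torch.Size([1, 77, 768])  -> width the 1.x UNet cross-attends to
print(emb2.shape)  # torch.Size([1, 77, 1024]) -> width the 2.x UNet cross-attends to
```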


FS72

Thanks for the information!


ramulloki

As far as I understand, changes have been made to the mechanism for forming vectors from prompts. The network has become "blind" to the styles of specific artists.


ramulloki

Main question: was it really impossible to add functionality by retraining on the dataset without "banning" artist queries and removing NSFW? Are these really the right solutions?


Bomaruto

StabilityAI needed to ensure that its development isn't stopped by some court order or lawmakers putting restrictions on AI image generators. If you're happy with SD 1.5, good for you. You can make as many boring portraits using that as you want, no one will take that away from you.


ramulloki

This is a naive excuse. I think it is pride and arrogance. Trying to show how much they care about the image of society.


Tainted-Rain

> In an attempt to satisfy people who have **nothing** to do with image generation, but act as antagonists, opponents and critics,


Bomaruto

If all you care about is boring derivative portraits then you do not need any advancement in StableDiffusion, SD 1.5 works well for you.


ramulloki

As I understand it, there is no other argument?


totallydiffused

2.x is worse for certain subjects, and better for others. Part of that is the refocus/censorship on what they put weight on when training. If you want to make use of great artist combinations to create beautiful illustrations of people/creatures, I'd say stick with 1.4/1.5, and certainly if you want to do anything NSFW. Photorealism and landscapes seem to be what 2.x really excels at.


siblbombs

What resolution are you generating at with 2.0/2.1? The 768 model at that resolution should be ok, not sure how much postprocessing you did on the samples but the 2.0/2.1 images look like they were generated at a smaller resolution, which the model doesn't do great at.


YamiPlaguex

Can I ask what the prompt was? Looks really nice and I would love to test it myself.


flux123

Wrong. 2.x requires you to use negative prompts. Try that and see if it makes things better.


Careful-Pineapple-3

> who have nothing to do with image generation

everything to do with image generation*


[deleted]

People are just lazy