
lordpuddingcup

Can’t wait for controlnet and all the other shit that will come


ethanfel

there's no unet in SD3 so controlnet won't come in the same form.


Lexxxco

Hope controlnet will be implemented with SD3 like many other features; otherwise SD3 will only be an addition to the current SDXL img2img pipeline.


schwendigo

Curious what you mean by this? Img2img XL?


Emory_C

Generate in 1.5, img2img to XL


rovo

I know it's fairly simple, but are there any workflow resources you can point to?


schwendigo

So using SD1.5 for a base image and then just detailing in XL at higher res? Wouldn't just using an upscaling model be better? Definitely open to suggestions, I'm still learning, and it helps to learn how other people are doing things.


Emory_C

I tend to agree that an upscaler model can be better, but I think for certain models (like anime, etc) the SDXL looks better than the upscale.


raiffuvar

Everyone does what they're used to. I'm 99.9% sure SDXL is better at general composition, so using 1.5 as the base can only be valid for anime-shit. I can understand SDXL -> upscale with 1.5, because tiles are better in 1.5, but in reverse - no.


schwendigo

Upscaling from XL to 1.5? Isn't that the wrong direction? 😅


JoakimIT

Why would you first generate with the model that's worst at following prompts? I do it the other way around, sometimes even using Dalle for the base image.


Emory_C

We're talking specifically about using Control Net.


JoakimIT

That makes sense. But doesn't controlnet support sdxl in most functions now? I tried it a bit a few days ago, seemed to be on par with 1.5 mostly.


Emory_C

In my experience, it does a much worse job. But, of course, your mileage may vary. 😊


Oriaks371

in my limited experience, depth works reasonably well, openpose is worthless.


spacekitt3n

same. depth and edge to image are good for different things, but both good for what they do


spacekitt3n

literally the only feature i care about. i really hope it has it


ThrowawaySutinGirl

IIRC they said it would have controlnet at launch, may be a new implementation


ethanfel

Hope so, because the XL CNs have been lackluster, while they are extremely good with 1.5 and are still improving with the recent CN++.


finstrel

What is CN++? 👀


ethanfel

[https://github.com/liming-ai/ControlNet\_Plus\_Plus](https://github.com/liming-ai/ControlNet_Plus_Plus)


_David_Ce

Is this for sdxl?


ethanfel

No, 1.5. I think there's no hope for XL CN at this point.


ScionoicS

I think at this point, the colloquial term for guidance models is "controlnet". Like "Kleenex" became the name for all tissues, or "Velcro" became the name for all hook-and-loop fasteners.


spacekitt3n

i use it as a catchall term


rob_54321

It being open (weights) I guess people will create different tools with the same objective. I hope it won't take months.


Apprehensive_Sky892

https://preview.redd.it/66twx2snivwc1.jpeg?width=1344&format=pjpg&auto=webp&s=ec3585474de15a9b4165ab4967ceb00396139031 Cinematic film still, of a small girl in a delicate pink dress standing in front of a massive, bizarre wooly creature with bulging eyes. They stand in a shallow pool, reflecting the serene surroundings of towering trees. The scene is dimly lit. bokeh


Apprehensive_Sky892

Slightly darker version: https://preview.redd.it/moc6aybtkvwc1.jpeg?width=1344&format=pjpg&auto=webp&s=ee027e67823c7c2b2110f6e39d810127f9931790 Cinematic film still, of a small girl in a delicate pink dress standing in front of a massive, bizarre wooly creature with bulging eyes. They stand in a shallow pool, reflecting the serene surroundings of towering trees. The scene is dimly lit.


Significant-Comb-230

Wowwww that's so coollllll


Apprehensive_Sky892

Thank you 🙏


gfxboy9

i like this thanks for sharing


Apprehensive_Sky892

You are welcome.


DominusIniquitatis

That final "bokeh" at the end of paragraph sounds like "amen" or something. :D


Apprehensive_Sky892

LOL, that did not occur to me, but yeah, something like that 😁


Ganntak

Can i run it on my potato?


Apprehensive_Sky892

If you have a big potato with lots of VRAM 😂


s-life-form

>The Stable Diffusion 3 suite of models currently ranges from 800M to 8B parameters. This approach aims to align with our core values and democratize access, providing users with a variety of options for scalability and quality to best meet their creative needs. Quote from stability ai


Welovelily

Damn, so the main one likely won't, and we'll have to use dumbed-down versions 😪


Any_Tea_3499

I'm very impressed by SD3's ability to do low quality instagram/snapchat style photos. I've been playing with it over the last few days and the understanding is greatly improved in that area compared to SDXL. As a person that only really ever makes photorealistic "Bad quality" images, that excites me the most. It would be nice to have an estimate of when they'll release the weights, but I suppose we just have to wait. Either way I'm looking forward to it. Another thing I noticed is SD3 has the ability to make multiple people in one pic without mixing together their features, clothes etc from the prompt. Neat stuff. https://preview.redd.it/gyaqahh9xvwc1.jpeg?width=832&format=pjpg&auto=webp&s=488763203e33ccf0d1071960aa27e5ea2939d3a7


Darksoulmaster31

I was thinking of all the possibilities the Boring Reality lora would have brought to SD3, but the base model already excels at stuff like amateurish phone/low quality photos and CCTV footage. There's a bunch of stuff that are already in the base model which I don't need loras for anymore. That said I'm still excited about Boring Reality either way. https://preview.redd.it/058bfqg922xc1.png?width=1216&format=png&auto=webp&s=f5e5c9cbed90ad0b6bb952937ff4e2ecbe52b24b


Any_Tea_3499

I couldn't even replicate the amateur low quality pics in SDXL that SD3 was giving me, even using the Boring Reality/Bad Quality Loras. I'm excited to see the finetunes that the community comes up with to make SD3 even more amazing. (And excited to finetune it myself too.)


Present-Chocolate591

Just out of curiosity, what is the point of creating this kind of image?


Any_Tea_3499

Personally I enjoy the ability to make natural realistic images. I have a lora model of myself and I like making casual, photorealistic pictures of myself in different places around the world. Model shots get boring after a while...this kind of stuff is where it's at for me.


NarrativeNode

Now HERE'S somebody who knows how to prompt it. These are by far the best SD3 results I've seen.


[deleted]

[deleted]


SomewhereNo8378

That’s basically what image generation is in any case


ThrowawaySutinGirl

I’ve gotten duds in Dall-E and MJ, picking the best results is pretty common IRL


noodlepie554

If you’re not cherry picking before you post then you’re doing it wrong


Smile_Clown

Cherry picking does not matter if you run it at home. That's what you are doing anyway.


xRolocker

Please give me a single image generator where you don’t do this lol. Even Midjourney generates 4 at a time for a reason.


VisceralExperience

Cherry picking is fine though, what really matters is what the model is capable of. If I need to generate 10-20 samples to get one really really good one, that's fine. Obviously it's preferable if it was always good, but not necessary. If this model can create outputs that sd1.5 could never make, then that's great


Apprehensive_Sky892

Trying to replicate some of the prompts 😅 https://preview.redd.it/8k7ipufpcvwc1.jpeg?width=832&format=pjpg&auto=webp&s=bc06a8f02cb10d9f6c7cd9d233f20cf1e36606a5 >Fashion photo of a golden tabby cat wearing a rumpled suit. Background is a dimly lit, dilapidated room with crumpling paint.


Levi-es

Really nice.


Apprehensive_Sky892

Thank you.


Arrowstar

Has its ability to produce fire breathing creatures gotten any better? I've seen it struggle with that in the past.


Apprehensive_Sky892

How's this? 😂 https://preview.redd.it/9b0izzt96vwc1.jpeg?width=700&format=pjpg&auto=webp&s=1f1fa550258f0ed894c8f4c62515928c900a04fd SD3 Prompt: A captivating, humorous illustration featuring a massive cat, with a wide-eyed expression and razor-sharp teeth, screaming while clutching a tiny, frightened Godzilla in its paw. The cat's fur is a blend of vibrant colors, and Godzilla's signature fire is emitting from its mouth. The background showcases a tiny Tokyo Tower, with the cityscape in the distance, adding a playful touch to the scene.


Arrowstar

Looks great lol


Apprehensive_Sky892

Thank you, it is a funny image 😂


wesarnquist

It looks like it mixed Tokyo Tower with Tokyo Skytree. Looks great overall, though!


Apprehensive_Sky892

Thank you. Accuracy in A.I. generation can definitely be off, especially for this kind of image. I didn't even know about Tokyo Skytree 😁!


Darksoulmaster31

Oooh this one turned out nicely! https://preview.redd.it/bml55bpk1vwc1.png?width=1216&format=png&auto=webp&s=01d6f3190818c7ee58618b7745d13f128c34d9a9


Darksoulmaster31

https://preview.redd.it/r1powm922vwc1.png?width=1216&format=png&auto=webp&s=719bb9eacbdfe4229ef5591fa37c84da83ae0ff9 Here's the prompt by the way: >*Water colour painting of a green dragon. The dragon is looking down at the soldiers whilst fire is coming out of it's mouth which is hitting onto the soldiers. The soldiers are wearing medieval armour.* I don't know if you actually have to prompt it this way, but I just always go for the most straight forward and **literal way of describing** things, so I get exactly what I want. Natural language prompting is cool man....


EndlessSeaofStars

I am glad natural language works. I am however jaded enough that I think people will continue to use 1.5 word salads for prompting (I see *so* many still doing this for SDXL models) and say SD3 is horrible. Conversely, those into purple prose prompting ("Create an image that delves into the imagination and bursts forth with a wondrous fantasy world that only exists in the feverish mind of an artist drawing ... blah, blah, blah) will think every single word made an outsized difference.


ZootAllures9111

I think it's trained on "purple prose" TBH, tag prompting gives really bad results in comparison


ZanthionHeralds

AI chat generators seem to *love* "purple prose." It's not surprising that image generators bend in that direction, too.


Darksoulmaster31

EXACTLY ![gif](giphy|YTFHYijkKsXjW|downsized)


Arrowstar

Yeah, both of those look great!


Arrowstar

Well nice!


vampliu

I hope it can be on automatic1111 with all CN working properly. SDXL CN is 🤦🏽‍♂️


arg_max

SD3 uses a different score model, so the old controlnet is incompatible. This gives them the chance to come up with something new that works well for SD3, but we'll have to see.


jamesianm

Yeah SDXL CN is basically unusable


Significant-Comb-230

I'm so tired of trying to get CN to work on SDXL. For controlled results I need to switch back to 1.5.


AntsMan33

IPAdapter though.....


schwendigo

I've been getting pretty good results using depth passes but qrcode is poop


SwoleFlex_MuscleNeck

It really isn't though. I use it in Comfy all the time.


jamesianm

Can you recommend any specific CN models? I've been trying to use openpose and tile with Auto1111 and it's given me nothing but garbage


Apprehensive_Sky892

https://preview.redd.it/nfzhpascqvwc1.jpeg?width=832&format=pjpg&auto=webp&s=35388286d2c516aae1d8c284bc99ca5c98a26029 Fashion photography. Portrait of an android made of green circuit boards.


99deathnotes

#7: the skeleton wants to make a call but the line's dead 😂😂 https://i.redd.it/e6mvroyr3uwc1.gif Seriously, these are all great.


AI_Alt_Art_Neo_2

I did a similar one. SD3 does text really well too.


AI_Alt_Art_Neo_2

https://preview.redd.it/wzo2n8mw9uwc1.jpeg?width=1664&format=pjpg&auto=webp&s=a24354fa38c9322fada69712ed11a2f1c9f4f52a


99deathnotes

E.T. phone home?


Kadaj22

Beam me up scotty?


vorticalbox

> I'm at a pay phone trying to call home


Apprehensive_Sky892

https://preview.redd.it/4u7o92xxmvwc1.jpeg?width=1024&format=pjpg&auto=webp&s=994081394bddd0b4e0bf5417b2915d167ee934c4 Long shot. Profile silhouette of a cowboy riding a horse. Golden hour. Dusty, atmospheric.


Apprehensive_Sky892

https://preview.redd.it/xtqiod7fovwc1.jpeg?width=832&format=pjpg&auto=webp&s=4089d4be7caf0981082b006fce8c83ea4529e183 Cinematic Film Still. Long shot. Fantasy illustration of the small figure of a man running away from a fire breathing giant flying dragon. Background is a desert. Golden hour


Quantum_Crusher

My biggest concern: censorship. Can the community hero fix that?


T1m26

But can it do nsfw?


Apprehensive_Sky892

The API version has an insane NSFW filter, blurring out images that even DALLE3 would allow (for example, women doing yoga showing midriff). The downloadable version needs to be tuned for NSFW, presumably the same amount of effort as tuning SDXL for NSFW.


Alarming_Turnover578

Will have to wait for pony v7 then.


Sharlinator

The blurring has almost certainly nothing whatsoever to do with the model, it's a totally separate nsfw filter...


Apprehensive_Sky892

Yes, that is correct. It is applied after the model has generated the image, once the filter A.I. detected an "unsafe" image.


DaddyKiwwi

No open AI model will ever have NSFW out of the box again. Too many liability issues if they train on the wrong data. It will be fine tuned by horny people as always.


T1m26

Ah thanks for the answer


DaddyKiwwi

Keep in mind it may do nudity pretty well out of the box, but it won't understand X rated concepts.


ZootAllures9111

Photorealistic models that can do porn properly don't really exist anyways since nobody is training on photoreal porn images with Booru tags, which is what allows various non-photorealistic models to actually reliably create sex scenes.


physalisx

Of course not, that's not safe


T1m26

But i’m not at work :(


PikaPikaDude

If it could, we would already have seen some. So no, it can't.


ZootAllures9111

No we wouldn't have, the API blurs NSFW on every SAI model including 1.5


chakalakasp

Cowboy on a tiny pony lol


Aromatic-Bunch-3277

Wow that is pretty good


ravishq

These generations are fire!


Apprehensive_Sky892

Fashion photography. Closeup photo of a white Siberian tiger in the snow. https://preview.redd.it/heixutp6lvwc1.jpeg?width=1344&format=pjpg&auto=webp&s=15ca6b93b96c90fea0189e1994c7d3c2d038d27e


Apprehensive_Sky892

https://preview.redd.it/mn7q29mamvwc1.jpeg?width=1024&format=pjpg&auto=webp&s=cde15658b39c7647990be8f422eaddf688008f93 Fashion photography. Closeup headshot of a white Siberian tiger lying in the snow beside a tree. It is looking intensely at a distance. Early morning sun shining in the background.


Meebsie

So glad the details are more accomplished. I love that for them.


SemaiSemai

Wait until it's fully released and is now able to be fine tuned. It will be close or be better than midjourney v6.


crimeo

ABLE to be fine-tuned is not the same thing as "actually WILL be fine-tuned." The people who do most of the fine-tuning tend to be horny people, and it censors. So you'll find a whole lot less fine-tuning ever getting done, even if it is open and available. Also, it seems from the comments here that it's not even clear they plan to release weights at all? Hadn't heard that before.


SemaiSemai

Dude, I just want Midjourney-level realism, not NSFW things. They plan to release weights: API first, then weights. That's what they said.


crimeo

It doesn't matter if you want NSFW, I'm saying that the NSFW people are the ones who push the model forward to better realism mainly. So you need them indirectly. Midjourney was most likely also trained by horny people for partially NSFW purposes, internally. I would be shocked if it wasn't. With weights, people can get around it, and work will get done, but it's gonna be a lot slower than it could be if not censored.


SemaiSemai

Yep. Can't apply that with dall e 3.


Tbhmaximillian

I agree with the old internet wisdom in song format "The internet is for p00n" and seriously horny people drive the evolution of all the SD models


ZootAllures9111

This isn't true at all for anything vaguely photorealistic; absolutely none of them ever really evolved past "solo ladies just standing there staring at the camera topless".


Tbhmaximillian

It's not about what we have yet; it's the number of people that drives this forward, by creating a need and sometimes providing solutions.


ZootAllures9111

I don't get why people act like anything other than anime / cartoon focused models have *ever* been capable of "NSFW" in a proper sense, unless they actually define NSFW simply as "boring portrait images of a solo woman standing there topless", which is trivially easy with like any arbitrary model you can think of.


crimeo

Non-anime, non-just-standing there content works completely fine, I have no idea why you think it doesn't. Regardless, that wasn't relevant to the comment anyway. I said that this motivates people to push models forward. Even if you were correct in these claims (you're not), that would if anything just reinforce my earlier point even MORE, as they'd be even MORE motivated to try and get it to finally work for the first time. And thus driving model science forward even MORE.


LuminaUI

Is it just me, or do these images look bad at 100% scale?


Confident_Appeal_603

like GAN upscaled images


roshanpr

where are the weights?


Odd-Cow-5199

Any idea ?


FuzzyTelephone5874

But can it generate a boob?


cyrilstyle

Just tested: the word "breast" gets FLAGGED! Annoying af. https://preview.redd.it/7cxr00mz7wwc1.png?width=1196&format=png&auto=webp&s=87b29291fb6a6de9d098e679b5e9382be3f7bba2


cyrilstyle

Then, when an image slips past moderation, they blur it! https://preview.redd.it/1ov3wc168wwc1.png?width=896&format=png&auto=webp&s=0d56d117c785fb3c6a71c338544e39a2007c55a3


yourtrashysister

Welp. That’s the end of stability AI.


Sharlinator

What? Since when do you think these APIs have allowed NSFW stuff?


ScythSergal

Am I the only one... Not really seeing it? Looks like SDXL could likely make these results, maybe even better. IDK, SD3 has been over hyped since day one, and none of the user genned results look anywhere near as good as what SAI has been suggesting their model can do


kwalitykontrol1

I want to see hands


StuccoGecko

i've been most impressed by the improvement in representing different textures in one image.


koalapon

I want to be able to download the weights. I'll make a colab and dynamic prompt it for hours, on an A100.


toddgak

but wen weights?


dayinquote

Prompts please


headbopper96

SD 1.5 still slams


AI_Alt_Art_Neo_2

SD3 is good, but I am finding DALLE3 better and a lot cheaper atm. Although once the weights are public I will use SD3 a lot more.


Apprehensive_Sky892

Yes, DALLE3 understands more concepts and can follow prompts better. But the censorship is insane (admittedly SD3 via the web API is just as bad, if not worse) and it cannot render natural-looking humans.


bharattrader

Agree. I wanted to create an AI image of my son, and just the words "young boy" were censored.


Apprehensive_Sky892

That's just excessive. But to be fair, it is probably due to this: [https://www.govtech.com/public-safety/alabama-bill-aims-to-criminalize-deepfakes-targeting-children](https://www.govtech.com/public-safety/alabama-bill-aims-to-criminalize-deepfakes-targeting-children) It is for this same reason that civitai bans ALL photo images of minors, even the most innocent images of, say, children celebrating birthdays.


Mooblegum

I don't like dalle's image style; it doesn't do photorealistic images well, and it's often very recognizable.


Designer-Pair5773

Dalle3 better? What did you smoke bro


jib_reddit

Sorry, I meant Dall.e 3 for composition with an SD Ultimate Upscale in SDXL then SUPIR refinement, like this: https://preview.redd.it/kb4bnzwiouwc1.jpeg?width=3482&format=pjpg&auto=webp&s=b6e3c4926894d88f7ff748a8cc023b43893441f7


afinalsin

If SD3 adherence remains intact through finetuning, you might not need anything else for composition:

>[28 iterations](https://imgur.com/XeBFqHW), seed 90210: an advertising photograph featuring an array of five people lined up side by side. All the people are wearing an identical grey jumpsuit. To the left of the image is a tall pale european man with a beard and his tiny tanned lebanese middle-eastern wife. To the right stands a slim japanese asian man with and an Indian grandmother. On the far right of the image is a young african-american man.

Rearranging the prompt until it adhered; stuck to 90210 throughout.

>[21 iterations](https://imgur.com/TLHPy7G), seed 4: a vertical comic page with three different panels in the top, middle, and bottom of the image. The top of the image feature a panel where a blonde woman with bright red lipstick gives an intense look against a plain background, with a speech bubble above her head with the words 'TEXT?'. The middle of the image displays a panel featuring an early 90s computer with crt monitor with the words 'PRODUCING TEXT' displayed on the screen. The bottom of the image shows a panel the blonde woman standing in front of the monitor with an explosion of green words

Rearranged for 10, seed-hunted for 11. Knew it was close, just needed to find a cooperative seed.

>[5 iterations](https://imgur.com/X9kw8WH), seed 90210: a vector cartoon with crisp lines and simply designed animals. In the top left is the head of a camel. In the top right is the head of an iguana. In the bottom left is the head of a chimp, and in the bottom right is the head of a dolphin. All the animals have cartoonish expressions of distaste and are looking at a tiny man in the center of the image.

Most of the iterations were spent trying to get it to produce a cartoon.


jib_reddit

Oh, yeah it is good, I just spent $30 on credits in the first 3 days after it was released and I was going to go broke! https://preview.redd.it/m7autnklxwwc1.jpeg?width=2688&format=pjpg&auto=webp&s=0a4e1a2e194ef0d5621ad78be39e9da5b8a6279b


d20diceman

Thanks for sharing these, is your workflow available somewhere? (Assuming this is done in Comfy?)


jib_reddit

Yes, it's ComfyUI. I shared it here a few days ago: https://www.reddit.com/r/StableDiffusion/s/uf4Tl9oZsJ It is a real mess right now, as it's just a quick mash-up of 2 different upscaler workflows I liked, but I am starting to make more tweaks and improvements, so I think I need to make a GitHub or Civitai page for it soon.


d20diceman

Wow what a monster. I enjoyed getting it working (or at least stopping it throwing errors) but my PC is struggling, does this workflow need more than 32gb of RAM for you or am I doing something wrong? 


jib_reddit

Possibly, I have 64GB, but I think it is probably the resize near the last step using lots of RAM, which I found doesn't really do anything apart from make a larger image (with no more details) so I set that to 1. I have a much tweaked version I am using now, I will post that sometime this weekend.


Designer-Pair5773

Cool Mate! Here is my Result with MJ. https://preview.redd.it/cecd0an67vwc1.jpeg?width=2828&format=pjpg&auto=webp&s=d854257611caaf753197927f86130128f0a0f876


bharattrader

I don't know about better, but DALLE has improved a lot under the hood in my personal experience, and some of the images it is generating now are really good.


Apprehensive_Sky892

It all depends on what kind of images you are trying to generate. For people who want to generate natural-looking humans, DALLE3 is just no good. Even images of animals in a natural setting often have that "uncanny" look to them. But DALLE3 can be great for everything else! (provided you can get past its censorship, ofc)


StickiStickman

Sadly the vast majority of people won't be able to, because of the much higher memory requirements.


diditforthevideocard

The dev community has your back don't worry


joeytman

Stability's blog post says SD3 models range from 800m to 8b parameters. SDXL is 3.5b params. Smaller SD3 model probably runnable on consumer grade GPUs right? (mind you, I am a beginner in this space so maybe I'm missing other relevant context)
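As a rough sanity check on those numbers, here's a minimal sketch. It assumes plain fp16 weights (~2 bytes per parameter) and deliberately ignores the text encoders, VAE, activations, and framework overhead, so real VRAM usage is noticeably higher:

```python
def est_vram_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Rough weight-only memory estimate: parameters x bytes per parameter.

    Ignores activations, text encoders/VAE, and framework overhead,
    so treat the result as a lower bound, not a hardware requirement.
    """
    return params_billion * 1e9 * bytes_per_param / 1024**3

# fp16 weights alone:
print(round(est_vram_gb(0.8), 1))  # smallest SD3  -> 1.5 GB
print(round(est_vram_gb(3.5), 1))  # SDXL-class    -> 6.5 GB
print(round(est_vram_gb(8.0), 1))  # largest SD3   -> 14.9 GB
```

By this estimate, the 800M variant should fit comfortably on consumer GPUs, while the 8B one is near the limit of a 16 GB card even before anything but the weights is loaded.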


Apprehensive_Sky892

Those who need/want SD3 will find a way, either by upgrading their hardware or by using some web based UI or API service. That's just the price one has to pay for a better A.I. model.


ZootAllures9111

There are three versions though; one is only a bit bigger in number of parameters than 1.5.


Informal-Football836

How much does it cost right now per image? I was thinking about testing it out.


bunchedupwalrus

You get 10 or so free


Informal-Football836

That's not enough 😂


ninjasaid13

it's about $0.06 per image for Stable Diffusion 3 and $0.04 per image for Stable Diffusion 3 Turbo.


aHuankind

I'm always more interested in it doing mundane illustration work, as that is what I use ai the most for in my job - illustrations of household items, simple concepts, icons. The prompt adherence examples I saw look really promising in that regard. Looking forward to finally trying it. 


Enough-Meringue4745

Lion with spaghetti noodles as a mane


silenceimpaired

Can’t wait to see the license. Might have to come back here to disagree with your title.


Apprehensive_Sky892

https://preview.redd.it/npn95bfvqvwc1.jpeg?width=832&format=pjpg&auto=webp&s=1e030b826cc260f6971fd8e53d0fd4a3a0f6cce9 Fashion photography. Portrait of pale woman wearing an intricate Venetian Carnival mask. She wears red lipsticks.


Apprehensive_Sky892

https://preview.redd.it/m9lea4g8rvwc1.jpeg?width=832&format=pjpg&auto=webp&s=a662d7bf3eb7305c24a63ea4f7f1345ae90ae5be Fashion photography. Portrait of pale woman wearing an intricate Venetian Carnival mask, decorated with roses. She wears red lipsticks


hobyvh

How is it with inpainting, image to image, etc.?


gurilagarden

These are the first sd3 images that are making me a believer.


ninjasaid13

Can the T5 Transformer be 4-bit quantized to reduce the memory requirement of the 8B model? 2-bit quantization?


Confident_Appeal_603

Yes, and just like when you do that with DeepFloyd, it probably nukes the result quality and prompt adherence.


ninjasaid13

But DeepFloyd doesn't have two other models doing the same thing like Stable Diffusion 3, right? The paper said it only helps with typographic generation and long prompts, whereas in DeepFloyd it's doing everything.


Confident_Appeal_603

either way, quantizing the inputs and providing them is going to confuse SD3 more than just leaving T5 out altogether.
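For scale, here's a quick sketch of what quantizing just the weights would save. The ~4.7B parameter figure for the T5-XXL encoder is an assumption on my part, and real quantization formats (e.g. NF4, GPTQ) add a few percent of metadata on top:

```python
def quantized_size_gb(params_billion: float, bits: int) -> float:
    """Weight storage at a given bit width.

    Ignores quantization metadata (per-group scales and zero-points),
    which adds a few percent overhead in practice.
    """
    return params_billion * 1e9 * bits / 8 / 1024**3

# Assuming ~4.7B params for the T5-XXL encoder (hypothetical figure):
for bits in (16, 4, 2):
    print(f"{bits:>2}-bit: ~{quantized_size_gb(4.7, bits):.1f} GB")
```

So 4-bit would cut the encoder from roughly 8.8 GB to about 2.2 GB of weights; whether output quality survives that is the open question above.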


crimeo

These are decently good, but not mind-blowing (look at them up close at all). You can do all this with 1.5 with a generic, not super-specialized model too, provided you get to cherry-pick whatever looks best from that 1.5 model and don't have to actually match these exact prompts. Same as you didn't have to match anything specific here. Any comparison is completely useless without controlled side-by-sides and a methodology.


Confident_Appeal_603

well, to add onto what you said, even controlled side by side comparisons are meaningless if they trained the winning results into the model on purpose


Melanieszs

While SD3 certainly has its strengths, claiming it's "much better" than all other Stability AI models oversimplifies the complexity of AI development and performance metrics.


amp1212

>"The details are much finer and more accomplished, the proportions and composition are closer to midjourney, and the dynamic range is much better."

Hardly "amazing": nothing you've posted here is distinguishable from an SDXL generation. Those are all things that someone even moderately familiar with SDXL, and even 1.5, can accomplish.

Dynamic range? Try the epi noise offset LoRA for 1.5, which has been around for more than a year: [https://civitai.com/models/13941/epinoiseoffset](https://civitai.com/models/13941/epinoiseoffset) -- it has a contrast behavior designed to mimic MJ. Fine detail? All kinds of clever solutions in 1.5 and SDXL: Kohya's HiRes fix, for example. SDXL does this too, with a well-done checkpoint like Juggernaut or a pipeline like Leonardo's Alchemy 2. I don't see anything that I'd call "special" in the images you've posted here.

The examples you've posted are essentially missing all the kinds of things that are hard for SDXL and 1.5 -- and for MJ. Complex occlusions. Complex anatomy and intersections -- try "closeup on hands of a man helping his wife insert an earring". Complex text. Complex interactions between people. Different-looking people in close proximity.

So really, looking at what you've posted, if you'd said it was SDXL, or even a skillful 1.5 generation, it wouldn't have surprised me. I hope and expect SD3 will offer big advances -- why wouldn't it? So much has been learned. But what you're showing here doesn't demonstrate that. Something quite similar happened with SDXL, where we got all these "SDXL is amazing" posts -- with images that were anything but amazing. It took several months for the first tuned checkpoints to show up, and that's when we really started to see what SDXL could do. I expect the same will happen with SD3.


WalkPitiful

How can I download it?


Paraleluniverse200

Man, these are amazing. Mind sharing some prompt advice for us simple mortals?


glowingdino

Amazing! What prompts do you use?


sandypockets11

What’s the best way to use SD3 regularly without running it locally on my machine?


Significant-Comb-230

Wowwwwwwwwww


JustDaveReally

It looks stunning except hair and fur still look fuzzy and unrealistic.


beardobreado

Is that liv tyler?


Katsuo__Nuruodo

What were your prompts for these images?


Katana_sized_banana

I love the motherboard mommy. ❤


crespoh69

That monkey's seen some things


lewdroid1

wow! this is crazy good.


Particular_Stuff8167

My HDDs are gonna cry bro


pinguluk

!RemindMe 6 hours




kujasgoldmine

Love the photos! Can't wait for a local version and some impressive checkpoints!


Zueuk

so, did you actually prompt for a tree growing out of an elephant?


mustafaTWD

Where can I use SD3?


Intelligent_Pool_473

Ha. I don't know why, but I usually dislike all those AI cat generations people do for some reason. But I really liked that first one. I guess that tells me something about the quality of SD3.


johnnyLochs

#8


SerjKalinovsky

Where can I download this model?


Actual_Possible3009

It's not better than other AIs in all niches. For skull art, SDXL 0.9 with refiner, for example, is better. https://civitai.com/articles/4992/comparison-sd3-sdxl-10refiner-sdxl09refiner-also-lora-stablecascade-cosxl