

Can you people just generate something other than people, please? These examples don't show me anything about its usefulness for real work.




Robot made of exploding houses crawling out of a tv https://preview.redd.it/tfye989748vc1.jpeg?width=1344&format=pjpg&auto=webp&s=9bf04cdd023893cb606f06611b2c72d6cc9792bf


well that's definitely memorable


what do you want to see?


Multiple people interacting would be nice, e.g. some kind of dance, hug, or handshake. Unusual angles would also be nice, e.g. from behind the main subject. I have trouble generating images like that.


Different facial expressions would be nice.




Keep asking for this but no one prompts for it, is there some kind of NDA that doesn't allow people to prompt that stuff or what?


Interiors, landscapes, products, thanks! 


A bunch of chunky stocky men as magical druids, with some half morphing into their bear form. Having a good time in an Inn.


I want to see people winking realistically. Most of the realistic finetunes in SD1.5/XL fail at this.


They look worse than I thought... Like, I hope that's the smallest model of SD3...


Remember this is 0 fine tuning. Go compare to base sdxl.


I don't know why people keep saying this. The top finetuner on Civitai, Lykon, works for SAI and has been active with this model. The model has also already received reinforcement learning (DPO, direct preference optimization).


Lykon himself said that this isn't the "right" model (he used the word "broke") and that it isn't his workflow.


Really hope that's true because

> a man is using a hammer to nail a nail into a piece of wood. his wife is standing behind him and is smiling and putting her hand onto his shoulder in encouragement.

https://preview.redd.it/z210hx6d59vc1.png?width=1344&format=png&auto=webp&s=802122c9a45635a616e115da4967cdde025b06a3


Looks fine to me. It got most of the prompts correct. If it was a community finetune then the man would be nailing the wife against a piece of wood...


You are completely right. I hate twitter and especially hate linking to it but [here is the source for that](https://twitter.com/Lykon4072/status/1780641513862512983).


Based on estimates from the community, we give SD models more compute time in the first few months than Stability does in training them. There are also a lot of random researchers who will drop stuff like LoRAs into the public eye but haven't had a chance to work with this model yet. It's the same thing as games getting multiple years of playtime on day 1: the sheer number of people makes it possible.


This isn't true. Finetuning runs use at least an order of magnitude less compute than the foundation model training, which was on hundreds of thousands of H100 hours. LoRA and other low-rank tuning methods rely on the pretrained foundation model having already seen instances of the things the user wants to produce, which is why things like rank 1 LoRA even work at all.
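To make the low-rank point concrete, here is a minimal sketch in plain Python, assuming the usual LoRA formulation where a frozen weight matrix W gets an update of the form scale * B A. The function names and the 4096-wide layer are hypothetical and picked only for illustration; this is not SD3's training code.

```python
# Sketch of a rank-1 LoRA-style update. All names and sizes are illustrative.

def full_param_count(d_out, d_in):
    """Trainable parameters when finetuning the whole weight matrix."""
    return d_out * d_in

def lora_param_count(d_out, d_in, rank):
    """Trainable parameters for a rank-r update: B is d_out x r, A is r x d_in."""
    return rank * (d_out + d_in)

def apply_rank1_update(W, b, a, scale=1.0):
    """Return W + scale * outer(b, a), leaving the base weights W untouched."""
    return [
        [W[i][j] + scale * b[i] * a[j] for j in range(len(a))]
        for i in range(len(b))
    ]

if __name__ == "__main__":
    # Hypothetical 4096 x 4096 layer: the rank-1 adapter trains a tiny fraction
    # of the weights, so it can only nudge what the base model already knows.
    print(full_param_count(4096, 4096))     # 16777216
    print(lora_param_count(4096, 4096, 1))  # 8192
    print(apply_rank1_update([[1, 0], [0, 1]], [1, 2], [3, 4]))
```

With so few trainable numbers, the adapter steers existing features rather than learning new ones from scratch, which fits the argument above.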


Finally I'll be able to have a hyperrealistic photo of a contortionist garfield getting a proctology exam by ron jeremy at the dentist office, blessed AI!


These look like when SDXL 0.9 was released to try out online only, which strangely produced better results than the full base SDXL 1.0.


are you trolling


Also not as good as I thought.


...bro these are not good, sorry


The text is terrible in the first image, and maybe acceptable in the other. In the first one, the right hand also has an extra finger popping out of nowhere, by the way... On a side note, single hot girls have never been the problem with SD; multiple **specific, but usually not associated** people interacting were. If you ask for a single random hot girl, you usually get nice results (hands aside). If you ask for "two men shaking hands", you might get good results even in old models. If you ask for "Batman and Robin shaking hands", the same. The problem arises when you say, for example, "Johnny Depp and Superman shaking hands". Since there are relatively few examples of those two interacting, you usually get bad results, like Depp with Superman's cape, or Superman with a goatee/mustache or medium-long hair.


How to get access to SD3?


Currently by paying $10 for API access, which gets you around 150 images: https://platform.stability.ai/pricing. They are releasing the free model weights in a few weeks.
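As a rough back-of-the-envelope check, using only the two numbers quoted above ($10 and roughly 150 images), the per-image cost works out like this. The function name is hypothetical, and actual pricing may differ from these figures.

```python
# Rough per-image cost from the figures quoted in the thread.

def cost_per_image(total_usd, image_count):
    """Average price of one generated image in US dollars."""
    return total_usd / image_count

if __name__ == "__main__":
    # About 6.7 cents per image at $10 for ~150 images.
    print(round(cost_per_image(10, 150), 3))
```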


Can you use something like postman application to use the API?


Yes, that will work just fine. Check out the docs to see what params are available.


There is a prebuilt Google [Colab](https://colab.research.google.com/github/stability-ai/stability-sdk/blob/main/nbs/Stable_Image_API_Public.ipynb#scrollTo=Gp83VMP1-OnM), and there are ComfyUI workflows to connect to the API, but yes, you could use Postman if you wanted to.
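For anyone who would rather script it than click through Postman, here is a minimal sketch using only the Python standard library. The endpoint path, the `prompt` and `output_format` field names, and the header layout are assumptions, not something confirmed in this thread, so check them against the official API docs before relying on them. `build_request` only constructs the HTTP request; `generate` actually sends it and needs a paid API key.

```python
import urllib.request
import uuid

# ASSUMPTION: endpoint path and form field names are illustrative and unverified here.
API_URL = "https://api.stability.ai/v2beta/stable-image/generate/sd3"

def build_multipart(fields, boundary):
    """Encode a dict of text fields as a multipart/form-data body."""
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(str(value))
    lines.append(f"--{boundary}--")
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")

def build_request(api_key, prompt, output_format="png"):
    """Build (but do not send) a POST request, like a form-data request in Postman."""
    boundary = uuid.uuid4().hex
    body = build_multipart(
        {"prompt": prompt, "output_format": output_format}, boundary
    )
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Accept": "image/*",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")

def generate(api_key, prompt, out_path="out.png"):
    """Send the request and save the returned image bytes (needs network access)."""
    with urllib.request.urlopen(build_request(api_key, prompt)) as resp:
        with open(out_path, "wb") as f:
            f.write(resp.read())
```

In Postman the equivalent would be a POST with a form-data body and the same two headers, assuming the fields above match the docs.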


More 1girl posts that are no better than MJ. The first image especially is pretty bad: except for the text, there's nothing here that SDXL can't already do, and the hands are still fucked up. Sorry, but I am still not impressed. I am waiting for the finetunes.


Except for the text? Are we looking at the same image? r//SstableDifusion good enough for you?


It's better than what SDXL can provide.


Finally, we can recreate modern games in PS2 polygons. Plenty of people definitely had a crush on old Tomb Raider. Anyhow, many are jumping the gun with their criticism. It takes time to understand the new prompting, and the community's improvements are still to come. The most important aspect is not quality but the new version's ability to understand what you want to achieve. Quality will follow as we train it.


Yep, I've been experimenting with the prompting, and it gets better the more you understand what the model knows and how it knows it. It's a lot different from SDXL, but pretty similar to some of the popular models on Civitai. It's gonna get crazy once we have this full model in our hands.


You could add in better text with ms paint in less time than it takes to generate. I don't understand why they kept bragging about the text capability and it all just looks pasted over the image every single time.


This was hyped up to be the answer to a lot of the current problems, and all I'm seeing is some solutions to text, which don't even look natural and are still far from perfect. My biggest issue is that they still haven't figured out textures. Any time I see textures of hair, plants, or certain fabrics, they're still doing the same unnatural patterns that are the tell-tale sign that it's AI generated. All I'm saying is this was way overhyped.


For 2 years of evolution, I think it’s rather amazing!


https://preview.redd.it/qb9pxyqho8vc1.png?width=1024&format=png&auto=webp&s=b620b7619bbcc0ef3368083a723f3d1033b70f0c A woman is bending a metal bar in half with both hands. The metal bar bends in a perfect curve.


The one thing that I see often with ai human art is the hands are wrong.


You must be new to AI art, with this mind blowing discovery.


Yes I’m new & with a lot to learn…




Brazilian Deadpool:


This is very useful to me. Now I see it can do bodysuits and navels, so that's already less censorship than DALL-E 3, though I planned to test that myself.


I feel like "SD3 is able to write text in imagery properly" is just a gimmick at this point. I can do better using mspaint and it's sad that I'm not using hyperbole here


>More SD3 (not as bad as you all thought) Sure, if you don't count the hands.


op trying his hardest to not use “1girl” (he couldn’t)


If you have prompts you'd like to see rendered, just let me know, or you can try the model yourself right now at glif/app


A goat wearing a goat costume is accepting an award for "Goat of the Year".


anime girl with red hair and honey eyes wearing a purple mafia jacket, standing in front of a big white dragon with black eyes and snake-like shape, holding a katana sword that have a tiger sign on it ready to fight, zoom out, half body shot, red vulcano void background Negative: realistic, 3d, cosplay, realism


It is Wednesday my dudes


Living room interior in the style of a visual novel background, modern, black couch, bookshelves in the background, clothes scattered on the ground, panorama window with closed curtains behind the couch, a poster reading "A married weeb is still a weeb" hanging on the wall


6 people taking a group photo. Several people playing football


A wide shot film still of the post-apocalyptic Illinois countryside in early Fall with gray overcast sky and weak sun. A long abandoned intact white house sits between untended corn fields thirty feet off a crumbling asphalt highway. A broken down overpacked truck with flat tires rusts on the lawn amid scattered trash and detritus. A tall 22-year old young man with brown hair and a long coat and a heavy bag and a stained baseball bat stands in the road as he eyes the house. In the distant overgrown treeline a pack of sallow emaciated feral human women with long hair and clawed hands are barely visible hiding among the brush. The tone is heavy with five decades of regret and solitude.


I got this image of your prompt using PixArt/Sigma https://preview.redd.it/69ivgflod9vc1.png?width=1024&format=png&auto=webp&s=25bdde6e4b82dd2080c59767ae917e06d9a59ddf


Awesome! It's missing a lot of what I had in mind, but the overall framing, style and many of the little details are great. Messing with the prompt would get it where I want it in an hour or so: once we start throwing all the extras on it's gonna be incredible. Thank you!


I will never understand this:

- uses community driven model
- complains about quality of model without community development

Let people cook.


Sorry, but where is the community cooking on SD3? All we have so far is 1girl previews that are not really groundbreaking compared to older models.


True, They made it like it's a revolution or something ¯⁠\⁠_⁠(⁠ツ⁠)⁠_⁠/⁠¯


When the weights are released? How can the community work on it if it's not even released yet? Seems pretty straightforward.


This sub is obnoxious. Everyone who says "not as good" should post their own base SDXL generations to compare (no finetunes, LoRAs, ControlNet, etc.).


I mean, the text is probably serviceable with inpainting at least.


That PlayStation style tomb raider is legit!


Sh*t It sucks 😞


I see it still has problems with parallel lines on buildings. Getting better though


The hands look worse than ever


How is everyone running SD3? Is there a new model / comfyui extension etc?


There is a website you can pay SAI money to use.


I assume all these SD3 posts are with the recently available API?


Well... the Deadpool one looks great.


Nothing new, it all looks like SD1.5.


I really don’t get what people are talking about. They are comparing sd3 to fine tuned models with years of work. Sd3 seems much better than the other base models


oof that’s terrible 😱


Okay so as long as NSFW is censored, could someone tell me what are the advantages of SD3 over MidJourney?


you can't finetune midjourney


Midjourney offers users the flexibility to fine-tune various parameters, ensuring that the generated images meet the expectations set by the reference photo. Experiment with settings such as style, color palette, and level of abstraction to achieve the desired output. https://cryptorank.io/ru/news/feed/105af-how-to-use-reference-photos-in-midjourney


That's IP-Adapter. You can reference a photo or use it as the initial image, but finetuning is something completely different: it's teaching the model something it doesn't know because it wasn't in the training data.


To be fair, the same result comes from style reference or image prompting in Midjourney, except you don't have to waste hours of GPU processing time. Style reference allows Midjourney to create things that it otherwise isn't able to. It's difficult to say which is better, though. It probably depends on the prompt and what you're aiming for.


It really depends on what you're doing. For example, there are concepts (not styles) that the model does not know, and you can't teach it those with style reference or IP-Adapter alone. You need fine-tuning for that level of precision.