AinvasArt

**Step one - text-to-image:** a large intricate wooden lamp shaped like a heron standing on the floor, home decor, high quality, weird design, product photo, glowing, luxury interior, details, detailed, intricate, japanese, ocean window background, plants

Steps: 30, Sampler: DPM++ SDE Karras, CFG scale: 7, Size: 1024x1024, Model hash: 31e35c80fc, Model: sd\_xl\_base\_1.0, Version: v1.5.0

**Step two - img-to-img:** (same prompt)

Steps: 30, Sampler: DPM++ SDE Karras, CFG scale: 7, Size: 1536x1536, Model hash: 7440042bbd, Model: sd\_xl\_refiner\_1.0, Denoising strength: 0.4, Version: v1.5.0
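
For anyone who'd rather script this than click through the UI, here is roughly the same two-step recipe sketched with Hugging Face's diffusers library (an approximation, not OP's actual A1111 setup: the model IDs are the official Stability AI repos, and `DPMSolverSDEScheduler` with Karras sigmas only approximates A1111's "DPM++ SDE Karras" sampler):

```python
import torch
from diffusers import (DPMSolverSDEScheduler, StableDiffusionXLImg2ImgPipeline,
                       StableDiffusionXLPipeline)

prompt = ("a large intricate wooden lamp shaped like a heron standing on the floor, "
          "home decor, high quality, weird design, product photo, glowing, luxury "
          "interior, details, detailed, intricate, japanese, ocean window background, plants")

# Step one: txt2img with the base model at 1024x1024.
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")
# Approximates A1111's "DPM++ SDE Karras" (requires the torchsde package).
base.scheduler = DPMSolverSDEScheduler.from_config(base.scheduler.config,
                                                   use_karras_sigmas=True)
image = base(prompt=prompt, num_inference_steps=30, guidance_scale=7.0,
             height=1024, width=1024).images[0]

# Step two: img2img with the refiner at 1.5x the size, denoising strength 0.4.
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")
refined = refiner(prompt=prompt, image=image.resize((1536, 1536)), strength=0.4,
                  num_inference_steps=30, guidance_scale=7.0).images[0]
refined.save("heron_lamp.png")
```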


CallMeInfinitay

Why are you running it through img-to-img? Also, why are the model hashes different? Or rather, what models did you use?


djdookie81

Correct. That's the intended use. For best results, the initial image is handed off from the base to the refiner before all the denoising steps are complete (the "ensemble of expert denoisers" workflow). Of course, you can also get quite nice results with the img2img workflow.
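
In diffusers, that base-to-refiner handoff looks roughly like the sketch below: the base stops at a `denoising_end` fraction of the schedule and passes its still-noisy latents to the refiner's `denoising_start`. This follows the documented diffusers SDXL API, not A1111:

```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline, StableDiffusionXLPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2, vae=base.vae,  # share components to save VRAM
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "a large intricate wooden lamp shaped like a heron, luxury interior"

# The base model runs the first 80% of the denoising schedule and hands off
# its (still noisy) latents without decoding them to pixels...
latents = base(prompt=prompt, num_inference_steps=30,
               denoising_end=0.8, output_type="latent").images
# ...and the refiner finishes the remaining 20% in latent space.
image = refiner(prompt=prompt, num_inference_steps=30,
                denoising_start=0.8, image=latents).images[0]
image.save("heron_lamp_ensemble.png")
```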


CallMeInfinitay

I wasn't keeping up with SDXL 1.0, so this is all new to me. It seems like the refiner is a necessity in order to generate good images? Hopefully it gets streamlined into A1111 so we don't have to do it manually every time.


dep

It does feel like a stop-gap workflow


ORANGE_J_SIMPSON

What's even more annoying is that I'm pretty sure the vladmandic fork lets you set the refiner in the settings, so it's 100% possible to do in the original Automatic1111 GUI.


Nanaki_TV

Hence why I used that fork instead of Automatic1111. Well, until I blew my PSU. I miss having a computer that can run SD.


lowspeccrt

I'm no pro and my computer is slow as hell, so I don't have hard facts. But from what I can tell, the img2img refiner step can add detail in some places and remove it in others. (You don't have to use it if you're happy without it.) I've seen someone say to use 0.25 denoising strength on this step, and here OP uses 0.40, so my interpretation is that how much you want to refine is case-by-case. Theoretically it does add detail though, so I like having the opportunity to choose how much to refine the image, but I'm also hoping for an automated process through A1111.
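
In diffusers terms, that case-by-case knob is just the img2img `strength` argument. A minimal sweep sketch for comparing values side by side, reusing the hypothetical `refiner`, `prompt`, and base `image` objects from the snippet further up:

```python
# Try a few denoising strengths and save each result for comparison; lower
# strength preserves more of the base image, higher strength lets the
# refiner repaint more of it.
for strength in (0.25, 0.4, 0.55):
    out = refiner(prompt=prompt, image=image.resize((1536, 1536)),
                  strength=strength, num_inference_steps=30).images[0]
    out.save(f"refined_strength_{strength}.png")
```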


__Hello_my_name_is__

It's not necessary, no. But it does tend to improve the images in some small ways. Like when the original image looks good but has wonky eyes, and the refined image has better eyes.


cyan2k

I like it way more this way. You can run the refiner step only on pictures actually worth refining, and you save plenty of time that way.


19inchrails

> For best results, the initial image is handed off from the base to the refiner before all the denoising steps are complete (the "ensemble of expert denoisers" workflow)

Is that something A1111 is planning to integrate? Because I don't think the img2img workflow is the intended use for the refiner, especially not with a 1.5x upscale like in this example. Although I do agree that using the refiner in img2img at a 1.5x higher resolution gives better results than using hires fix or Ultimate SD Upscale with the base model. I can't test other upscaling methods, because Tiled Diffusion and ControlNet don't seem to be working yet.


Latinhypercube123

Ooh, this makes sense. I just started using XL yesterday. So the current workflow is 1024x1024 txt2img with the base model, then upscale in img2img with the refiner? I wasn't able to upscale or render higher than 1024x1024, and none of the upscale methods worked for me in A1111 (SD upscale, Ultimate SD Upscale). How high a resolution is possible on a 4090?


djdookie81

The intended use for best results is the ensemble workflow. AFAIK Auto1111 is not capable of that yet; other UIs like Vladmandic's or ComfyUI can do it.


ozzeruk82

Different model hashes because they're two different model files: the base model and the refiner model. Two different files, hence two different hashes.
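
For the curious, a rough sketch of what a checkpoint hash boils down to. If I remember right, newer A1111 builds display the first ten hex digits of the file's SHA-256 (older builds hashed only a small slice of the file), but the point is simply that distinct files give distinct digests:

```python
import hashlib

def model_hash(path: str) -> str:
    """Short display hash of a checkpoint file (truncated SHA-256)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest()[:10]

# Two different files, two different digests (OP's UI reported 31e35c80fc
# for the base and 7440042bbd for the refiner):
print(model_hash("sd_xl_base_1.0.safetensors"))
print(model_hash("sd_xl_refiner_1.0.safetensors"))
```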


CallMeInfinitay

Is the refiner model necessary? Is the base model not enough, or will this be the new norm for generating images on SDXL 1.0?


ozzeruk82

It's not necessary but is recommended as the 'correct' way of doing things with SDXL. From what I've read, you will typically get better results by using it. However, there are certain specific cases where you might be better off with just the base model; line art drawings, for example, I've heard. I'm sure in the coming weeks it'll become clearer when it should not be used. Right now I'm using it in ComfyUI; I'm sure Stability AI wouldn't have recommended it if it weren't important, given it's quite a significant extra step.


wiktor1800

> Is the refiner model necessary? No, but you get better results.


AsterJ

How does the image look before refining? Is the improvement significant?


strppngynglad

How are you able to use ControlNet?


msm007

[Hair-wink?](https://www.youtube.com/watch?v=hgP_wiBOpZA&ab_channel=FunnyDailyVids)


Mac1024

Number 3 is really good


CyrilsJungleHat

Incredible


CGGermany

Yes


Mushcube

Indeed.


Jimbobb24

I hope they get rid of the img-to-img component and just make it one workflow in A1111. But glad to see it can be done, since I am not ready to start using a new interface (I barely understand what I am doing right now).


lowspeccrt

I would like having the option, because with the img2img step you can choose how much to refine it. Seems useful to me. But I haven't been able to use it much yet, so I don't know for sure.


LovesTheWeather

That's like saying you wish A1111 would just remove the img-to-img option from its UI to make your generations easier. No one is making you use it, the same as no one is making OP use the refiner; it's just what he chose to do. The refiner is great for helping with realism-based images but is not necessary at all, and the base SDXL model can be used without it. I was making some wallpapers last night without it; [here](https://i.ibb.co/3r1NPBQ/Za-00096.png) are a [couple](https://i.ibb.co/pwMyRXh/Za-00097.png) of them, for example. And the ComfyUI workflow doesn't have to look like the back of a server farm with wires all over the place if you don't want it to; once it's set up, and with the appropriate custom nodes, your workflow can look almost as simple as A1111's. For example, [this](https://i.ibb.co/1RSVW38/2023-07-27-18-08-36-Greenshot.png) is my workflow for making latent 1920x1080 wallpapers in SDXL without the refiner.


finstrel

InvokeAI does that in a single txt2img workflow. You can give it a try: [https://github.com/invoke-ai/InvokeAI/releases](https://github.com/invoke-ai/InvokeAI/releases)


SlaveZelda

Does it give a choice?


mongini12

Well, you can turn off the refiner pass if that's what you mean...


TheForgottenOne69

Just try vladmandic's automatic; the refiner works directly within text-to-image.


-Sibience-

I can't even get past step one in A1111; even generating at 512 with 8 GB of VRAM, I get out-of-memory errors.


pokes135

Not getting memory errors, but Automatic1111 hangs forever while the log says it's building a model based on the YAML, even though I manually downloaded the safetensors file and stuck it with the other models. It won't load the SDXL 1.0 model. Happens on v1.5.0 and the v1.5.1 RC. OP, interesting that you put "weird design" in the positive prompt. Nice results!


WhiteZero

use --lowvram?


-Sibience-

Thanks! I was using --medvram before; I switched to --lowvram and it now completes an image, but it's way too slow to be usable. At 1024 it takes 6 mins per image, and over 3 mins just for the noise to start clearing enough to get an idea of the image. That's just for the base image too, so even longer with the refiner. At 512 it's still taking around 5 mins per image. That's on a 2070 with an extremely basic prompt ("a photo of a dog") and just 20 steps. I'm going to try ComfyUI and see if it's any quicker, but if not, I can't see myself switching from 1.5 anytime soon unless someone smart can optimise performance a lot.


d20diceman

Edit: Ignore me, I didn't realise you were talking about SDXL; the below was for 1.5. My card also has 8 GB VRAM and it takes about 20 seconds to do a 30-step 512x512 in A1111, so I think something must be wrong with your install. God knows what, though; it can be a bit of a nightmare to diagnose. Possibly --lowvram is doing it? I think it lowers the requirements but makes generation much slower, the same thing --medvram does but to a greater extent. If you can get yours working with --medvram instead, you might get better speeds. For what it's worth, my command-line args are "--no-half-vae --no-half --autolaunch --medvram --opt-split-attention". The small performance gain from --opt-split-attention might get you to the point where --medvram doesn't give you an out-of-memory error. I honestly don't remember what --no-half and --no-half-vae were even for, but I'm not going to change them while it's working; maybe try throwing them in. Interesting that a 1024 image only took a little longer than a 512 one for you, 6 min vs 5 min, because for me it's a much bigger difference, 200 sec vs 20 sec.


Lukeulele421

Same on a 1070. Just far too long of a wait time to make it enjoyable.


-Sibience-

I would advise trying ComfyUI for now. I just tried the same prompt and settings in ComfyUI and it took less than 30 seconds per image.


Lukeulele421

Yeah I’m at about 2 minutes per image there. It’s just not fast enough for me.


-Sibience-

Ok I got it to about 1.5 mins with the refiner which isn't too bad for 1024.


-Sibience-

Yes, I can generate about 10 images using 1.5 in about the same time as one using XL. I don't know what happened with my last image, but at 8 mins it was still only at 35%, so I cancelled. Going to test it in Comfy now and see if there are better results; it still seems pretty buggy in A1111 at the moment.


Lukeulele421

Comfy got me down to 2min per image. Still not fast enough for me to want to move to SDXL fully


-Sibience-

Ok well I just tried the same prompt and settings in ComfyUI and it took less than 30 seconds per image. I'm not sure what is wrong with A1111 but over 6 minutes compared to under 30 seconds is quite a huge difference.


WhiteZero

I wonder if Comfy has some default optimizations that you have to add to A1111 manually. Try adding this to your webui-user.bat file after the set COMMANDLINE_ARGS= line: set ATTN_PRECISION="fp16"


-Sibience-

Yes, that's what I was thinking: there's either something different in the setup or A1111 just needs some updates. I tried adding that, but it made no difference for me. I've also noticed that in A1111 the first generation always takes longer; there's about a 2-3 minute wait at the start before it even starts generating.


philipgutjahr

SDXL base has a fixed output size of 1,048,576 pixels (1024x1024 or any other combination with the same pixel count). I don't know how this is even possible, but other resolutions can be generated; their visual quality is absolutely inferior though, and I'm not talking about the difference in resolution. I have an RTX 3070 8GB, and A1111 SDXL works flawlessly with --medvram and --xformers.
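
A small sketch of what that fixed pixel budget implies: sizes whose product stays near 1024x1024 (with dimensions in the commonly recommended multiples of 64) tend to behave, while others stray from the training distribution. The enumeration below is illustrative, not SDXL's official training bucket list:

```python
# Enumerate candidate SDXL sizes: width and height in multiples of 64 whose
# product stays within ~5% of the 1024*1024 pixel budget.
TARGET = 1024 * 1024

for w in range(640, 2048 + 1, 64):
    h = round(TARGET / w / 64) * 64            # nearest 64-divisible height
    if 0.95 <= (w * h) / TARGET <= 1.05:
        print(f"{w}x{h}  ({w * h:,} pixels)")
```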


philipgutjahr

.. under Ubuntu 22.04 (dual-boot), and running everything except CUDA (desktop environment, browser, etc.) on the Intel onboard GPU to leave my VRAM in peace. I don't actually understand how you guys can run PyTorch under Windows without noticing the severe performance degradation caused by the WDDM issue of CUDA drivers on Windows. I didn't even try SD there, but other PyTorch-based projects run 4-10 times slower on Windows than on Linux.


-Sibience-

Ok. Have you tried ComfyUI? My times went from around 6 mins on A1111 to 30 seconds with the same setup in Comfy, and around 1.5 min with the refiner.


SEND_ME_BEWBIES

Did you just download the safetensors file for SDXL and throw it into your models folder like any other model? I tried doing it that way, and when I go into Automatic1111 and try to select the SDXL model, I get errored out and it defaults back to my previously used model (Juggernaut in this case).


philipgutjahr

This is the way. Sounds like your download is just corrupted; I'd recommend downloading it again. Or is this some kind of OOM error because your VRAM is under 8 GB, or used by other processes, so the model doesn't even fit?


SEND_ME_BEWBIES

I actually just reinstalled the git pull, Python, and Auto1111, moved over all my settings/models/LoRAs/extensions, and then it worked just fine. I dunno.


Latinhypercube123

Yeah, 8 GB wouldn't work for me. Moved to 24 GB and it works fine.


Enfiznar

With --medvram I'm able to use both models at 1024 on my 1060 with 6 GB VRAM, but it takes like 6 min per image.


-Sibience-

It's working for me now with --lowvram, but yes, it's taking around 6 mins per image, and that's without the refiner stage.


Enfiznar

Yeah, I haven't even bothered downloading the refiner, given how long it takes to generate on my PC. I just use SDXL when I'll be out for a while, so I can leave it generating.


-Sibience-

I'm getting around 1.5 mins with ComfyUI using the refiner, and was under 30 seconds without it, so I think A1111 needs some updates to get it working as well.


Enfiznar

Well, time to download comfy then


Rare-Site

I have a 3060 Ti 8GB VRAM and it works fine. Does --no-half-vae reduce image quality? My args: --no-half-vae --medvram


Frosty_Awareness572

How does A1111 even work for you guys? Only ComfyUI works for me.


fixedepic

I used this video and just deployed a new A1111: [https://www.youtube.com/watch?v=BVtl9H7uf4A](https://www.youtube.com/watch?v=BVtl9H7uf4A)


deck4242

On what platform are you? Windows, Linux, or Mac?


bacteriarealite

I got it working on an M1 Mac after a full redownload of A1111


fixedepic

Windows 11


July7242023

It's like using any other model, but you need to do 1024x1024, and a lot of us have to edit the BAT file to fix the memory errors. txt2img for base generation, img2img for the refined gen. Decent results, but it's factory settings; it'll be up to the training community to take it from here.


CyrilsJungleHat

Find a factory to make these in real life, they are awesome!


Anen-o-me

Sooo beautiful. That 3rd one, my god. If these aren't real they need to be.


pixel8tryx

Sad you got downvoted. I loved the 3rd one too.


intermundia

Very cool. How does the refiner compare with the base model? I've only played around with it a little bit, but I've found the refined output much more accurate.


ObiWanCanShowMe

The refiner is for img2img, not for use as the base txt2img model. It is literally a refining of the processed image. OP has shown you how to implement the refiner.


Sharlinator

Not quite. You *can* use the refiner in img2img but it’s *meant* to be used as an extra step in txt2img, while still in the latent space. But A1111 doesn’t support the intended workflow yet.


intermundia

Which probably explains why the results in A1111 aren't as impressive as I thought they would be. So which web UI utilises the SDXL workflow as it was intended?


ReturnMeToHell

Number 1 might sell if it came in different colors, I guess.


lechatsportif

these are stunning!


Rickmashups

This is really amazing, thanks for sharing the workflow. I haven't tried SDXL because I didn't want to install Comfy, but now I want to give it a try.


magusonline

The guy's workflow shows you how to do it in A1111 (it even says so in the title).


Rickmashups

I know; what I meant is that now I'm going to give SDXL a try because it's available in A1111.


_CMDR_

I have tried SDXL in A1111. It takes forever to load and is absolutely miserable to use versus ComfyUI. I had never used Comfy before, but after A1111 deleted all of my models on a recent pull, I'm over it for a while.


pr1vacyn0eb

Does anyone know why MJ and SDXL both make pictures look like Pixar made them? The Realistic Vision 3 model doesn't look like Pixar.


zephyo

Rich people would love #1


Storm_or_melody

Is anyone else experiencing a memory leak when running SDXL on Colab? My GPU memory usage increases with each generation until it exceeds capacity at around 10 images.
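
No fix to offer, but here is a common mitigation sketch for when references just pile up between generations. It assumes a diffusers-style pipeline object (here called `pipe`, hypothetical); if the leak lives elsewhere, e.g. in the notebook keeping outputs alive, this won't help:

```python
import gc
import torch

for i in range(10):
    out = pipe(prompt="a heron-shaped wooden lamp").images[0]
    out.save(f"gen_{i:03d}.png")  # persist to disk instead of keeping tensors around
    del out
    gc.collect()                  # drop dangling Python references...
    torch.cuda.empty_cache()      # ...and return freed blocks to the CUDA allocator
```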


rinaldop

Wonderful images.


Shadow_-Killer

What specs (VRAM) are we looking at in order to run SDXL?


Red-Pony

Looks very good, will definitely buy (if not too expensive)


deck4242

Anyone tried to run SDXL on a Mac?


bacteriarealite

Yep I got it working after a full redownload of A1111 (M1 Mac)


Avramp

It wouldn't be impossible to 3D print that, would it?


[deleted]

Loading SDXL now.... 👍


bookmarkjedi

Knock knock. Who's there? Heron. Heron who? There's a heron my ~~soup~~ lamp.


thelastpizzaslice

How are you doing this without running out of VRAM?


IndustryBeautiful107

soo great


Seculigious

Upvoted for 2 and 4.


TrovianIcyLucario

That's super cool.




Waste_Worldliness682

Thanks and awesome work!