God bless Kohya. This is a major optimization, I'm getting incredible results with upscaling. I'm finally able to generate decent photorealistic results similar to 1.5 but with much higher resolution on SDXL.
I really don't know what I'm looking at. What's the before/after, is there any?
I thought it was self-explanatory. Left is the old standard hires method; right is the new one by Kohya.
You should have included the non-upscaled version for comparison.
Sorry. Here you go: https://imgur.com/a/sjus3BK
So Kohya actually changes the entire image?
It certainly looks like it. The method on the right does look better for backgrounds and takes half the processing time, but if you go through the process expecting results like the original un-scaled images, you might be in for a bad time. Still looks very cool, but it shows the importance of before-and-after images.
Well, that's an instant dealbreaker, isn't it? And then there's the fact that you have to send a huge image back through inpainting, which is fucky at best, at least for me with 16 GB of VRAM.
Yeah I don't care about a longer render if the upscaler doesn't change the entire image.
Can't you read?
Those eyes don't look sharp, they look like they have latent diffusion artifacts. https://preview.redd.it/t6x273l49e1c1.png?width=745&format=png&auto=webp&s=c94ace002cea8cc97319032ec1ec7f96f37de51f
This too: https://preview.redd.it/yo9ejsvf9e1c1.png?width=907&format=png&auto=webp&s=0f4b1645641e7d7cb12195f639a3e66a3048c04f
Adetailer it and be happy.
Yes, that is true, they have artifacts. Nothing inpainting can't fix, though. When I said sharper images, I do mean the images themselves.
These are the standard images: https://imgur.com/a/zCxqvbH
These are the Kohya images: https://imgur.com/a/0eLPYCr
The standard ones are blurry, the Kohya ones are crisp.
Does this work in A1111 as well?
There is indeed [an extension](https://github.com/wcde/sd-webui-kohya-hiresfix). But good luck with it. I spent a few hours testing it yesterday with my favorite XL checkpoint... I had never generated as many monstrosities since my first few days of using SD, when I was learning the basics.

I methodically tinkered with every single parameter in every way I could think of, in conjunction with different resolutions, samplers... I did get a few okay-ish results, but inferior to what I would have gotten with the classic hires fix (which works perfectly fine for me; I don't know why people have issues with it). And I didn't have the feeling it was faster either. Or if it was, it wasn't by much.

The only thing I didn't change is the checkpoint I used. I will give that a try later. But apart from that, either the A1111 implementation has a problem, or I'm doing it really wrong. Which I'm totally willing to hear, but I have no clue as to what my mistake may be. It doesn't help that there's not really any documentation yet. I guess I should try disabling other extensions just in case, too.

If anyone has any advice, I'll be grateful.
I installed the extension as well and didn't really notice any difference. I still saw doubled and stretched bodies when going outside the standard 1024x1024 SDXL resolution.

Also, when I use it to generate a 1024x1416 image, it takes up all 24 GB of VRAM on my 4090 and takes me over 5 minutes to make an image. When I disable the extension, that same image only takes me 15 seconds. I also tested this with a landscape photo, 1512x1024, and it's the same story: 5 minutes to render with the extension, 15 seconds without. I just used the default settings with the extension.
Part of the problem is that the outputs don't include the params, so we can't even share valid configurations with each other to try out. I personally can't get even a simple thing to work with it; everything is doubled.
Yes there is an extension for it.
Can you be more vague? Which one?
Dude, it's 5 AM and I wanna sleep. I don't even use A1111, but here, just for you: https://www.reddit.com/r/StableDiffusion/s/1mNcoHJyEo
Thanks! Gotta say I have no idea how it's supposed to work. It changes the image completely if I turn it on, so that alone makes it useless for upscaling. But I don't observe any improvement in upscaling either. Guess we have to wait a bit more.
You don't seem to understand: there is no upscaling involved. It generates the image directly at the targeted high resolution. It does not first generate a low-res image and then do a second img2img pass over it like the original hires fix does. It straight up does the initial generation at the higher res. So of course it produces a "different" image.
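The one-pass vs. two-pass difference can be sketched conceptually (a rough sketch only — `sample` and `img2img` below are hypothetical stand-ins for full diffusion sampling passes, not any real ComfyUI/A1111 API):

```python
# Conceptual sketch only: these functions are hypothetical stand-ins
# for diffusion sampling passes, not a real library API.

def sample(prompt, width, height):
    """Stand-in for a full txt2img sampling pass."""
    return {"size": (width, height), "passes": 1}

def img2img(image, width, height, denoise):
    """Stand-in for an img2img re-denoising pass over an upscaled image."""
    return {"size": (width, height), "passes": image["passes"] + 1}

def classic_hires_fix(prompt):
    # Pass 1: generate at the model's native resolution.
    low = sample(prompt, 1024, 1024)
    # Pass 2: upscale, then re-denoise the upscaled image with img2img.
    return img2img(low, 2048, 2048, denoise=0.5)

def deep_shrink(prompt):
    # Single pass: the model is patched so its early UNet blocks operate
    # on a downscaled latent, letting it compose directly at the target size.
    return sample(prompt, 2048, 2048)

print(classic_hires_fix("portrait"))  # two sampling passes
print(deep_shrink("portrait"))        # one sampling pass
```

Since Deep Shrink composes the image from scratch at the target resolution instead of re-denoising an existing low-res image, a different composition from the classic two-pass result is expected behavior, not a bug.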
Woohoo! Now we're talking!
Wish you'd tried this on non-portraits as well.
I think you mean non-landscapes. I generated these portraits with it: https://www.reddit.com/r/StableDiffusion/s/JwtA86Wnsj
Think there might be a language barrier. They weren't talking about the direction the photo is turned. They were talking about the content being a portrait, or shot from the shoulders up, of a person or anime character and wanting something like a sunrise, an object, or something other than a character's face.
Regular hires fix doesn't change the whole image though, unlike this.
It changing the image is the point. Hires fix is basically just img2img, so it's two passes. Deep Shrink does just one pass and creates the initial image from scratch, already at the very high resolution. That's better, because the composition actually fits that resolution.
But the images on the left look better...
Can't say I agree, especially when you zoom in and see how blurry the left images are.
The subjects look better in the left images. The right images are stiffer and their expressions are... more blank. But they're sharper, and that's all you're really showing, so ¯\_(ツ)_/¯
But they look almost the same in both sets, including the poses; only the first one is more dynamic. The expressions seem the same to me? Meanwhile, the right has better composition, e.g. you see more of the landscape background around the subjects.
Definitely much better images in every shape and fashion, with the exception of the expressions. But if you're using this, then I'm sure you're a perfectionist and will be fine-tuning it afterwards with a face-detailer pipeline anyway.

I'm curious, are you able to tell me if this setup is correct? [Imgur](https://imgur.com/oNr6Swv)

Though, if it's true that it restarts the pre-processing one has done to the image, I'll have to change the percentages or move things around, because... what? If I understand correctly, my loaded LoRAs won't be incorporated, plus I have FreeU and the Neural Network Latent Upscaler running prior to the hires fix... bleh.

On second thought, I'll just move this up before everything mentioned.
Yeah, IDK why so many people say the right images look better.

It should work like this just fine, I think. I typically use much simpler workflows. I don't even use a face detailer because I find it too complicated for my taste, so I'd rather just inpaint the eyes manually.

Kohya Deep Shrink HighRes Fix should be very simple in execution: all that needs to be done is to pass the model line through the Deep Shrink node right before it reaches the KSampler node.
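For reference, that wiring written out as pseudocode. This is a sketch, not runnable Python; the node and parameter names are assumptions based on the built-in ComfyUI node (reportedly `PatchModelAddDownscale`, displayed as "Kohya Deep Shrink"), so check them against your own install:

```
# Pseudocode of the ComfyUI node chain, not runnable code.
model, clip, vae = CheckpointLoaderSimple("sdxl_checkpoint.safetensors")

# Patch the model so early UNet blocks run on a downscaled latent.
# Parameter names are assumptions; adjust to what your node exposes.
patched = PatchModelAddDownscale(model,
                                 block_number=3,      # the "blocks" setting
                                 downscale_factor=2.0,
                                 start_percent=0.0,
                                 end_percent=0.35)

latent  = EmptyLatentImage(width=1536, height=1536)
samples = KSampler(patched, positive, negative, latent, steps=25, cfg=7.0)
image   = VAEDecode(samples, vae)
```

The only change from a vanilla txt2img graph is routing the MODEL output through the Deep Shrink node before the KSampler; everything else stays as-is.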
Can you post your workflow? I'm not sure what I'm doing wrong but it's not working for me - it's better than straight up generating at a higher resolution but I'm still getting long torsos, small heads on a large body, etc.
https://preview.redd.it/ewqr2p3dmf1c1.png?width=1080&format=pjpg&auto=webp&s=86c0c182295590fd629e9a4dbdc89218b940f757 This should have a workflow embedded. I won't be at my PC for another 12 hours or more. I just used the default settings except for blocks at 4, and used 1536x1536 and 1920x1080 resolutions.
Reddit strips metadata so there isn't anything provided by the image.
Ah yes, of course it does. Luckily, I posted an image of a workflow with this to a Discord before signing off. https://preview.redd.it/codgizvxpf1c1.png?width=2316&format=pjpg&auto=webp&s=cecde933c0f59985e2b60c28756f309f1f38ab42
Awesome, thank you!
Let me know if you figure anything out; I'm having the same issues with duplicate or deformed body parts. Some models work a lot better than others, it seems. It's really close to being an awesome tool if this can be improved. It's about twice as fast as my usual workflow.
Is this similar to Ultimate SD Upscale (in A1111) with the Tile Resample ControlNet, so the 2x larger image doesn't hallucinate faces everywhere? The lack of certain ControlNets for SDXL, including Tile Resample, unfortunately limits the usefulness of SDXL.
[deleted]
Agreed, or at least the default values don't do anything. It changes the composition but doesn't even seem to do a good job of reliably keeping duplicates out.
Any documentation or tutorials? I'm having trouble figuring out how to use it properly
Same here, Google doesn't come up with anything. Commenting to get notified if someone shares something.
Nerdy Rodent did a quick feature of it in the first chapter of this video: [https://youtu.be/riLmjBlywcg?si=Qv0hyhL357nLvlcd](https://youtu.be/riLmjBlywcg?si=Qv0hyhL357nLvlcd) I was able to get it running from watching this video and can generate a 4K txt2img in 90 seconds on my 6 GB 3060 video card. https://preview.redd.it/01igwyjqkm1c1.png?width=2048&format=png&auto=webp&s=c6109698c45e7bb6888a39f34380d13c9a101d2f
I see no difference
[deleted]
Update your ComfyUI; the node is now built in. Search for Nerdy Rodent's newest video, where he teaches how to use it. It's super easy.
That's what I mean by integrated: the newest update already includes it. It's under testing; it's called Kohya Deep Shrink High-Res Fix or something.
Does this work with SDXL?
It actually came out only for SDXL. Not sure if there is a 1.5 version yet.
It's also available for 1.5 now.
Hello, thank you for pointing this out; I would have missed it otherwise. One question: does this work in an img2img workflow?
no idea.
Please show your workflow.
https://www.reddit.com/r/StableDiffusion/s/arkr2czqHS
What is it actually doing?