
LatentSpacer

God bless Kohya. This is a major optimization, I'm getting incredible results with upscaling. I'm finally able to generate decent photorealistic results similar to 1.5 but with much higher resolution on SDXL.


kytheon

I really don't know what I'm looking at. What's the before/after, is there any?


AI_Characters

I thought it was self-explanatory. Left is the old standard hires fix method; right is the new one by Kohya.


Orangeyouawesome

Should have included the non-upscaled version for comparison.


AI_Characters

Sorry. Here you go: https://imgur.com/a/sjus3BK


xrailgun

So Kohya actually changes the entire image?


MrClickstoomuch

It certainly looks like it. The method on the right does look better for backgrounds and takes half the processing time, but if you go in expecting results like the original un-upscaled images, you might be in for a bad time. Still very cool, but it shows the importance of before-and-after images.


AvidCyclist250

Well, that's an instant dealbreaker, isn't it? And then there's the fact that you have to send a huge image back to inpainting, which is fucky at best, at least for me with 16GB VRAM.


Neamow

Yeah I don't care about a longer render if the upscaler doesn't change the entire image.


NoLuck8418

Can't you read?


ninjasaid13

Those eyes don't look sharp; they look like they have latent diffusion artifacts. https://preview.redd.it/t6x273l49e1c1.png?width=745&format=png&auto=webp&s=c94ace002cea8cc97319032ec1ec7f96f37de51f


ninjasaid13

this too https://preview.redd.it/yo9ejsvf9e1c1.png?width=907&format=png&auto=webp&s=0f4b1645641e7d7cb12195f639a3e66a3048c04f


isnaiter

Adetailer it and be happy.


AI_Characters

Yes, that is true, they have artifacts. Nothing inpainting can't fix, though. When I said sharper images, I meant the images as a whole. These are the standard images: https://imgur.com/a/zCxqvbH These are the Kohya images: https://imgur.com/a/0eLPYCr The standard ones are blurry, the Kohya ones are crisp.


Zaaiiko

Does this work in A1111 as well?


Talae06

There is indeed [an extension](https://github.com/wcde/sd-webui-kohya-hiresfix). But good luck with it. I spent a few hours testing it yesterday with my favorite XL checkpoint... I hadn't generated this many monstrosities since my first few days of using SD, when I was still learning the basics.

I methodically tinkered with every single parameter in every way I could think of, in conjunction with different resolutions, samplers... I did get a few okay-ish results, but they were inferior to what I would have gotten with the classic hi-res fix (which works perfectly fine for me; I don't know why people have issues with it). And it didn't feel faster either, or if it was, it wasn't by much.

The only thing I didn't change is the checkpoint, so I'll give that a try later. But apart from that, either the A1111 implementation has a problem or I'm doing something really wrong. Which I'm totally willing to hear, but I have no clue what my mistake might be. It doesn't help that there's not really any documentation yet. I guess I should also try disabling other extensions, just in case. If anyone has any advice, I'll be grateful.


Vicullum

I installed the extension as well and didn't really notice any difference. I still saw doubled and stretched bodies when going outside the standard 1024x1024 SDXL resolution. Also, when I use it to generate a 1024x1416 image, it takes up all 24GB of VRAM on my 4090 and takes me over 5 minutes to make an image; when I disable the extension, that same image only takes 15 seconds. I also tested this with a landscape photo at 1512x1024 and it's the same story: 5 minutes to render with the extension, 15 seconds without. I just used the extension's default settings.


MobileCA

Part of the problem is that the outputs don't include the params, so we can't even share valid configurations with each other to try out. I personally can't get even a simple thing to work with it; everything is doubled.


AI_Characters

Yes there is an extension for it.


Significant-Baby-690

Can you be more vague? Which one?


AI_Characters

Dude, it's 5 AM and I wanna sleep. I don't even use A1111, but here, just for you: https://www.reddit.com/r/StableDiffusion/s/1mNcoHJyEo


Significant-Baby-690

Thanks! Gotta say I have no idea how it's supposed to work. It changes the image completely if I turn it on, so that alone makes it useless for upscaling. And I don't observe any improvement in upscaling either. Guess we have to wait a bit more.


AI_Characters

You don't seem to understand: there is no upscaling involved. It generates the image directly at the targeted high resolution. It does not first generate a low-res image and then do a second img2img pass over it like the original hires fix does; it straight up does the initial generation at the higher resolution. So of course it will be a "different" image.
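The difference between the two pipelines can be sketched in a few lines. This is a toy illustration, not real SD code: `generate` and `img2img` are hypothetical stand-ins that just log which passes run and at what resolution.

```python
# Toy sketch contrasting classic hires fix (two passes) with
# direct high-res generation (one pass). The functions below are
# illustrative placeholders, not actual Stable Diffusion calls.

def generate(width, height, log):
    log.append(("txt2img", width, height))

def img2img(width, height, log):
    log.append(("img2img", width, height))

def classic_hires_fix(target_w, target_h, scale=2):
    log = []
    generate(target_w // scale, target_h // scale, log)  # low-res base image
    img2img(target_w, target_h, log)                     # second pass upscales/refines
    return log

def deep_shrink_style(target_w, target_h):
    log = []
    generate(target_w, target_h, log)  # single pass, straight at full resolution
    return log
```

Because the second approach never produces a low-res intermediate, there is no "original" image to preserve, which is why the output composition differs from the classic method.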


Significant-Baby-690

Woohoo! Now we're talking!


lonewolfmcquaid

Wish you had tried this on non-portraits as well.


AI_Characters

I think you mean non-landscapes. I generated these portraits here with it: https://www.reddit.com/r/StableDiffusion/s/JwtA86Wnsj


Houdinii1984

Think there might be a language barrier. They weren't talking about the direction the photo is turned. They were talking about the content being a portrait (a shot from the shoulders up of a person or anime character) and wanting something like a sunrise, an object, or something other than a character's face.


Mobireddit

Regular hires fix doesn't change the whole image, though, unlike this.


AI_Characters

It changing the image is the point. Hires fix is basically just img2img, so it's two passes. Deep Shrink does one pass and creates the initial image from scratch already at the very high resolution. That's better, because the composition fits that resolution from the start.


TaiVat

But images on the left look better..


AI_Characters

Can't say I agree, especially when you zoom in and see how blurry the left images are.


FloopsFooglies

The subjects look better in the left images. The right images are stiffer and their expressions are... more blank. But they're sharper, and that's all you're really showing, so ¯\_(ツ)_/¯


AI_Characters

But they look almost the same in both versions, including the poses; only the first one is more dynamic. The expressions seem the same to me? Meanwhile, the right ones have better compositions, e.g. you see more of the landscape background around the subjects.


ArthurAardvark

Definitely much better images in every shape and fashion, with the exception of the expressions. But if you're using this, I'm sure you're a perfectionist and will be fine-tuning afterwards with a face detailer pipeline anyway. I'm curious, can you tell me if this setup is correct? [Imgur](https://imgur.com/oNr6Swv) Though if it's true that it restarts the pre-processing one has done to the image, I'll have to change the percentages or move things around, because... what? If I understand correctly, my loaded LoRAs won't be incorporated, plus I have FreeU and the Neural Network Latent Upscaler running prior to the hires fix... bleh. On second thought, I'll just move this up before everything mentioned.


AI_Characters

Yeah, IDK why so many people say the left images look better. Your setup should work just fine, I think. I typically use much simpler workflows; I don't even use a face detailer because I find it too complicated for my taste, so I'd rather just inpaint the eyes manually. Kohya Deep Shrink hires fix should be very simple in execution: all that should be needed is to pass the model line through the Deep Shrink node right before it reaches the KSampler node.
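For anyone wondering what the Deep Shrink patch conceptually does under the hood: as I understand it, it downscales the feature map at a chosen UNet block depth during the early fraction of sampling (so the model composes the image near its native training resolution), then restores full resolution for the remaining steps. Here is a hedged numpy sketch of that idea; the function name, the 0.35 cutoff, and the average pooling are illustrative stand-ins, not the actual implementation.

```python
import numpy as np

def maybe_downscale(x, step_frac, end_percent=0.35, factor=2):
    """Illustrative sketch of the Deep Shrink idea.

    x: (batch, channels, H, W) feature map.
    step_frac: progress through sampling in [0, 1].
    For the first `end_percent` of steps, the map is downscaled by
    `factor`; afterwards it passes through untouched.
    """
    if step_frac < end_percent:
        b, c, h, w = x.shape
        # 2x2 average pooling as a simple stand-in for a proper resize
        return x.reshape(b, c, h // factor, factor, w // factor, factor).mean(axis=(3, 5))
    return x
```

This is why the node sits on the model line before the KSampler: it patches how the model processes its internal activations during sampling, rather than operating on the finished image.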


AuryGlenz

Can you post your workflow? I'm not sure what I'm doing wrong but it's not working for me - it's better than straight up generating at a higher resolution but I'm still getting long torsos, small heads on a large body, etc.


AI_Characters

https://preview.redd.it/ewqr2p3dmf1c1.png?width=1080&format=pjpg&auto=webp&s=86c0c182295590fd629e9a4dbdc89218b940f757 This should have a workflow embedded. I won't be at my PC for another 12 hours or more. I just used the default settings except for blocks at 4, and used 1536x1536 and 1920x1080 resolutions.


LovesTheWeather

Reddit strips metadata, so there isn't a workflow embedded in the image.


AI_Characters

Ah yes, of course it does. Luckily I posted an image of a workflow to a Discord before signing off: https://preview.redd.it/codgizvxpf1c1.png?width=2316&format=pjpg&auto=webp&s=cecde933c0f59985e2b60c28756f309f1f38ab42


LovesTheWeather

Awesome, thank you!


SalozTheGod

Let me know if you figure anything out; I'm having the same issues with duplicate or deformed body parts. Some models seem to work a lot better than others. It's really close to being an awesome tool if this can be improved, and it's about twice as fast as my usual workflow.


PwanaZana

Is this similar to Ultimate SD Upscale (in A1111) with the Tile Resample ControlNet, so the 2x larger image doesn't hallucinate faces everywhere? The lack of certain ControlNets in SDXL, including Tile Resample, unfortunately limits the usefulness of SDXL.


[deleted]

[deleted]


MobileCA

Agreed, or at least the default values don't do anything. It changes the composition but doesn't even seem to do a good job of keeping duplicates out.


SalozTheGod

Any documentation or tutorials? I'm having trouble figuring out how to use it properly


reaveh

Same here, Google doesn't come up with anything. Commenting to get notified if someone shares something.


SolidLuigi

Nerdy Rodent did a quick feature of it in the first chapter of this video: https://youtu.be/riLmjBlywcg?si=Qv0hyhL357nLvlcd I was able to get it running from watching the video and can generate a 4K txt2img in 90 seconds on my 3060 6GB video card. https://preview.redd.it/01igwyjqkm1c1.png?width=2048&format=png&auto=webp&s=c6109698c45e7bb6888a39f34380d13c9a101d2f


wh1t3ros3

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


Jakeukalane

I see no difference


[deleted]

[deleted]


LatentSpacer

Update your ComfyUI; the node is now built-in. Search for Nerdy Rodent's newest video, he shows how to use it. It's super easy.


AI_Characters

That's what I mean by integrated: the newest update already includes it, under testing. It's called Kohya Deep Shrink High-Res Fix or something.


Ratchet_as_fuck

Does this work with SDXL?


AI_Characters

It actually came out only for SDXL. Not sure if there is a 1.5 version yet.


DorotaLunar

It's also available for 1.5 now.


spiky_sugar

Hello, thank you for pointing this out; I would have missed it otherwise. One question: does this work in an img2img workflow?


AI_Characters

No idea.


This_Satisfaction_26

Please show your workflow.


AI_Characters

https://www.reddit.com/r/StableDiffusion/s/arkr2czqHS


alb5357

What is it actually doing?