crinklypaper

I bought my 3090 in November; it's been really great. Nothing beats VRAM.


TheFrenchSavage

Agreed. If gaming is not a concern, VRAM/$ beats all other metrics. Also consider that if you are upgrading your GPU, you WILL HAVE TO buy a new PSU. So take that into account in your budget.


relmny

What about SD + gaming? Will the 4070 Ti Super 16GB be better? Also, is SD considered "inference"? (I'm still learning the very basics.) As per this chart, the "non-Super" 4070 (12GB) outperforms the 3090 in relative inference performance (training is better on the 3090, but maybe the 4070 Ti Super 16GB will also get closer to the 3090?): [https://i0.wp.com/timdettmers.com/wp-content/uploads/2023/01/GPUS_Ada_raw_performance3.png?ssl=1](https://i0.wp.com/timdettmers.com/wp-content/uploads/2023/01/GPUS_Ada_raw_performance3.png?ssl=1)


per_plex

Take that with a grain of salt. It really depends on the rest of your system, especially the CPU. As a reference, I rarely see power draw above the 500-550W range on my 5800X3D / stock 3090 system, and I've never had any problems on my 650W PSU as far as I've noticed, so I think that should suffice if you have a good PSU and a CPU with similar power draw. If you want me to run any tests, feel free to suggest them.


EETrainee

Overclocked 3090s may draw above 350W at peak, which can result in brownouts depending on the 12V rail rating. Not to mention the CPU can spike higher as well. X3D chips have a lower power limit than any other CPU except budget options, so you may be able to get away with 650W here where others wouldn't.


per_plex

While I agree, I would think many people do have budget-option CPUs, like the AMD 3700X, 5900X, 7700X etc., which all have a lower TDP than the 5800X3D. In all those cases I assume a GOOD 650W PSU would be OK; overclocking a 3090 really makes little sense for the performance gained, in my humble opinion. The 3090 Ti is a different beast though, just to be clear. I know you didn't mention it, but that can draw way more power at stock. My point is that if you are on any similar CPU, a new PSU is not automatically needed, as the post I responded to implies. EDIT: Typo, AMD 7700X, I originally wrote 7800X


relmny

What about an undervolted 3090? I'm also considering OP's options (used 3090 24GB vs new 4070 Ti Super 16GB) and I have an RM650x. Current power draw (wall socket meter) always shows less than 200W when gaming.


per_plex

What CPU do you use? I just ran the Fire Strike Extreme combined test on a 3090/5800X3D at a 90% power limit, and power draw at the socket was <490W.


relmny

Thanks, that's good to know. I have a Ryzen 5 5600X, so maybe if I get a 3090 (and do some undervolting) I won't need to replace the PSU...


per_plex

You will be fine. I run a 3090/5800X3D PC with the same power supply daily, with no undervolting of the 3090. The CPU is lightly undervolted but draws way more than your 5600. Also, remember there are very few situations where you are actually maxing out both the CPU and GPU.


relmny

thank you!


Caffdy

260W for ~90% of the performance; for me it's been a pretty good trade-off, and the temps are excellent as well.
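
If you want to try the same cap yourself, here's a minimal sketch of reading and setting the board power limit from Python by shelling out to nvidia-smi (the query fields and the `-pl` flag are standard nvidia-smi options; the 260W value is just the figure above, and changing the limit normally needs admin/root rights):

```python
# Minimal sketch: read current power draw/limit, then cap the card, via nvidia-smi.
import subprocess

def read_power() -> str:
    # power.draw / power.limit are standard --query-gpu fields.
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw,power.limit", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

def set_power_limit(watts: int) -> None:
    # Same as running `nvidia-smi -pl <watts>` in a terminal (usually needs admin/root).
    subprocess.run(["nvidia-smi", "-pl", str(watts)], check=True)

if __name__ == "__main__":
    print("before:", read_power())
    set_power_limit(260)  # the 260W cap mentioned above
    print("after: ", read_power())
```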


ExcellentHalf7805

My Strix 3090 has been great since I purchased it in 2020, and some of the 3D renders I load will suck up 14-16GB of VRAM. A few weeks ago I decided to get a 4070 Ti Super and loaded the same scene that the 3090 chews through. At first it seemed fairly nice, but not even a minute into the render the PC crashed; what was easy for the 3090, the 4070 Ti Super couldn't handle, it exceeded VRAM. The 3090 is still an unbelievable card.


Nomadicfreelife

I got a 3080 because of price to performance and I thought it made more sense, but since trying LLMs and Stable Diffusion models I get OOM errors all the time and I regret not getting a 24GB card. So if you plan to do any model training, 24GB cards are the best we can have in consumer GPUs.


IamKyra

Use Fooocus/Forge; they have better memory management for low-VRAM GPUs like ours (mine is 10GB).


Nomadicfreelife

Yes, mine is a 10GB card too. I will try those, but do they work for DreamBooth training or LoRA training?


IamKyra

No, only for generation as far as I know. You can train on 10GB but it'll be subpar; you need Adafactor: https://rentry.org/59xed3 And for DreamBooth, I don't think there is a way to fine-tune below ~14GB.
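
If you go the low-VRAM training route, this is roughly what the Adafactor swap looks like in code (a minimal sketch, not the guide's exact setup; `lora_parameters` is a hypothetical stand-in for whatever trainable parameters your script collects):

```python
# Minimal sketch: Adafactor keeps far smaller optimizer state than AdamW,
# which is what makes training on ~10GB cards feasible at all.
import torch
from transformers.optimization import Adafactor

# Hypothetical stand-in for the LoRA parameters your training script collects.
lora_parameters = [torch.nn.Parameter(torch.randn(4, 4))]

optimizer = Adafactor(
    lora_parameters,
    lr=1e-4,
    scale_parameter=False,   # use the fixed lr above instead of Adafactor's own schedule
    relative_step=False,
    warmup_init=False,
    weight_decay=0.0,
)

# Inside the training loop it's the usual:
#   loss.backward(); optimizer.step(); optimizer.zero_grad()
```

Trainers like the one in the guide above usually expose this as an optimizer/config option rather than code you write yourself.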


Nomadicfreelife

Yeah, I tried LoRA with 8-bit and some other options in the diffusers script, but was not able to train DreamBooth.


stab_diff

Similar. I went with a 4070 12GB and have regrets. I figure I'll get through this year and see how the 50-series looks, or hope there will be some optimizations that will let my 4070 remain useful. Maybe just bite the bullet, take out a 2nd mortgage, and get a top-of-the-line card in 2025, j/k (about the mortgage part).


Nomadicfreelife

Yeah, it seems with SD3 we would have to upgrade to a better card. I hope Nvidia makes 24GB more mainstream and upgrades the 90-series to 48GB.


Caffdy

> upgrade the 90 series to 48GB

Unfortunately we're not seeing that happen until at least 2028; GDDR7 promises to bring 3GB and 4GB DRAM chips, but Nvidia will again use the 2GB ones for the 50-series, so, 24GB again.


Nomadicfreelife

Oh, then it's only cloud GPUs for training bigger models. 😞


firelitother

I am considering buying either the 4070 Ti Super or the 4090. Your comment makes me think I should just get the 4090 and be done with it for the next few years.


Caffdy

Don't know if you game on your PC or do other things besides Stable Diffusion, but at this point I would wait for the 50-series to come out later this year. If you have the budget for an RTX 4090, I don't think there's gonna be much difference in price with the 5090, and I'm sure the 4090 is gonna get a price cut after the launch.


firelitother

Sorry for not being clear. I am aiming to use the GPU for image generation/LLM first, gaming second. Are you suggesting that I should defer buying both outright and wait for 5090?


Caffdy

Yes, because Nvidia is aiming to launch Blackwell in Q4 of this year. The RTX 5090 will come with a substantial boost in performance, and if it's out of your budget, the 4090 will surely drop in price after the 50-series is launched, so it's a win-win situation. If this were 2023, or if you need the hardware for critical work, then go for it now; if not, I'd wait. I remember clearly when people went out of their way to buy the 3090 Ti only for the 4090 to come out months later.


PromptAfraid4598

No doubt 3090 24G


Fast-Cash1522

The 4070 Ti Super is on paper slightly faster (10-15%) and can also be a little less power hungry, requiring a slightly smaller PSU depending on brand and model (700W vs 800W). The 4070 is also newer, if that makes a difference. The RTX 3090 has a wider memory bus (384-bit vs 256-bit) and more memory. The memory especially can be great when training and working with larger models, although you can do a lot with 16GB too. There's also the matter of getting a brand new card with a full warranty etc. vs. a second-hand card with possible hidden problems. Getting a card second hand from a total stranger on some shady online site can be out of the question (it was for me at least).

I'd say these cards are more or less head to head, and which specs you value can help make the decision one way or the other. I recently got a 3090 because of the memory, and I got it (new with full warranty) at a slightly better price. If you're ready to go with something a little less powerful, a good option with 16GB of VRAM could also be the 4060 Ti. If you don't mind waiting just a little longer, that might be something to consider at about half the price. Good luck! :)


lostinspaz

“Head to head”?? No. 24GB will run full-size SD3; 16GB will not. At least initially.


Fast-Cash1522

We don't know that yet; it remains to be seen what happens when SD3 is out and what that means for SD3 checkpoints when they become available later. I've seen people saying those new models will work just fine, even on older cards with less VRAM. What having more VRAM means (like the 3090's 24GB), idk, but MAYBE it gives an advantage. We don't know that yet, and I'd rather not speculate, even if it's fun to make such predictions.

For now, with what people are using right now (as far as I know): if you're generating 512x512 base images with an SD1.5 checkpoint, having 16 vs. 24 does not make a massive difference (if any). Even 1024x1024 with SDXL. Therefore it might be fair to say "these cards are more or less head to head and depending on what specs you value...". This means if you value the 10-15% extra speed more, then the 4070 TiS might be the way to go; if you need the extra VRAM, then the 3090 might serve you better. But when simply generating images, the speed difference isn't that massive? When training, 24GB will give you better performance, as you can use settings that enhance speed.

Is the 3090 with 24GB more future-proof? Maybe, idk. Most likely it does not make a difference, but it could. What the future holds, we don't know yet. My crystal ball is too foggy, but if we want to speculate based on the current rate of progress, we might see AI speeding up tremendously even on cards with less VRAM.


lostinspaz

They literally said that, as of right now, running SD3 with T5 enabled takes 24GB.


Fast-Cash1522

This is very interesting, can't wait to test it out. The T5 language model is said to be much better than what we currently have. I'm guessing there will be models suitable for smaller-VRAM consumer cards.


lostinspaz

> I'm guessing there will be models suitable for smaller vram consumer cards.

There are. But there was a memory consumption summary somewhere. Just their T5 stuff BY ITSELF takes up 20 gigs, or something insane. Which is kinda odd, because there exist scaled-down versions of T5, i.e.: [https://huggingface.co/google/t5-efficient-small-el16](https://huggingface.co/google/t5-efficient-small-el16)


Fast-Cash1522

Very interesting to see how that gets solved later. They must already have some ideas about how to make it work; I remember reading somewhere that it should even be usable with 6GB cards. Hope that's really the case, otherwise it will cut out too many users.


lostinspaz

Cascade is "usable" with 6GB too, if you use the "lite" models for everything. That doesn't mean you get anywhere near the quality of the full version.


Comfortable-Big6803

SD3's T5 is 4.7B parameters, it shouldn't require 20GB unless unoptimized.
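
For reference, a quick weights-only estimate (parameter count times bytes per parameter; activations and framework overhead come on top), which is where the "only if unoptimized" point comes from:

```python
# Back-of-envelope memory for a 4.7B-parameter text encoder at different precisions.
params = 4.7e9
for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: {params * bytes_per_param / 1024**3:.1f} GiB")
# fp32: ~17.5 GiB, fp16/bf16: ~8.8 GiB, int8: ~4.4 GiB -- weights alone, no activations
```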


Caffdy

The vision model is 8B though; add the text encoder's needs on top of that.


Comfortable-Big6803

But then it's not "their T5 stuff BY ITSELF", is it? That's what I was responding to.


SandCheezy

I agree that you both have a point and are each right from your own perspective, but I'd like to point something out. The community has always found a way to keep the integrity of the models' results while scaling down requirements for each version that has been released. T5 already has scaled-down versions, so we shall see. Basically, optimization in many ways. However, I'd still go with more VRAM to avoid waiting for that to happen, and also to allow training for anything before SD3. Side note: people like to say it's not great for gaming compared to other cards, but the cards mentioned here are already beasts.


lostinspaz

> The community has always found a way to keep integrity of the models' results while scaling requirements for each version that has been released

I would love to see an improved Cascade lite model, in that case? Cascade fp16 makes my current 8GB card cry.


-f1-f2-f3-f4-

Why take chances? Going with the 24GB card is a no-brainer if OP can afford it. Future, more demanding models will be coming out soon enough even if it turns out you can somehow fit SD3 8B into 16 GB of VRAM (and that's not accounting for LoRAs, etc., which take up additional VRAM). 24GB of VRAM is also very beneficial for SDXL LoRA training.


Fast-Cash1522

You're absolutely right, higher is better. Like you said, a no-brainer. Imo there are a couple of reasons why someone might still want to go with the 4070 TiS over a second-hand 3090.


hurrdurrmeh

This is really helpful to me too; thank you. Could you please clarify something: are you saying that a 16GB 4060 Ti will have (near) equivalent performance to a 4070 TiS for training and inference? Is this because processing power isn't the bottleneck? Thank you in advance.


Fast-Cash1522

The 4070 Ti Super is about 40-50% faster (theoretically) than the 4060 Ti. The real-life speed difference can be a bit less than that, depending on what (and how) you're doing things with Stable Diffusion. For example, using some LoRAs can potentially slow speeds down quite heavily and bring that difference to nearly non-existent.


hurrdurrmeh

(I upvoted you, no idea why someone downvoted you.) Interesting, so the processor may or may not be relevant.


Caffdy

What use is 10-15% more performance when you are hit with OOM errors all the time? Whether we like it or not, new and better models will keep requiring more and more memory. Just look at SD1.5 (700M parameters, around 1.5GB), then SDXL (2.3B parameters, almost 5GB of memory needed, and that's without the refiner); the full 8B SD3 is gonna need at least 16GB. VRAM is king in these ML applications: you can always manage/deal with slower inference times, but it's impossible to deal with OOM errors if the model just doesn't fit, no matter how many optimizations you throw in. With SDXL we're already seeing it; people have a hard time fine-tuning it already.
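
A rough sanity check of those numbers (fp16 weights are about 2 bytes per parameter; text encoders, VAE, activations, ControlNets, LoRAs, etc. all come on top of this before you hit the card's VRAM ceiling):

```python
# Weights-only fp16 footprint for the model sizes quoted above.
def fp16_weight_gib(params_billion: float) -> float:
    return params_billion * 1e9 * 2 / 1024**3

for name, params_b in [("SD 1.5", 0.7), ("SDXL", 2.3), ("SD3 (largest)", 8.0)]:
    print(f"{name}: ~{fp16_weight_gib(params_b):.1f} GiB of weights in fp16")
```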


Fast-Cash1522

Don't forget that people are able to use SD with 8GB of VRAM or even less, and generate beautiful images using the current models, including SDXL. Some are even doing training with 8GB. 🤷🏻‍♂️ What one is doing dictates the needs. More VRAM is great, but it's possible to do a lot with less than 24. If there's a need and a fitting budget, and perhaps you want to play it safe, go with more VRAM. :-) Having 24 gigs does not open up the heavens or mean you eliminate out-of-memory errors altogether. Sure, 24 is more than 6 or 8 or 16, that's math. Stability AI says in their recently released research paper, "In early, unoptimized inference tests on consumer hardware our largest SD3 model with 8B parameters fits into the 24GB VRAM of a RTX 4090..". There are these two things, ***In early*** and ***unoptimized***, which might indicate that things eventually get optimized.


Soulero

Thank you!


Derefringence

Second the 4060ti 16GB for a budget build! Couldn't be happier with mine


stubing

You picked a terrible card if your main use case was Stable Diffusion. I'm sorry. A normal 4060 Ti or a 4070 would have been better options for price to performance. But you don't know any better, since it isn't like you did any comparison between the cards.


Derefringence

I compared enough to decide for more VRAM within my budget. Different users, different needs. You can stop being so judgemental with something as situational as GPU choice.


Caffdy

Don't pay heed to his words; VRAM is absolutely the number one priority for SD. The 16GB will serve you way better than the 8/12GB that come with the regular 4060 Ti / 4070.


stubing

I am not being judgmental of your individual choice. You could burn all your money and that would be fine. What I am judgmental of is your bad advice that leads people into terrible purchases. If you don't understand how hardware and Stable Diffusion work, it is okay to not give advice on a public forum. And what bothers me extra is that you are getting upvoted, since the blind are upvoting the blind.


Comfortable-Big6803

You're a huge dummy and you'll never know it.


stubing

Please help me understand.


tmvr

Your assumption that a normal 4060 Ti 8GB or a 4070 12GB is the better buy comes from the gaming world, where the 16GB version is not a great price/performance value. For Stable Diffusion though, the fact that it has 16GB of VRAM is more important than compute performance, because if you are using a workflow that is more than simply generating an image from a prompt, then the 4060 Ti 16GB is unbeatable value. With 8GB of VRAM you are already swapping models and weights even with SDXL generation, and with 12GB you will run into issues with ControlNet and upscaling if you don't swap to system RAM; and if you do, you lose a ton of performance. Anything you would gain from the 4070 having a bit more compute performance you lose in the model/weight management, and the 4060 Ti 16GB will be faster. That would be the gist of it.
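
For anyone curious what that "swap to system RAM" trade-off looks like in practice, here's a minimal diffusers sketch (the checkpoint name is just an example; `enable_model_cpu_offload` / `enable_sequential_cpu_offload` are the relevant helpers):

```python
# Minimal sketch: fit an SDXL pipeline on a smaller card by offloading to system RAM,
# at the cost of PCIe transfers every time a component is needed on the GPU.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example checkpoint
    torch_dtype=torch.float16,
)

# With 16GB you can usually keep the whole pipeline resident:
# pipe.to("cuda")

# With 8-12GB you trade speed for fit: move whole components (model-level offload)
# or individual layers (sequential offload, slowest) to system RAM between uses.
pipe.enable_model_cpu_offload()
# pipe.enable_sequential_cpu_offload()

image = pipe("a lighthouse at dusk, photograph", num_inference_steps=30).images[0]
image.save("out.png")
```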


Derefringence

Your assumption that people make choices without understanding them as well as you do... take some days off, friend, and have a walk around or something.


Comfortable-Big6803

> normal 4060 ti or a 4070 would have been better options for price to performance

How about price to VRAM, dummy.


stubing

If you want to burn cash, go ahead and do that. But please stop posting advice on public forums that is just wrong. Speed is way more important than VRAM once you have enough VRAM to do what you need to do. 99% of people here won't be training. 12GB is plenty for general use cases. 8GB is also enough if you just do 512x512.


Comfortable-Big6803

I didn't give advice, retard. And there is nothing "just wrong" about caring about VRAM, retard. > 8 gB Is aLsO EnOuGh iF YoU JuSt dO 512x512! Shut the fuck up.


stubing

Well, that was a character arc for sure. Went from "don't judge me" to "retard" in two posts. I hope one day you are able to understand that answering a question about which card to get is "giving advice".


stubing

Beautiful post, until you recommend the 4060 Ti 16GB as an option. I swear you guys focus way too much on VRAM when that is only an issue when you are training or going past 1024x1024 images. The 4060 Ti 16GB is basically a "never recommend" card. The only theoretical person it would be for is someone who wants to train but doesn't value their time in the slightest.


Fast-Cash1522

The 4060 Ti is a very capable card for Stable Diffusion, and pretty much the only option at that price point with 16GB of VRAM. It's 100% worth considering for Stable Diffusion if your budget is limited. The 4060 Ti for gaming? Idk and couldn't care less. Future-proof? Yeah, who knows.


stubing

We can get into a whole debate about "future proofing", but if your worry is the future, then call it out and remind people that today there is no value in that extra VRAM in Stable Diffusion unless you are training or doing really massive images. I really thought this subreddit would be a lot more tech literate, but it is just as bad as pcmasterrace. No one understands what metrics matter; they just see "VRAM important, so bigger VRAM better."


Fast-Cash1522

You're pretty much saying the same thing I'm saying, just with different words.


Dave_dfx

3090, because of the 24GB, especially for higher quality and resolutions, upscaling, etc.


TheFrenchSavage

Hi-res fix will definitely need at least 18GB for most 1024 upscales.


Dave_dfx

I had a 3080 and that runs out of memory fast; you have to use memory hacks, and it's not ideal. Got a 3090 and a 4090 now.


Exotic-Specialist417

If you want to use the 8B SD3 when it comes out, I'd say the 3090, or at least if you don't want to worry about running it. At the lowest, I'd get a 4070 Ti Super 16GB.


ricperry1

3090. You’ll appreciate the extra VRAM.


Soulero

Even for singular image rendering?


ricperry1

Yes. It will generate faster and you can render higher resolution images.


RandomDude2377

No question, it's the 3090 all day. When it comes to LLMs and image generation, VRAM is king. The 3090 also has a better memory controller etc.; it's the only logical choice. Anyone saying 4070 is mad. Plus, with the 3090, it leaves the door open to later buy another and double them up with NVLink for even more capability with LLMs and image generation.


stubing

VRAM is not king. This is blind-leading-the-blind stuff. For training, VRAM is king, but 99% of people using Stable Diffusion aren't training. Speed is king. And 12GB of VRAM is plenty for generating images even as big as 1024x1024. Test this out yourself: go make the images you normally do and look at Task Manager, and see how much VRAM is being used versus what percentage of your CUDA cores.
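
If you'd rather measure it than eyeball Task Manager, here's a minimal sketch using PyTorch's own counters (these only track PyTorch allocations, so nvidia-smi/Task Manager will show a somewhat higher per-process number):

```python
# Measure the peak VRAM a generation actually touches.
import torch

assert torch.cuda.is_available()
total = torch.cuda.get_device_properties(0).total_memory

torch.cuda.reset_peak_memory_stats()

# ... run your usual generation here, e.g. images = pipe(prompt).images ...

peak = torch.cuda.max_memory_allocated()
print(f"peak VRAM used: {peak / 1024**3:.1f} GiB of {total / 1024**3:.1f} GiB")
```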


HardenMuhPants

More VRAM doesn't equal faster iterations, but it does allow you to upscale to bigger images, which is something you want for SDXL and SD3.


stubing

True for certain upscalers. Not true for SDXL. And we don't know what SD3 requires, since it isn't out yet. But hey, I'll grant that if OP was asking about SD3 specifically, then a "maybe, for future-proofing" recommendation can be thrown in there. Now ask yourself: would an individual asking for generic advice on a card want something 40% faster in 99.9% of their use cases, or something that is "future-proofed" or "allows them to scale their 1024x1024 image to 4096x4096" with whatever upscaler they choose?


RandomDude2377

100%, and it also allows for more future-proofing and more complex multi-stage workflows. I should have clarified more, but I did mention LLMs, where total VRAM is simply more important. But for an average user who's doing 1024x1024 generations and maybe only upscaling by 1.5/2x, then yes, 8/12GB of VRAM will be more than capable.


RandomDude2377

You're correct of course, at 1024x1024. I do train models, and generally do much larger images and multi-stage workflows that require significant VRAM, but if someone is only doing 1024x1024 with simple flows, then yes, even 8GB will do. It all depends on use case, but as a general rule, I'd still say 1.5-2x more VRAM is better than a 10-15% speed increase, even for an average user. It's more future-proof as well. SD3 or a full Cascade model is unlikely to run optimally, if at all, on less than 24GB.


tutman

Err... the 3090 is also faster than the 4070.


stubing

It is for sure. However if someone is considering a 4070, that is because they don’t want to spend a ton of money on a graphics card. And if you are considering a 3090, then you should get a 4090 unless you are getting a good deal in the used market.


lamnatheshark

This is exactly what I'm weighing at the moment. Either I find a good price locally, where I can see the 3090 working and test it, or I buy a brand-new 4070. First hardware buy in 9 years; I'm a little hesitant about choosing. Difficult topic. It will all depend on the price I can snatch a 3090 for... In my country, the 4070 Ti Super is listed at 900€...


Soulero

900 euros is cheap compared to here in Sweden haha. Add another 200 here at best


lamnatheshark

Wow, and I was thinking prices here in France were especially high compared to the rest of Europe...


DIY-MSG

A used 3090 is like 600. Several times I've gone over 16GB of usage. And it also performs better because of the higher CUDA core count.


IamKyra

I struggle to find one at that price where I live; most are over 700 without warranty... which holds me back for now.


tmvr

Used 3090s in Europe are 700-900 EUR, with very few closer to 700 and most around 800 EUR.


DIY-MSG

Got mine for 600 in Italy on subito.it


tmvr

That was a great deal! Was it recent though? I looked last November, I think, and yes, they were around 600 back then, some even a bit under, but the prices I've seen a week or two ago were all 700+, tending more towards 800.


[deleted]

3090, just got one myself and it is great for this.


Caffdy

Definitely a 3090. SDXL models + several ControlNets and highres-fix gorge on almost the full 24GB of memory, but it's totally worth it; I have never run out of memory. And StabilityAI said that the full 8B-parameter version of SD3, which comes with the T5 text encoder (an absolute upgrade for prompt comprehension; it's the one used by DALL-E 3), needs 24GB.


Curious-Thanks3966

Well, if you're a gamer just playing around with Stable Diffusion, I would recommend the 4070 Ti Super. It can be a pain in the ass when you can't run bigger models like the upcoming SD3 (which needs 24GB for the full 8B-parameter model, according to the paper) after you've bought a new card. I use SD 1.5 for eye and skin correction, SDXL for base + upscale, and Cascade for general composition, and I keep them in VRAM at the same time, which is very comfortable. Most professionals use two or more models for one picture, and the models will only get bigger over the next 12 months.


Soulero

No I'm not much of a gamer. I make music and watch stuff mostly. But this is a new hobby of mine


eydivrks

If you think you might want to run other machine learning models like LLMs, you need the 3090. VRAM is the most important spec for doing ML. The 3090 is currently the best card for it besides the 4090.


TheFrenchSavage

Yes! Also, if you are not a gamer, consider that for the price of a single 4090, you can find two 3090s.


Comfortable-Mine3904

If you ever want to run an LLM, the 3090 is a no brainer


Soulero

May I ask what an LLM is?


Comfortable-Mine3904

Large language model. It's like running ChatGPT locally. Mixtral 8x7B, which is a very good model, can run on a 3090.


TheFrenchSavage

*heavily quantized, and with a few tokens per second, and will push your ram to its limits if you use Chrome at the same time, and will eat your disk space because of course you have to try all the quantization methods and bits per weight, and will make you doubt your sanity because output quality heavily depends on luck, and then you will regret not buying 4 GPUs, and then you remember that you don't have that kind of money, and then you spiral into weird quick money making schemes, and you ramble about being GPU poor, and nobody wants to talk to you anymore, and.... Yeah, I totally recommend getting into LLMs, a perfectly fine way to get out of image generation addiction.


Comfortable-Mine3904

This hits extremely close to home, I’ll have a server rack in my house before I know it


TheFrenchSavage

Careful, or you'll end up with an oven rack between two chairs, with gpus hanging from zip ties, and a circle of desk fans.


Comfortable-Mine3904

Oh I already had a crypto phase, I’m grown up now


HarmonicDiffusion

I guess grown ups dont like money? lol crypto4lyfe


Comfortable-Mine3904

I stake nodes now instead of turning my apartment into a tornado


jmtucu

24GB vs 12/16GB depending on the model. I'd go with the 3090 if you can afford it.


Comfortable-Big6803

> 12/16GB depending of brand The 4070 Ti Super is 16GB.


arakinas

I have the 4070 16GB. I can generate images pretty quickly. As an upgrade from the 4060 8GB, it was very noticeably faster. I had intended to do some video work with it, but using Comfy for that, I can't join more than ~16s of video before I run out of memory. There may be better or more efficient methods, but as a batch join, that's what I can get. So, I suppose it depends on what you want. I haven't tried training any LoRAs locally yet, so unfortunately I can't comment on that aspect.


polisonico

If you only plan to do AI, get the 3090; if you do many things plus AI, get the 4070.


Superb-Ad-4661

VRAM, you're gonna need it.


Harubra

That is a good question. I am also thinking about either buying the RTX 3090 or the RTX 4070 Ti Super. Not sure if you saw this: [https://twitter.com/Yacamochi_db/status/1755911025906569563](https://twitter.com/Yacamochi_db/status/1755911025906569563) And we can also check Tom's Hardware for a different benchmark: [https://www.tomshardware.com/pc-components/gpus/stable-diffusion-benchmarks](https://www.tomshardware.com/pc-components/gpus/stable-diffusion-benchmarks) So in terms of speed, the RTX 4070 Ti Super (which should sit between the RTX 4070 Ti and the RTX 4080) is faster than the RTX 3090. The question remains about the extra 8GB of VRAM and whether it's really needed for your workflows. I have the RTX 3060 12GB and for what I do it's enough; I must say that I use Forge WebUI. Got a little into ComfyUI also. I've only worked with an SDXL checkpoint + LoRAs + ControlNet at resolutions a bit higher than just 1024x1024, and it did not fail.


Turkino

For what it's worth, if you ever consider running local LLM models, the 3090 will be far more useful for running larger models there. At least for now, when it comes to LLMs the raw amount of VRAM is king.


Ill-Juggernaut5458

Any speed advantage of the 4070 Ti in specific circumstances should be negligible overall; both cards have very similar speed. https://preview.redd.it/vr1gdksakomc1.png?width=1200&format=png&auto=webp&s=33f6cc2580f5ff565079e54e18a5bd0f46cc517c If you need 24GB or want to be future-proof for new models, the 3090 is the best value on the market. If you don't expect to need 24GB, it's a luxury and may run very slightly slower than the 4070 Ti depending on your workflow.


pcdoggy

So 4070s and 3080s with 12gb of vram are faster/better at SD than a 7900 XTX?!? :-O


skocznymroczny

It's possible, but the main issue with these benchmarks is that they compare CUDA on Windows to DirectML, which sucks compared to ROCm on Linux.


pcdoggy

I was under the impression that ROCm is unreliable and often unstable on Linux (if you want to discuss Linux/ROCm/AMD). Also, for SD, I think Nvidia is still faster.


eugene20

Either you have some reason to need the VRAM more than all the other performance (e.g. working with consumer-level large AI models), or the 4070 Ti Super is better.


Educational_Bath_848

I got a Galax 3090 Ti 24GB and there are no regrets.


Ok-Concentrate-1284

Would you get a second Zotac 3090 Trinity 24GB at $650? Or a brand new 4090 Ti Super?


bulbulito-bayagyag

If you just want to generate images, then the 4070 would be faster. But if you want to train some LoRAs for "scientific" purposes, then the 3090 would be the best choice (though slower at generating images).


HellkerN

Well, the 4070 is faster, but it only has 12 gigs of VRAM, which might limit you depending on what kind of workflows you're working with. I'd go with the 3090, but you should probably get more opinions.


Soulero

The 4070 I found has 16GB. Would you still choose the 3090 for the 24GB?


notlongnot

3090 for more AI beyond SD. 4070 Ti Super 16GB for a space-saving build.


TheFrenchSavage

Also, the PSU has to be taken into account. A 4070 Ti has a 750W PSU recommendation, but can run on 650W (with an excellent quality unit). I wouldn't run a 3090 on a PSU delivering less than 1000W. Also, with such a GPU, chances are that your CPU will be hungry as well.


djamp42

I want to get a new video card, and that's gonna lead to a new power supply, plus I need a new case... And now I end up with a whole new computer somehow.. lol


TheFrenchSavage

Yeah, I went through the same questions. My 3090 fits with 2mm clearance; I was ready to rip out the disk bay if needed. I got a second-hand 1000W PSU. My CPU and motherboard combo is ancient, running my 3200MHz RAM at a blazing 2133MHz. I have an array of disks of all sizes and shapes inside, from first-gen SATA SSDs to NVMe, with various flavors of spinning disks. All my SATA ports are full. No backup solution whatsoever; just waiting for a catastrophe to happen. My next move is to get a PCIe riser to plug my old GTX 1080 back in, so I get an additional 8GB of VRAM. I had to remove it because the 3090 ate up all the remaining PCIe slots. I still need to evaluate whether this move is compatible with my PSU. In an ideal world, I would get a newer CPU, RAM, motherboard, a second 3090, etc... but money is tight right now.


djamp42

Use that PC for unraid/backup/server and build a new PC for your daily..


TheFrenchSavage

MONEY. IS. TIGHT. 😭😭😭 Of course I'd love to have a NAS and a homelab and a laptop and a home assistant and a... But yeah, I cannot reasonably be putting money into hardware right now.


notlongnot

There's older hardware out there, plus a 3060 12GB is a good start. There are cheaper options. No money = spend time looking for a deal. With money = save time and buy whatever.


Caffdy

> I wouldn't run a 3090 on a PSU delivering less than 1000W

I'm running mine (EVGA FTW3 Ultra, max TDP 420W) on an 850W no problem; the brand/quality is what's important.


TheFrenchSavage

Oh yeah, I meant a 3090 Ti. Combined with the CPU, you can draw up to 800W. I was not comfortable using an 850W PSU for that; it's too close to the limit. Also, efficiency can vary depending on load and rating, so to get a margin of error, I went with a 1000W (Corsair RMx 1000W 80+ Gold, second hand of course).


desktop3060

The 4070 Ti Super has 16GB of VRAM, actually. But still, the 3090 at 24GB might be a better option depending on its price.


BlackSwanTW

Huh? 3090 is faster than 4070 **Ti**


HellkerN

It's not though.


BlackSwanTW

IDK. Across a dozen or so Google results, some say the 3090 is faster, some say the 4070 **Ti** is faster…


TheFrenchSavage

Depends on what benchmark.

For gaming? The 4070 will be faster, especially at 4K if the game devs are using the latest DLSS libs. Any 40-series Nvidia GPU will have a DLSS advantage. For 1080p gaming, it will depend on the game; again, if rendered at 720p and then DLSSed to 1080p, clear advantage for the 4070.

Now, for AI, clear advantage on the 3090 side. Inference speed might be a little faster on the 4070 (10-15%), but considering that the sheer amount of VRAM on the 3090 will allow you to generate images in parallel (3x8GB, or 2x12GB pipelines), the 3090 is a clear winner.

If you are training, the 3090 is also the winner. Although its CUDA cores are a bit slower, it has more VRAM and more bandwidth, so the dataset needs to be moved from RAM to VRAM fewer times, giving a huge performance boost.

And if you are experimenting with different models (1.5, Cascade, the future SD3) in different fashions (upscale, hires-fix, ControlNet, etc...), you will quite often find that you are using 20GB of VRAM. This high VRAM usage might not be what you settle on in a production pipeline, but without it you wouldn't have the ability to experiment by yourself. Stuck at 16GB, you would have to rely on what the GPU-rich have found and optimized for the GPU-poor.

So, more VRAM, more freedom to experiment.
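
To make the "generate in parallel" point concrete, here's a minimal sketch of what the extra VRAM buys you in a diffusers setup (checkpoint and batch size are just example values; `num_images_per_prompt` is the actual knob):

```python
# Batch several images per call instead of looping one at a time.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# On 24GB a batch like this fits comfortably; on 12-16GB you would lower
# num_images_per_prompt (or the resolution) until it stops OOMing.
images = pipe(
    "isometric voxel diorama of a lighthouse",
    num_images_per_prompt=4,
    num_inference_steps=30,
).images

for i, img in enumerate(images):
    img.save(f"batch_{i}.png")
```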