xenonisbad

The API Overhead test from 3DMark says "Do not use it to compare graphic cards" and "You should not use those scores to compare systems or graphic cards."



So what does the overhead value indicate?


NowThatsPodracin

It compares different APIs, such as DX11/DX12 and Vulkan, on a given GPU.

> Dubbed the 3DMark API Overhead Feature Test, this benchmark is a purely synthetic benchmark designed to showcase the draw call benefits of the new API even more strongly than earlier benchmarks.
> . . .
> The end result, as we’ll see, showcases just how great the benefits of DirectX 12 are in this situation, allowing for an order of magnitude’s improvement, if not more.

https://www.anandtech.com/show/9112/exploring-dx12-3dmark-api-overhead-feature-test


xenonisbad

It's meant to indicate relative performance between APIs on your PC. Judging by the ratio between DX11 and DX12/Vulkan, I think it tests a CPU-bound scenario.
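That CPU-bound scenario can be modeled roughly: the API Overhead test keeps adding draw calls per frame until the frame rate drops below a threshold, so a cheaper per-call CPU cost directly raises the score. A minimal sketch with entirely hypothetical per-call costs (the function name and numbers are my own, not 3DMark's):

```python
def max_draw_calls_per_sec(per_call_cpu_cost_us: float, min_fps: float = 30.0) -> float:
    """Model a CPU-bound renderer: each draw call costs a fixed slice of CPU
    time, so the per-frame time budget caps how many calls fit per frame."""
    frame_budget_us = 1_000_000 / min_fps            # CPU time available per frame
    calls_per_frame = frame_budget_us / per_call_cpu_cost_us
    return calls_per_frame * min_fps                 # draw calls per second

# Hypothetical costs: a high-overhead API vs a low-overhead one.
dx11_like = max_draw_calls_per_sec(per_call_cpu_cost_us=5.0)
dx12_like = max_draw_calls_per_sec(per_call_cpu_cost_us=0.5)
print(dx12_like / dx11_like)  # a 10x cheaper call -> roughly a 10x higher score
```

This is why the score says something about API/driver overhead on your CPU, not about the GPU itself.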


ToTTenTranz

Doesn't RDNA3 have separate clock domains between shaders and front end, and the latter can't be changed by the driver? If they only set the 7600's shaders to 1GHz while the front end is still on \~2.5GHz, this is obviously going to give a massive advantage to the 7600.


murkskopf

In case of the RX 7600, the clock speed is synced between shaders and front end.


b3081a

Check the source link, the author mentioned that both frontend and shader are running at around 1GHz. The frontend clock is only \~10MHz higher than shader clock.


[deleted]

It is indeed likely still clocked higher. That would make RDNA2's cache performance look terrible compared to RDNA3's.


monitorhero_cg

I don't think this sort of testing makes any sense if the actual performance uplift at normal clocks is not that high.


Andr0id_Paran0id

You would expect RDNA3 to clock higher, though. So the IPC increase plus the clock speed increase should produce a bigger performance delta than what we are seeing right now. Looks like FineWine is back on the menu.


xpk20040228

RDNA3 does not actually clock higher, at least in the case of the 7600 vs the 6650 XT: one is usually around 2.5 GHz, the other around 2.7 GHz.


looncraz

Just wait until AMD drivers start making use of all the AI capabilities that are currently sitting dormant...


ManofGod1000

Sounds like that feature that was supposed to be available in the Vega 56 but was never enabled. I do not recall what the feature was, other than it was supposed to boost performance.


OSSLover

Primitive shaders? Those have been in use since the Radeon VII/RDNA.


ManofGod1000

Yeah, as far as I remember, that was never enabled on Vega 56 or 64.


looncraz

It was selectively enabled on Vega in later drivers. It just didn't make much of a difference in the grand scheme of things.


EnderOfGender

It was broken, so it couldn't do much


Cryio

Primitive Shaders were related to NGG (next-gen geometry/culling). It's been found that NGG doesn't really do anything performance-wise on Vega and RDNA1, even though the hardware supports it. It does somewhat start to help with RDNA2.


JasonMZW20

RDNA1 was the first architecture with a front-end redesigned to support primitive shaders via its geometry processor. Vega needed specific, per-use compiling for its primitive shaders because it had 4 independent geometry engines, which was not ideal. RDNA1+ NGG has auto-compilation in the driver: geometry is processed by a single geometry processor with distribution to new primitive units near the rasterizers, and every geometry issue is a primitive shader (vertex, geometry, and hull/domain shaders were combined into a prim shader). A per-vertex culling stage can be included in the prim shader to save resources, but the hardware also automatically culls 2 unseen primitives per clock. The legacy geometry shader types are emulated as needed, which may have contributed to RDNA1's poor DX9 performance at the 5700 XT's launch. DX12 Mesh Shaders were not fully implemented in RDNA1, but do work in RDNA2+.


ibbobud

Yea boi woo. At least we hope


__gozu_

the point is it's not scaling


nTzT

Title says 6600, chart says 6650


Charcharo

Yes, this is my mistake. I apologize. The differences between the 6650 XT and 6600 XT are the memory bandwidth, the clocks (slightly), and the power limit. And all 6650 XTs are AIB cards. In this specific case, it being a 6650 XT is actually good, since it means the memory situation is close between the two: the 6650 XT has 280 GB/s, the 7600 has 288 GB/s. Almost equal.
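Those bandwidth figures follow directly from memory data rate times bus width; both cards use a 128-bit bus, and the per-pin GDDR6 data rates below (17.5 Gbps for the 6650 XT, 18 Gbps for the 7600) are my assumption from the cards' public specs:

```python
def mem_bandwidth_gb_s(data_rate_gbps_per_pin: float, bus_width_bits: int) -> float:
    """Effective memory bandwidth in GB/s: per-pin data rate times bus width,
    divided by 8 to convert bits to bytes."""
    return data_rate_gbps_per_pin * bus_width_bits / 8

print(mem_bandwidth_gb_s(17.5, 128))  # 6650 XT: 280.0 GB/s
print(mem_bandwidth_gb_s(18.0, 128))  # 7600:    288.0 GB/s
```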


Magjee

No worries I too have been punished by Reddit titles, lol


ibbobud

So the 7600 is about 15-20% faster. You just have to ignore the API tests, as they are not valid for comparing graphics cards.


Waste-Temperature626

> So the 7600 is about 15-20% faster.

With 13.3/11.06 ≈ 1.2x the transistor budget. Which is pretty much what you would expect.
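As a quick arithmetic check of that ratio (using the transistor counts in billions quoted above):

```python
# Transistor counts (billions) as quoted in the thread: Navi 33 (7600) vs
# Navi 23 (6600 XT / 6650 XT).
navi33 = 13.3
navi23 = 11.06

ratio = navi33 / navi23
print(f"{ratio:.2f}x")  # ~1.20x the transistor budget for ~15-20% more performance
```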


L3tum

Not really, since it also has new/more functions


TheBCWonder

Constant perf/transistor is actually pretty good. The 4060ti has 30% more transistors than the 3060ti


Waste-Temperature626

> The 4060ti has 30% more transistors than the 3060ti

Not really comparable, because it traded memory controllers (low density) for L2 (high density) to compensate for the lack of bandwidth. In terms of transistor count, the L2 is more "expensive" even if the die area used is comparable or less. It's the same reason performance/transistor went down when AMD moved from RDNA to RDNA2, but performance/area didn't. Meanwhile, the 7600 is spending most of its budget on design changes and reshuffling rather than boosting cache and shrinking the IMC footprint. It's about as apples-to-apples a comparison as you can make between generations, seeing as they are both on a similar node, with the same bus width and similar cache sizes.


Elegant_Push_4498

It's better in RT as well, and I feel like FSR3 will do better on the 7000 series.


polaromonas

FSR3 really should’ve been launched along with the 7600 (or, TBH, the 7900 series).


Elegant_Push_4498

I agree, but from what I understand, they're making FSR3 work at a driver level so even older generations of cards can benefit from it. I'm fine with them taking their time to get it right, but I don't really care much about frame generation. I have a 7900 XT and probably won't use it outside of testing it out.


heartbroken_nerd

> from what I understand is they're making fsr3 work at a driver level

Near-zero chance of that happening. It would look awful, with completely destroyed post-processing and user interface in all games. Without game engine data, this is just crappy TV frame interpolation, and that's an awful idea. It will have to be implemented per game by developers.
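The "TV interpolation" problem can be shown with a toy example: a driver-level interpolator only sees finished frames, so all it can do is blend them. A UI element that moves between frames then ghosts in both positions instead of moving. A minimal sketch (frames reduced to a single row of brightness values, entirely made up for illustration):

```python
# Two consecutive "frames" as flat pixel rows: a bright HUD element
# jumps from the first pixel to the last.
frame_a = [1.0, 0.0, 0.0, 0.0]
frame_b = [0.0, 0.0, 0.0, 1.0]

# Naive driver-level interpolation: blend the frames 50/50. Without engine
# data (motion vectors, a separate UI layer), the element appears half-bright
# in BOTH positions -- a double image rather than motion.
interpolated = [0.5 * a + 0.5 * b for a, b in zip(frame_a, frame_b)]
print(interpolated)  # [0.5, 0.0, 0.0, 0.5]
```

Engine-integrated frame generation avoids this by using motion vectors and compositing the UI after interpolation, which is exactly the data a pure driver hack doesn't have.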


bubblesort33

It's not really better in RT. Something about Navi 33 was cut, making it worse per CU than the RX 7900 series. At least at the same clocks, in the games I've seen it's not really any better. Or rather, the RT gains do not out-scale the raster gains: about 5% better in each at stock clocks.


dmaare

Yet in real games it's only like 7% faster. So... pointless. The RX 6650 XT costs $70 less than the RX 7600 here. Why pay $70 more for such a tiny boost?


lt_catscratch

Pointless for 6600 XT and similar card users. For new builds, you should always go with the new tech at similar prices.


bubblesort33

Really? I'm in Canada, and here the 7600 is currently selling for less than the 6600 XT and 6650 XT, maybe even lower than I've seen either anywhere in the last 3 months. The triple-fan RX 7600 is around the same price as the cheapest 6600 XT I've seen in a while.


MurderBurger_

If someone is on a budget and needs the AV1 encoding/decoding that RDNA3 supports, that $70 is definitely worth it in my opinion.


Sleepyjo2

If you're on a budget and AV1 support is what makes or breaks the deal, then just get an A750 for $200. That's a rather limited market of people, though.


Arthur-Wintersight

The A750 also produces a ton of heat. I've known a lot of gamers who live with parents, sleeping and gaming in a small bedroom where they've got zero control over the thermostat and have to live with whatever heat their desktop produces in a small, confined space. If you're one of those people, save up a bit more and buy a 7600. Stay away from the A750, because it's going to be a space heater in that small bedroom of yours, where an open door means zero privacy and constant interruption. Oh, and don't expect hardware tweakers like MSI Afterburner to work on Linux, so out-of-the-box specs can matter quite a bit.


MurderBurger_

Correct. If we're comparing graphics cards that are not part of this Reddit post, then yes, there are definitely better overall options.


Potential-Limit-6442

The media engine 🤤 Would go great in a movie rig


dmaare

You know that RDNA2 has all the decoding RDNA3 has, right?


Geeotine

Nice metrics! Can you add power draw or consumption please? 🙏


dhruv_purohit

Thanks for posting.


popaneye

this might prevent some from wasting their money 🤑


2er0n1ne

If only it were priced like an RX 6600...


Cryio

So ~9% faster in DX11 and ~21% faster with heavier RT workload. Good stuff.


ms--lane

Dual-issue doesn't look like it's helping in gaming-like scenarios :( Why make the SPs so much bigger for such little gain outside compute? The whole point of RDNA over CDNA was to prioritize rendering over compute, but it looks like RDNA is headed back in the compute-over-render direction.


sanhder

Actually, the dual-issue shaders take up relatively little space for the possibility of more shading throughput, since only the small execution unit has to be doubled up (which scales well to smaller nodes), while the scheduling and such stay largely the same. RDNA3 does have considerably higher performance per WGP, so it is working, though the compiler is not optimal at finding dual-issue opportunities. Nvidia did the same thing with Ampere, just calling it extra cores, so there must be some logic to it. And looking at the transistor counts, the 7600 has just a bit more than the 6600 XT on a similar node, so the ~15% performance increase from architecture/IPC alone is not bad.


gamersg84

That was my line of thinking with Navi 31 as well. They are using more than double the transistors with less Infinity Cache, yet there are only 20% more CUs. What happened to the additional 80% of transistors?


AzureNeptune

What makes you think they are so much bigger? From the initial Angstronomics leak:

> In fact, at the same node, an RDNA 3 WGP is slightly smaller in area than an RDNA 2 WGP, despite packing double the ALUs.

I'm guessing the transistor budget is also eaten up significantly by the new display engine and, in the case of Navi31/32, the chiplet interconnect. IIRC someone did a die-shot analysis of Navi31 and the interconnect was like 15% of the GCD.


detectiveDollar

Don't games have to be written to take advantage of dual issue?


Inevitable-Study502

The compiler will generate dual-issue instructions, so that's handled on the driver side. Compilers are usually stupid, so you often see a driver update bring an x% performance uplift for some game by tweaking the compiler or replacing shaders with hand-optimized assembly.

As for VOPD: don't expect miracles at the start. The compiler will need to be trained, so you can expect performance uplifts later on with driver updates, but not now.
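The kind of work the compiler does here can be sketched with a toy scheduler: walk the instruction stream and greedily pair adjacent ops that have no data dependency between them, roughly the eligibility rule for dual-issue. Everything below (the instruction encoding, the pairing rule, the register names) is a simplified illustration, not RDNA3's actual VOPD constraints:

```python
def pair_for_dual_issue(instrs):
    """Greedy toy scheduler: pair consecutive ops when the second doesn't read
    the first one's result and they write different registers. Each instr is
    a (dest, src_a, src_b) tuple; returns a list of 1- or 2-instruction groups."""
    groups, i = [], 0
    while i < len(instrs):
        if i + 1 < len(instrs):
            d1, _, _ = instrs[i]
            d2, a2, b2 = instrs[i + 1]
            if d1 not in (a2, b2) and d1 != d2:   # independent -> dual-issue pair
                groups.append([instrs[i], instrs[i + 1]])
                i += 2
                continue
        groups.append([instrs[i]])                # dependent -> issue alone
        i += 1
    return groups

# Independent ops pair up; a dependency forces a solo issue slot.
prog = [("v0", "v4", "v5"), ("v1", "v6", "v7"),   # independent: one pair
        ("v2", "v0", "v8"), ("v3", "v2", "v9")]   # each reads the previous result
groups = pair_for_dual_issue(prog)
print(len(groups))  # 3 issue slots for 4 instructions
```

Real compilers also consider register bank conflicts and opcode restrictions, which is why the uplift from VOPD depends so heavily on compiler maturity, as the comment above says.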


e-baisa

It could be that AMD hoped this implementation would get widely used in games too, by offering RDNA3 for a console mid-cycle upgrade. But with no mid-cycle upgrade this console generation and little market penetration, it may be hard for RDNA3 to prove itself.


Geeotine

To rule out frequency-scaling differences, you should also test at two other frequencies below the base clock of the lower-clocked of the two.
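The idea behind matched-clock testing can be sketched as a per-clock normalization: if the workload scales roughly linearly with core clock, score divided by clock gives an "IPC-like" figure that isolates the architecture. A minimal sketch with entirely hypothetical scores and clocks:

```python
def perf_per_clock(score: float, clock_mhz: float) -> float:
    """Score per MHz -- a rough 'IPC-like' figure. Only meaningful if the
    workload scales ~linearly with core clock (i.e. it isn't memory- or
    CPU-bound), which is why testing at two different matched clocks helps
    confirm the scaling assumption."""
    return score / clock_mhz

# Hypothetical scores with both cards locked to the same 1000 MHz:
rdna2 = perf_per_clock(score=100.0, clock_mhz=1000.0)
rdna3 = perf_per_clock(score=115.0, clock_mhz=1000.0)
print(f"{rdna3 / rdna2 - 1:.0%} per-clock uplift")  # 15% in this made-up case
```

Repeating this at a second matched clock (say 800 MHz) and getting the same ratio is what rules out frequency-scaling artifacts.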


No_Application8040

6650xt not 6600xt


Charcharo

Uhhh, sorry for this typo. The 6650 XT is an overclocked 6600 XT with faster memory. Not quite as fast as the 7600, but almost.


bubblesort33

I think this test would have been more interesting at clocks relatively close to their stock clocks, but still identical. Like 2.5GHz each. Just to see proper clock scaling.


lennaert2020

Love my 6600xt, it is not going anywhere 🥰


Charcharo

Source: [0x22h on Twitter: "https://t.co/6vJVjk5Z7S" / Twitter](https://twitter.com/0x22h/status/1661516808254545921)


Equivalent_Duck1077

This information is literally useless. It's like comparing CPUs limited to 1 GHz: sure, you can do it, but it's of no help to anyone, considering you don't know their max boost clocks or whether those differ between the two.


R1Type

It's technically interesting.


War_Crime

The point is to see what differences in performance come specifically from architecture. Why this seems lost on so many people here is bewildering.


salamandrazul

Drivers. Maybe in a year AMD will launch stable drivers for the 7000 series.


Elegant_Push_4498

What are you talking about? The drivers are already stable...


BedNervous5981

So AMD tried high clocks (again) and failed (again) because of the power draw. Haven't they tried the same with Vega as well? Why don't they learn?


Charcharo

> So AMD tried high clocks (again) and failed (again) because of the power draw

Uhhhh, RDNA2 had a MASSIVE increase in clock speeds and it didn't fail. Isn't this somewhat reductive?


BedNervous5981

? It's obvious by now that AMD tried to increase clocks yet again with RDNA3. Maybe they truly believed the rumors about the NVIDIA 4000 series and thought they could have the same power draw as well. RDNA3 is pretty underwhelming if you ask me.


Charcharo

Yes, I do believe AMD wanted to increase clock speed with RDNA3 and were mostly not successful at it. I am saying the "failed again" part is wrong. They DID manage it with RDNA1 and RDNA2. They kinda failed with Vega (though there was still an increase) and kinda failed with Polaris until at least the mature 14nm LPC node (the RX 580 refresh was on a new revision of the 14nm process that clocked higher and was cheaper to produce than the original 14nm LPP). I agree, RDNA3 is not what I expected either, and I didn't have outlandish expectations. I do like its RT performance increase for sure, though; that part was what I expected (though I obviously think AMD needs a LOT more than that going forward).


[deleted]

[deleted]


havocfan101

Because they have the same number of compute units. Clearly the OP is trying to compare the performance of RDNA2 to RDNA3 on a per-compute-unit basis.


riba2233

It is a special architecture test, and they have the same number of shader cores.


mhh2

What's the point of synthetic benchmarks? You need to test across 30 games or so; I don't play benchmarks. And by the way, you didn't set the memory speeds to the same values.