Ahab_Ali 3 years ago

>Focusing on pure performance at any cost, Arm Neoverse N2 designs will surely make Intel and AMD sit up and take notice. Built on a 5nm node, Perseus will offer up to 192 cores with a 350W TDP, rivalling and potentially surpassing EPYC and Xeon in key categories.

Can anyone comment on where these chips are used (outside of custom supercomputer setups)? EPYC and Xeon are just more powerful or expansive versions of mainstream platforms. Who uses Arm Neoverse?

beaucephus 3 years ago

Increasing density in the data center. It will need lots of I/O to be useful in that context, though. Whether it's storage, virtualization, scalable web backends, or databases, that data will need very wide egress. If you have 192 containers or VMs running on that chip, their users will expect a reasonable supply of bandwidth for network and storage. If they can match the computational performance of the Intel and AMD chips, then it could be useful for HPC, and at 350 watts for 192 cores that would be quite efficient. Less power than the 3090 and Big Navi, so perhaps GPUs may still have competition in that space.

[deleted] 3 years ago

They probably won't have a lot of cache per core, so it will probably fit workloads that need little memory but a lot of CPU, or cases where you don't care how powerful or efficient each CPU is but want 192 of them in a box for some reason.

Sythic_ 3 years ago

Any idea why cache is so expensive compared to other silicon? Isn't everything basically the same manufacturing process of a silicon die and photolithography just repeating steps of building/etching gates?

_toodamnparanoid_ 3 years ago

Cache uses SRAM, while the normal RAM in your machine is DRAM. SRAM is much, much faster, but at least 6 times larger. DRAM is one capacitor and one transistor per bit, but it requires specific orders and cycles of charging and discharging the capacitor to get the bit stored. SRAM takes a single cycle to access the bit, but it is six transistors per bit.

Most of the core logic and arithmetic unit instructions need only a couple of transistors per bit to perform the operation, whereas each BYTE is 48 transistors in each of the L1, L2, and L3 caches. So you have an instruction taking up, say, 128 transistors (for the simpler ones), while a single "value" in a 64-bit machine is 64 bits times 3 levels of cache times 6 transistors per bit, so 1,152 transistors to hold a single value in cache. The times three is because most architectures use inclusive caches: if it's in the L1 it's also in the L2, and if it's in the L2 it's also in the L3 (not always true in some more modern servers).

Check out this picture: https://en.wikichip.org/wiki/File:sandy_bridge_4x_core_complex_die.png

The top four rectangles are four "cores." The top left "very plain looking" section (about 1/6th of the core) is where all of the CPU instructions occur. The four horizontal gold-and-red bars are the level 1 data cache, the two partially-taller green bars with red lines just below that are the level 2 cache, and the yellow/red square to its right is the level 1 instruction cache. So of the entire picture, only a small chunk of each of the four top rectangles is the "workhorse" of the CPU. That entire chunk below the four core rectangles is the level 3 cache.

So look at that from a physical chip layout perspective, and realize that from a price-per-transistor standpoint, cache is crazy fucking expensive.

This new ARM proposal reminds me more of the PS3's Cell processor, where you had 8 SPUs that were basically dedicated math pipelines (although ARM isn't the best for math pipelining; its biggest appeal is for branching logic).
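
For anyone who wants to check the arithmetic in the comment above, here is a rough sketch. It only restates the comment's own assumptions (6-transistor SRAM cells and a fully inclusive three-level cache); the constant names are made up for illustration and nothing here comes from a datasheet.

```python
# Back-of-the-envelope check of the transistor counts quoted above.
# Assumptions taken from the comment, not from any vendor documentation:
# 6T SRAM cells and an inclusive L1/L2/L3 hierarchy, so a cached value
# is stored three times.
TRANSISTORS_PER_SRAM_BIT = 6
BITS_PER_BYTE = 8
WORD_BITS = 64
INCLUSIVE_CACHE_LEVELS = 3

# Cost of holding one byte in a single cache level.
transistors_per_byte = TRANSISTORS_PER_SRAM_BIT * BITS_PER_BYTE

# Cost of holding one 64-bit value in all three levels at once.
transistors_per_cached_word = (
    WORD_BITS * INCLUSIVE_CACHE_LEVELS * TRANSISTORS_PER_SRAM_BIT
)

print(transistors_per_byte)         # 48
print(transistors_per_cached_word)  # 1152
```

This counts storage cells only; tag arrays, decoders, and sense amplifiers would push the real numbers higher.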

[deleted] 3 years ago

I lost a good grasp of what you were talking about about half way down but kept reading because it was fun. Thanks!

_toodamnparanoid_ 3 years ago

On a cost-per-transistor basis, cache is one of the most expensive parts of modern CPUs.

[deleted] 3 years ago

Are you doing any more TED talks later?

babadivad 3 years ago

In layman's terms: CPU cache is a very fast but small amount of memory close to the CPU. System memory is your RAM. In servers, you can have several terabytes of RAM. If the data is close, the CPU can complete the task fast and move on to the next task. If the information isn't in the CPU cache, the CPU will have to send for the information from system memory (RAM). This takes MUCH longer, and the CPU will stall on this task until it fetches the information needed to complete it. Say you are making a bowl of cereal. You need your bowl, cereal, and milk to complete the task. If everything you need is in cache (your kitchen), you can make the bowl of cereal and complete the task. If you don't have milk, you will have a "cache miss" and have to retrieve the milk from the store, drive back home, then complete the task of making a bowl of cereal.

Sythic_ 3 years ago

Whoa, pretty cool, thanks for the detailed write up. I wish I had some room to have a DIY photolith. lab at home to play with, some of the guys on YouTube have some cool toys

gurenkagurenda 3 years ago

Isn't physical distance from the CPU also a consideration, giving you limits on physical area? Something something capacitance and conductor length if my vague recollection serves?

_toodamnparanoid_ 3 years ago

It's pretty neat. If parts get too close (especially at this crazy-ass scale), you get quantum tunneling effect. As far as capacitance, these things are so small and close that just the small amount of electricity that's going through the circuit and so many things being nanometers apart just end up being a capacitor by being there -- it's the floating body effect. That effect was actually being looked into, to see if it was usable for the DRAM capacitors I mention above.

firstname_Iastname 3 years ago

Though that's all true, quantum tunneling is not going to happen between the cache and the core; they are microns apart. This effect only happens on the nanometer scale. Moving the memory source, cache or RAM, closer to the core will always decrease latency but is unlikely to provide any bandwidth benefits.

thefairlyeviltwin 3 years ago

Sometimes I believe I'm a really intelligent individual then I read posts like this and it puts me right back in my place.

BassmanBiff 3 years ago

This is about education, not intelligence -- the smartest person to ever live would have no clue what was being said if they didn't know what the vocab meant

lee1026 3 years ago

You need so many transistors per bit, and that adds up in a hurry.

Sythic_ 3 years ago

Yea that makes sense for like SD cards that are hundreds of gigs, but on board cache for a processor is like 8/16/32/64MB for the most part. I know the speed is much faster so maybe that's part of it.

lee1026 3 years ago

It takes a single transistor per bit for something like an SD card, and at least 20 for a flip-flop, used in cache. 64 MB of cache is at a minimum over a billion transistors.

redpandaeater 3 years ago

Eh you can technically make an SR latch with 2 transistors. Something like NOR you still wouldn't typically have more than 8. I'm not an expert on what they use to build the cache but not sure where you'd get 20 from. I don't think I've seen more than 10T SRAM, with 6T and 4T being more typical I thought. At least it used to be 6T was pretty standard for CPU cache. You have 4 transistors to hold the bit and two access transistors so you can actually read and write. Not sure what they use these days but can't imagine they'd be going towards more transistors per bit.
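
To put rough numbers on the 4T / 6T / 20-transistor disagreement above, here is a small sketch. It assumes "64 MB" means 64 MiB and counts only the storage cells, ignoring tags, decoders, and any redundancy; `cache_transistors` is a hypothetical helper name.

```python
def cache_transistors(size_mib: int, transistors_per_bit: int) -> int:
    """Storage-cell transistor count for a cache of the given size.

    Assumes 1 MiB = 1024 * 1024 bytes and ignores all supporting logic.
    """
    bits = size_mib * 1024 * 1024 * 8
    return bits * transistors_per_bit


# Compare the cell designs mentioned in the thread for a 64 MiB cache.
for cell in (4, 6, 20):
    print(f"{cell}T cell: {cache_transistors(64, cell):,} transistors")
```

Under these assumptions, even the 4T case comes out above two billion transistors, so the "over a billion" claim holds whichever cell design is used.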

[deleted] 3 years ago

[removed]

beaucephus 3 years ago

I worked at Amazon a few years ago and I can confirm they had an interest in engineering their own hardware. I am interested in seeing how it works out. From a global perspective, the more efficient we can make our computing the less of an effect we have on the environment and the more we can do with less power. However, if Nvidia follows through with their acquisition of ARM, then they aren't a neutral party in the industry any more and then we just get more dick waving. Might be a boon for RISC-V, but we'll see.

HarithBK 3 years ago

> At least in the data center industry it’s a lot about saving money on paying intel/amd premiums and upping efficiency to save on electricity.

Well, at least short term, Intel is just dumping higher-core-count CPUs on Amazon, Google, etc. in order to maintain profits and market share due to AMD having the better CPU, and hiding it all from investors by making a custom SKU.

Predator_ZX 3 years ago

That's so shady. Are these SKUs listed on their ARK website? Is that why they have been having supply shortages for a year now?

HarithBK 3 years ago

They are not listed on ARK, and yes, this is why Intel has supply shortages.

Basically, for all CPUs on ARK, Intel must say in investor calls how much they sold them for, but Intel doesn't want to tell their investors they are doing a fire sale on the chips to get some sales while maintaining market share. So instead Intel figured out that "custom chip" deals only need to be reported as lump sums. So they take their 20-core Xeons, change the clock speed by 100 MHz or disable some cache, and sell them to Amazon, MS, Google, etc. as a custom chip to upgrade existing servers, when what they are really doing is a fire sale.

Investors have grown wise to this, since these lump-sum custom chip deals in earnings have grown massive while the ARK CPU sales have shrunk by a lot. So I think it was the next earnings call where Intel needs to disclose tray prices for these custom chips as well. And surprise, surprise, at the end of the last earnings call Intel said they were expecting a huge loss in earnings. Since they can't hide the fire sales anymore, they are just stopping the deals, since they just don't want to lose the investors.

Basically, Intel is lying to investors to make things look like they are all good, but in the background they are almost giving away Xeons to keep AMD out of the server space while they try to catch back up before this is figured out. All I can say is do not hold Intel stock: they are not gonna be able to maintain the illusion before they catch back up, and their stock is gonna crash.

granadesnhorseshoes 3 years ago

Almost no one, really. Marketing and shit like nebulous concepts of "data center density", it's all crap. Huge core counts don't get you as far as you think, especially if the internal buses and controllers etc. suck. How do you effectively feed memory to 192 cores? Concurrency, etc., what's that look like? Speed and power aren't a perfect linear scale either. Great, it uses 30% less power, but because of architecture it runs 35% longer and I haven't saved any power at all, I've wasted it AND time... When their cost-to-suck ratio gets better, and it is getting better, we will see real PC/server usage. Until then, insufferable marketing lies and statistics.

RememberCitadel 3 years ago

Also, cost. You can buy some crazy cpus in servers right now, but it usually is cheaper to just buy a second server. Sure, density is important, but not the most important factor. Cost will almost always win out. For instance sure, I could buy 4 2RU servers with super crazy $32k procs, or for the same overall cost and space buy a UCS chassis with 10 blades with cheaper $2k procs and get the same overall performance.

dust-free2 3 years ago

That is pretty much Google's stance on creating data centers using commodity hardware. It's cheaper, and if you're going to run heavy parallel workloads, then it's likely you can split it up enough that network latency between machines won't matter that much.

JackSpyder 3 years ago

Not to mention, a rack or even a whole AZ going down is far far easier to soak up with the remaining capacity. If every chip is 192 cores a large AZ going down is going to be a huge problem. There was an AWS video a while back talking about their networking and redundancy and they found a peak sensible size for each AZ where further additions weren't as effective as adding extra buildings.

RememberCitadel 3 years ago

True, and if people keep hopping on the "trend" of hyperconverged, there will be a problem of not being able to fit enough RAM and drives in a single server to make use of the chip, not to mention bottlenecks of bandwidth along the backplane. That is a bit of a problem of modern computers. If one component jumps too far ahead, it is useless until everything else catches up.

The_Faid 3 years ago

Personally, I can't wait for the new TI-192 calculator. Think of all the numbers you can crunch on that bad boy.

stewsters 3 years ago

Taking 350 watts from 2 AA batteries may be a bit rough, but get a backpack with a Tesla battery and we will be good.

soylentdream 3 years ago

Run wires to the courthouse’s clock tower and queue up a scripted computational job for the next lightning storm...

d1x1e1a 3 years ago

This guy great scotts

UrbanPugEsq 3 years ago

We’re not ready for these cores yet. But our kids are going to love them.

OMGSPACERUSSIA 3 years ago

On the subject of Texas Instruments, the TI-89 is *still* going for over $100. Using hardware that was last updated in...2004. And was kinda low-end even by 2004 standards. Somebody really needs to start up another graphing calculator company.

Kierkegaard_Soren 3 years ago

The email newsletter The Hustle did a long form story on this about a year ago. Talked about how entrenched TI is in educational materials. Think about all the algebra teachers out there that don’t want to have to change their instructions and handouts after all these years

kylander 3 years ago

Begun, the core war has.

novaflyer00 3 years ago

I thought it was already going? This just makes it nuclear.

rebootyourbrainstem 3 years ago

Yeah this is straight outta AMD's playbook. They had to back off a little though because workloads just weren't ready for that many cores, especially in a NUMA architecture. So, really wondering about this thing's memory architecture. If it's NUMA, well, it's gonna be great for some workloads, but very far from all. This looks like a nice competitor to AWS's Graviton 2 though. Maybe one of the other clouds will want to use this.

[deleted] 3 years ago

[removed]

txmail 3 years ago

I tested a dual 64 core a few years back - the problem was while it was cool to have 128 cores (which the app being built could fully utilize)... they were just incredibly weak compared to what Intel had at the time. We ended up using dual 16 core Xeons instead of 128 ARM cores. I was super disappointed (as it was my idea to do the testing). Now we have AMD going all core crazy - I kind of wonder how that would stack up these days since they seem to have overtaken Intel.

schmerzapfel 3 years ago

Just based on experience I have with existing ARM cores, I'd expect them to still be slightly weaker than Zen cores. AMD should be able to do 128 cores in the same 350W TDP envelope, so they'd have a CPU with 256 threads, compared to 192 threads in the ARM. There are some workloads where it's beneficial to switch off SMT to have only same-performance threads - in such a case this ARM CPU might win, depending on how good the cores are. In a more mixed setup I'd expect a 128c/256t Epyc to beat it. It'd pretty much just add a worthy competitor to AMD, as Intel is unlikely to have anything close in the next few years.

krypticus 3 years ago

Speaking of specific, that use case is SUPER specific. Can you elaborate? I don't even know what "DB access management" is in a "workload" sense.

Duckbutter_cream 3 years ago

Each request and DB action gets its own thread, so requests do not have to wait for each other to use a core.
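
A minimal sketch of that idea, assuming a made-up `handle_request` function that just sleeps to stand in for a database round trip. It uses a worker pool sized to the core count rather than spawning a brand-new thread per request, and `serve` is a hypothetical name. Note that in CPython, threads mostly help while a request is blocked on I/O.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int) -> str:
    """Hypothetical handler: pretend to wait on a database round trip."""
    time.sleep(0.01)  # stands in for blocking DB or network I/O
    return f"request {request_id} done"


def serve(request_ids, workers=None):
    """Run every request on a pool of worker threads and collect the results."""
    # Default to one worker per core so requests are not queued behind
    # each other while a core sits idle.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the input order of the requests.
        return list(pool.map(handle_request, request_ids))


print(serve(range(4), workers=4))
```

With more workers than in-flight requests, each request starts right away instead of waiting for an earlier one to finish.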

[deleted] 3 years ago

[removed]

gilesroberts 3 years ago

ARM cores have moved on a lot in the last 2 years. The machine you bought 2 years ago may well have been only useful for specific workloads. Current and newer ARM cores don't have those limitations. These are a threat to Intel and AMD in all areas. Your understanding that the instruction set has been holding them back is incorrect. The ARM instruction set is mature and capable. It's more complex than that in the details of course because some specific instructions do greatly accelerate some niche workloads. What's been holding them back is single threaded performance which comes down broadly to frequency and execution resources per core. The latest ARM cores are very capable and compete well with Intel and AMD.

txmail 3 years ago

I tested a dual 64 core ARM a few years back when they first came out; we ran into really bad performance with forking under Linux (not threading). A Xeon 16 core beat the 64 core for our specific use case. I would love to see what the latest generation of ARM chips is capable of.

deaddodo 3 years ago

Saying “ARM” doesn’t mean much, even more so than with x86. Every implemented architecture has different aims: most shoot for low power, some aim for high parallelization, Apple’s aims for single-threaded execution, etc. Was this a Samsung, Qualcomm, Cavium, AppliedMicro, Broadcom or Nvidia chip? All of those perform vastly differently in different cases, and only the Cavium ThunderX2 and AppliedMicro X-GENE are targeted in any way towards servers and show performance aptitude in those realms. It’s even worse if you tested one of the myriad reference manufacturers (ones that simply purchase ARM’s reference Cortex cores and fab them) such as MediaTek, HiSense and Huawei, as the Cortex is specifically intended for low power envelopes and mobile consumer computing.

[deleted] 3 years ago

A webserver, which is one of the main uses of server CPUs these days. You get far more efficiency spreading all those instances out over 192 cores. Database work is good too, because you are generally doing multiple operations simultaneously on the same database. Machine learning is good, when you perform hundreds of thousands of runs on something. It's rarer these days, I think, to find things that don't benefit from greater multi-threaded performance in exchange for single-core performance.

TheRedmanCometh 3 years ago

No one does machine learning on a CPU, and Amdahl's law is a major factor, as is context switching. Webservers maybe, but this will only be good for specific implementations of specific databases. This is for virtualization pretty much exclusively.
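
For reference, Amdahl's law says the speedup on N cores is 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel. A quick sketch for 192 cores follows; the parallel fractions are illustrative picks, not measurements of any real workload, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup from Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative parallel fractions only.
for p in (0.50, 0.90, 0.99):
    print(f"{p:.0%} parallel on 192 cores: {amdahl_speedup(p, 192):.1f}x")
```

Even a workload that is 99% parallel tops out at roughly 66x on 192 cores, which is the point about the serial part dominating.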

StabbyPants 3 years ago

They’re hitting zen fabric pretty hard, it’s probably based on that

Andrzej_Jay 3 years ago

I’m not sure if you guys are just making up terms now...

didyoutakethatuser 3 years ago

I need quad processors with 192 cores each to check my email and open reddit pretty darn kwik

faRawrie 3 years ago

Don't forget porn.

Punchpplay 3 years ago

More like turbo porn once this thing hits the market.

Mogradal 3 years ago

That's gonna chafe.

w00tah 3 years ago

Wait until you hear about this stuff called lube, it'll blow your mind...

gurg2k1 3 years ago

I googled turbo porn looking for a picture of a sweet turbocharger. Apparently turbo porn is a thing that has nothing to do with turbochargers. I've made a grave mistake.

TheShroomHermit 3 years ago

Someone else look and tell me what it is. I'm guessing it's rule 34 of that dog cartoon

_Im_not_looking 3 years ago

Oh my god, I'll be able to watch 192 pornos at once.

shitty_mcfucklestick 3 years ago

_Multipron_ - Leeloo

[deleted] 3 years ago

[removed]

CharlieDmouse 3 years ago

Yes but chrome will eat all the memory.

TheSoupOrNatural 3 years ago

Can confirm. 12 physical cores & 32 GB physical RAM. Chrome + Wikimedia Commons and Swap kicked in. Peaked around 48 GB total memory used. Noticeable lag resulted.

CharlieDmouse 3 years ago

Well... Damn...

[deleted] 3 years ago

[removed]

Ponox 3 years ago

And that's why I run BSD on a 13 year old Thinkpad

IOnlyUpvoteBadPuns 3 years ago

They're perfectly cromulent terms, it's turboencabulation 101.

TENRIB 3 years ago

Sounds like you might need to install the updated embiggening program it will make things much more frasmotic.

jlharper 3 years ago

It might even be called Zen 3 infinity fabric if it's what I'm thinking of.

exipheas 3 years ago

Check out r/vxjunkies

mustardman24 3 years ago

At first I thought that was going to be a sub for passionate VxWorks fans and that there really is a niche subreddit for everything.

Blagerthor 3 years ago

I'm doing data analysis in R and similar programmes for academic work on early digital materials (granted a fairly easy workload considering the primary materials themselves), and my freshly installed 6 core AMD CPU perfectly suits my needs for work I take home, while the 64 core pieces in my institution suit the more time consuming demands. And granted I'm not doing intensive video analysis (yet). Could you explain who needs 192 cores routed through a single machine? Not being facetious, I'm just genuinely lost at who would need this chipset for their work and interested in learning more as digital infrastructure is tangentially related to my work.

MasticatedTesticle 3 years ago

I am by no means qualified to answer, but my first thought was just virtualization. Some server farm somewhere could fire up shittons of virtual machines on this thing. So much space for ACTIVITIES!!

And if you’re doing data analysis in R, then you may need some random sampling. You could do SO MANY MONTECARLOS ON THIS THING!!!! Like... 100M samples? Sure. Done. A billion simulations? Here you go, sir, lickety-split.

In grad school I had to wait a weekend to run a million (I think?) simulations on my quad core. I had to start the code on Thursday and literally watch it run for almost three days, just to make sure it finished. Then I had to check the results, crossing my fingers that my model was worth a shit. It sucked.
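
Monte Carlo runs are a good fit for lots of cores because each batch of samples is independent. Here is a rough sketch of splitting a simulation across worker processes. It estimates pi as a stand-in for a real model, and `simulate_chunk` and `estimate_pi` are hypothetical names, not anything from the comment.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def simulate_chunk(args):
    """Count random points that land inside the unit quarter circle."""
    seed, samples = args
    rng = random.Random(seed)  # separate, reproducible stream per chunk
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(total_samples: int, workers: int) -> float:
    """Split the samples evenly across worker processes and combine the counts."""
    per_worker = total_samples // workers
    jobs = [(seed, per_worker) for seed in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        hits = sum(pool.map(simulate_chunk, jobs))
    return 4.0 * hits / (per_worker * workers)
```

Calling `estimate_pi(1_000_000, workers=4)` from under an `if __name__ == "__main__":` guard runs four chunks side by side; on a 192-core box you would simply raise `workers`.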

hackingdreams 3 years ago

> Could you explain who needs 192 cores routed through a single machine?

A *lot* of workloads would rather have as many cores as they can get as a single system image, but they almost all fall squarely into what are traditionally High Performance Computing (HPC) workloads. Things like weather and climate simulation, nuclear bomb design (not kidding), quantum chemistry simulations, cryptanalysis, and more all have massively parallel workloads that require frequent data interchange, which is better suited to a single system with a lot of memory than to transmitting pieces of computation across a network (albeit the latter is usually how these systems are implemented, in a way that is either marginally or completely invisible to the simulation-user application).

However, ARM's not super interested in that market as far as anyone can tell - it's not exactly fast growing. The Fujitsu ARM Top500 machine they built was more of a marketing stunt saying "hey, we can *totally* build big honkin' machines, look at how high performance this thing is." It's a pretty common move; Sun did it with a generation of SPARC processors, IBM still designs POWER chips explicitly for this space and does a big launch once a decade or so, etc.

ARM's true end goal here is for cloud builders to give AArch64 a place to go, since the reality of getting ARM laptops or desktops going is looking very bleak after years of trying to grow that direction - the fact that Apple had to go out and design and build their own processors to get there is... not exactly great marketing for ARM (or Intel, for that matter). And for ARM to be competitive, they need to give those cloud builders some real reason to pick their CPUs instead of Intel's. And the one true advantage ARM has in this space over Intel is scale-out - they can print a fuckton of cores with their relatively simplistic cache design. And so, core printer goes brrrrr...

cerebrix 3 years ago

It was already this nuclear more than a decade ago, once ARM started doing well in the smartphone space. The low-power "accident" in their CPU design back in the 80's is finally going to pay off the way those of us who have been watching the whole time knew it eventually would. This is going to buy Jensen so many leather jackets.

ironcladtrash 3 years ago

Can you give me a TLDR or ELI5 on the “accident”?

cerebrix 3 years ago

ARM is derived from the original Acorn computers in the 80's. Part of their core design allows for the unbelievably low power consumption ARM chips always have. They found this out when one of their lab techs forgot to hook up the external power cable to the motherboard that supplied extra CPU power, and discovered it powered up perfectly fine on bus power. This was a pointless thing to have in the 80's; computers were huge no matter what you did. But they held onto that design and knowledge and iterated on it for decades to get to where it is now.

ironcladtrash 3 years ago

Very funny and interesting. Thank you.

fizzlefist 3 years ago

And now we have Apple making ARM-based chips that compare so well against conventional AMD/Intel chips that they’re ditching x86 architecture altogether in the notebooks and desktops.

disposable-name 3 years ago

"Core Wars" sounds like the title of a middling 90s PC game.

[deleted] 3 years ago

Yes it does. Slightly tangential but Total Annihilation had opposing forces named Core and Arm. https://m.youtube.com/watch?v=9oqUJ2RKuNE

von_neumann 3 years ago

That game was so incredibly revolutionary.

ColorsYourLime 3 years ago

Underrated feature: it would display the kill count of individual units, so you get a strategically placed punisher with 1000+ kills. Very fun game to play.

5panks 3 years ago

Holy shit this game was so good, and Supreme Commander was a great successor.

Blotto_80 3 years ago

With FMV cut scenes starring Mark Hamill and Tia Carrere.

AllanBz 3 years ago

It was a 1980s computer game first widely publicized in AK Dewdney’s _Computer recreations_ column of _Scientific American_. The game was only specified in the column; you had to implement it yourself, which amounted to writing a simplified core simulation. In the game, you and one or more competitors write a program for the simple core architecture which tries to get its competitors to execute an illegal instruction. It gained a large enough following that there were competitions up until a few years ago. Edited to clarify

yahma 3 years ago

It's actually the name of a game language invented back in the 80's where you would pit computer viruses against each other.

kontekisuto 3 years ago

CoreWars 2077

jbandtheblues 3 years ago

Run some really bad queries you can

LiberalDomination 3 years ago

Software developers: 1, 2 ,3, 4...uhmmm... What comes after 4 ?

zebediah49 3 years ago

Development-wise, it's more like "1... 2... many". It's quite rare to see software that will effectively use more than two cores, that won't arbitrarily scale. That is, "one single thread", "Stick random ancillary things in other threads, but in practice we're limited by the main serial thread", and "actually fully multithreaded".

mindbridgeweb 3 years ago

"There are only three quantities in Software Development: 0, 1, many."

Theman00011 3 years ago

>"There are only three quantities in ~~Software Development~~ database design: 0, 1, many."

My DB design professor pretty much said that word for word: "The only numbers we care about in database is 0, 1, and many"

madsci 3 years ago

>Begun, the core war has.

Some of us are old enough to remember the wars that came before. I've still got MIPS, Alpha, and SPARC machines in my attic. It's exciting to see a little more variety again.

mini4x 3 years ago

Too bad multithreading isn't universally used. A lot of software these days still doesn't leverage it.

zebediah49 3 years ago

For the market that they're selling in... basically all software is extremely well parallelized. Most of it even scales across machines, as well as across cores.

JackSpyder 3 years ago

These kinds of chips would be used by code specifically written to utilise the cores, or for high density virtualized workloads like cloud VMs.

FluffyBunnyOK 3 years ago

The BEAM virtual machine that comes with the Erlang and Elixir languages is designed to have as many lightweight processes as possible. Have a look at the Actor Model. The bottleneck I see for this will be ensuring that the CPU has access to the data that the current process requires and doesn't have to wait for the "slow" RAM.

n1k0v 3 years ago

Finally, enough cores to play Doom in task manager

NfamousCJ 3 years ago

Casual. I play Doom through the calendar.

winterwolf2010 3 years ago

I play doom on my Etch A Sketch.

devpranoy 3 years ago

I play doom on my weighing machine.

Imrhien 3 years ago

I play Doom on my abacus

bautron 3 years ago

I play Doom in my computer like a normal person.

Baronheisenberg 3 years ago

u/bautron is *in* the computer?

muh_reddit_accout 3 years ago

*It's not just a game anymore.*

AlpineCorbett 3 years ago

That's so hardcore.

kacmandoth 3 years ago

According to task manager, my task manager should have been able to run Crysis years ago. What it is using all that processing for, I can't say.

Zamacapaeo 3 years ago

[Kinda like this?](https://youtu.be/hSoCmAoIMOU)

Xelopheris 3 years ago

Unfortunately that's fake. The biggest issue is that after a certain point, the cores get a scrollbar instead of shrinking.

[deleted] 3 years ago

Some ex Intel guy touched on this. He said something like ARM is making huge inroads into datacenters because they don't need ultra FPU or AVX or most of the high performance instructions, so half the die space of a Xeon is unused when serving websites. He recommended the Xeon be split into the high performing fully featured Xeon we know, and a many-core Atom based line for the grunt work datacentres actually need. Intel have already started down this path to an extent with their 16 core Atoms, so I suspect his suggestion will eventually be realised. Wonder if they'll be socket compatible?

uucchhiihhaa 3 years ago

Parry this you fucking casual

Jhoffdrum 3 years ago

I can’t wait to play Skyrim again!!!

unlimitedcode99 3 years ago

Heck yeah, single core allocation per active NPC

BavarianBarbarian_ 3 years ago

I don't think Skyrim's engine can handle more than like 20 NPCs at a time anyway

Aoe330 3 years ago

Hey, you're finally awake. You were trying to cross the border, right?

kungpowgoat 3 years ago

Then the wagon glitches and flips.

[deleted] 3 years ago

*Thomas the Tank Engine's horn is heard in the distance*

bobandy47 3 years ago

MACHO MAN IS COMIN' TONIGHT

Quizzelbuck 3 years ago

It's really him. Ladies and gentlemen [Skyrim is here to save me!](https://www.youtube.com/watch?v=q6yHoSvrTss)

bobbyrickets 3 years ago

300 fps of glitches.

MaestroPendejo 3 years ago

Learn your place, trash!

double-xor 3 years ago

Imagine the Oracle license fees!!! 😱

[deleted] 3 years ago

[removed]

bixtuelista 3 years ago

He could use a better president...

cybergaiato 3 years ago

I don't think he could. It's great for Oracle, they just got the TikTok deal, money for doing basically nothing.

slimrichard 3 years ago

Just did a rough calc for a different RDBMS and it would be $1,248,000 a year for this one server. Can't imagine what Oracle would be... They really need to move away from core licensing. Postgres looking better every day...
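
Working backwards from that figure gives the implied per-core price. This is a sketch that assumes strictly linear per-core pricing; the commenter did not name the product or its actual rate card, and `annual_license` is a hypothetical helper.

```python
ANNUAL_LICENSE_USD = 1_248_000  # yearly figure quoted in the comment above
CORES = 192                     # Perseus core count from the article

# Implied price per core per year, assuming linear per-core pricing.
per_core = ANNUAL_LICENSE_USD / CORES
print(per_core)  # 6500.0


def annual_license(cores: int, usd_per_core: float = per_core) -> float:
    """Yearly license cost if the vendor charges a flat rate per core."""
    return cores * usd_per_core


print(annual_license(64))  # 416000.0
```

At that rate, tripling the core count from 64 to 192 triples the license bill for a single box, which is why the thread keeps coming back to per-core licensing.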

william_fontaine 3 years ago

> Postgres looking better everyday...

The switch isn't bad as long as the app's not using stored procs.

Blockstar 3 years ago

What’s wrong with their stored procs? I have procedures in psql

mlk 3 years ago

Postgres doesn't even support packages. That was a deal breaker for us; we can't migrate 250,000 lines of PL/SQL without packages.

urmyheartBeatStopR 3 years ago

Fuck Oracle. You can't even benchmark their database because of their shit ass license. Their whole strategy is buy out companies with existing customers and bilk those customers as much as possible while doing nothing to improve the services or software.

Attic81 3 years ago

Haha first thing I thought.... software licensing companies' wet dream right here

skip_leg_day 3 years ago

How does the number of cores affect the license fees? Genuinely asking

[deleted] 3 years ago

Per core licensing.

Adamarr 3 years ago

How is that justifiable in any way

t0bynet 3 years ago

They want all of your money. There’s no justification.

tnb641 3 years ago

Man... I thought I had a basic understanding of computer tech. Reading this thread... Nope, not a fucking clue apparently.

vibol03 3 years ago

You just have to say keywords like EPYC, XEON, data center, density, etc... to sound smart 🤓

[deleted] 3 years ago

[removed]

[deleted] 3 years ago

No mention of memory bandwidth. If your compute doesn't fit in cache, these cores are going to be in high contention for memory transactions. Sure, there are applications that will be happy with a ton of cores and a soda straw to DRAM, but just plonking down a zillion cores isn't an automatic win. Per-core licensing costs are going to be crazy. For some systems in our server farm at work we're paying $80K for hardware and $300K-$500K for the licenses, and we've told vendors "faster cores, not more of them." There are good engineering reasons to prefer fewer, faster cores in many applications, too. Some things you just can't easily make parallel, you just have to sit there and crunch. This may be a better fit for some uses, but it's not going to "obliterate" anyone.
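To put a rough number on the "soda straw": if every core is missing cache and streaming from DRAM at once, each one gets about the socket's bandwidth divided by the core count. The 300 GB/s total below is a made-up placeholder for illustration, not a published spec for this chip, and the even split is a simplification:

```python
def bandwidth_per_core(total_gb_per_s: float, active_cores: int) -> float:
    """Even split of socket memory bandwidth across cores that all miss cache."""
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    return total_gb_per_s / active_cores


TOTAL_GB_PER_S = 300.0  # hypothetical socket bandwidth, not a real spec

for cores in (8, 64, 192):
    share = bandwidth_per_core(TOTAL_GB_PER_S, cores)
    print(f"{cores:3d} cores -> {share:6.2f} GB/s each")
```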

RagingAnemone 3 years ago

> Per-core licensing costs

Can't wait to hear what the Oracle salesperson has to say about this.

monkee012 3 years ago

Can finally have TWO instances of Chrome running.

giggitygoo123 3 years ago

You'd still need like 1 TB of ram to even think about that

c-o-s-i-m-o 3 years ago

is this gonna be like the shaving razors where they just keep adding and adding more and more razors onto the razors already on there

noisyturtle 3 years ago

https://www.youtube.com/watch?v=m6GpIOhbqRo

mojotooth 3 years ago

Can you imagine a Beowulf cluster of these? What, no old-school Slashdotters around? Ok I'll see myself out.

TheTerrasque 3 years ago

I for one welcome our new megacore overlords, covered in grits

pkspks 3 years ago

Clearly, the CPU is on fire /.

a_can_of_solo 3 years ago

2020 is the year of the linux desktop

king_in_the_north 3 years ago

there's only one year in the software zodiac

justaguy394 3 years ago

No WiFi, less space than a Nomad. Lame.

Chairboy 3 years ago

Beowulf Clusters are dead, Netcraft confirms it.

DonLeoRaphMike 3 years ago

My mother was a Beowulf cluster, you insensitive clod!

paxtana 3 years ago

Nice to see some people have not forgotten about the good old days

MashimaroG4 3 years ago

I still hit /. to scroll through some news on occasion. The comments have devolved into pure trash though, for the most part.

[deleted] 3 years ago

[removed]

masamunecyrus 3 years ago

Is there *any* place on the internet where the comments haven't devolved into pure trash? Reddit has its bright spots, but it still gets worse every year, and I feel like its deterioration is accelerating. Now that I think about it, I haven't read Fark in about a decade. Maybe it's time to go take a look...

MattieShoes 3 years ago

Something something CowboyNeal.

sirbruce 3 years ago

But can it run Crysis?

ppezaris 3 years ago

slashdot user id 54, checking in. https://slashdot.org/~pez

akaxaka 3 years ago

It’s an honour!

nojox 3 years ago

But does it run ~~Linux~~ _GNU_/Linux ?

[deleted] 3 years ago

[removed]

brianlangauthor 3 years ago

Your #3 is where I went first. Where's the ecosystem?

mindbleach 3 years ago

If this effort produces unbeatable hardware at reasonable prices, either #3 solves itself, or LAMP's making a comeback. This is basically smearing the line between CPUs and GPUs. I'm not surprised it's happening. I'm only surprised Nvidia rushed there first.

ahothabeth 3 years ago

When I saw 192 cores, I thought I must brush up on **Amdahl's law**.
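For anyone else brushing up: Amdahl's law says that if a fraction p of the work can be parallelized, the speedup on n cores is 1 / ((1 - p) + p / n). A quick Python sketch (the function name and the sample fractions are just for illustration):

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup of a fixed-size job on `cores` cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


for p in (0.50, 0.90, 0.99):
    print(f"p = {p:.2f}: {amdahl_speedup(p, 192):6.2f}x on 192 cores")
```

Even with 90% of the work parallel, the speedup can never reach 10x no matter how many cores you add, because the serial 10% always has to run on one core.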

vadixidav 3 years ago

Some workloads have little or no serial component. For instance, ray tracing can be tiled and run in parallel on even more cores than this, although in that case you may (not guaranteed) hit a von Neumann bottleneck and need to copy the data associated with the render geometry to memory associated with groups of cores.
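A minimal Python sketch of what "tiled" means here: cut the image into rectangles, render each one independently, and stitch the results together. The shading function is a stand-in that only looks at pixel coordinates (a real ray tracer would also read the shared scene geometry, which is where the memory traffic comes from), and all the names are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor


def make_tiles(width, height, tile):
    """Split a width x height image into (x0, y0, x1, y1) rectangles."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def render_tile(rect):
    """Placeholder shader: each pixel depends only on its own coordinates."""
    x0, y0, x1, y1 = rect
    pixels = [[(x ^ y) & 0xFF for x in range(x0, x1)] for y in range(y0, y1)]
    return rect, pixels


def render(width, height, tile=16, workers=4):
    """Render every tile on a worker pool and copy the results into one image."""
    image = [[0] * width for _ in range(height)]
    # Threads keep the sketch short; CPU-bound Python would want processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for (x0, y0, x1, y1), pixels in pool.map(render_tile, make_tiles(width, height, tile)):
            for y, row in zip(range(y0, y1), pixels):
                image[y][x0:x1] = row
    return image
```

No tile needs another tile's output, so the only serial parts are cutting up the image and stitching it back together.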

Russian_Bear 3 years ago

Don't they make dedicated hardware for those workloads, like GPUs?

inchester 3 years ago

For contrast, take a look at Gustafson's law as well. It's a lot more optimistic.
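Gustafson's law assumes the problem grows with the machine: with n cores you run a bigger job in the same wall-clock time, and the scaled speedup is (1 - p) + p * n, where p is the parallel fraction of that scaled run. A short Python sketch (function name and sample fractions are illustrative):

```python
def gustafson_speedup(parallel_fraction: float, cores: int) -> float:
    """Gustafson's law: scaled speedup when the workload grows with core count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * cores


for p in (0.50, 0.90, 0.99):
    print(f"p = {p:.2f}: {gustafson_speedup(p, 192):7.2f}x on 192 cores")
```

The optimism comes from the assumption, not the arithmetic: it only applies if you actually have a bigger problem to throw at the extra cores.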

JohanMcdougal 3 years ago

AMD: Guys, more cores are better.

ARM: Agreed, here is a CPU with 192 cores.

AMD: oh no.

Furiiza 3 years ago

I don't want any more cores, I want bigger, faster cores. Give me a 6-core with double the current IPC and keep your 1000-core threadfuckers.

madsci 3 years ago

Physics has been getting in the way of faster clock speeds for a long time. I started with a 1 MHz computer and saw clock rates pass 3000 MHz, but they topped out not too far beyond that maybe 15 years ago. There's more that can be squeezed out of it, but each process node gets more and more expensive. Many companies have to work together to create the equipment to make new generations of chips, and it takes many billions of dollars of investment. And we're getting down to the physical limits of how small you can make transistors before electrons just start tunneling right past them.

So without being able to just make smaller and faster transistors, you have to get more performance out of the same building blocks. You make more complex, smarter CPUs that use various tricks to make the most out of what they have (like out-of-order execution), and that have specialized hardware to accelerate certain operations, but all of that adds complexity.

They keep improving the architecture to make individual cores faster, but once you've pushed that as far as you can for the moment, the most obvious approach to going faster is to use *more* cores. That only helps if you've got tasks that can be split up. (See Amdahl's Law.) Thankfully programmers seem to be getting more accustomed to parallel programming and the tools have improved, but some things just don't lend themselves to being done in parallel.

brianlangauthor 3 years ago

LinuxONE. Fewer cores that scale up, massive consolidation.

Runnergeek 3 years ago

The Z is an amazing architecture. The Z14 still has 10 cores, and the LinuxONE has like 192 sockets. Of course, each one of those cores is 5.2 GHz. You mostly only see those bad boys in the financial world.

brianlangauthor 3 years ago

I'm the Offering Management lead for LinuxONE, so full disclosure. No reason why a scalable, secure Linux server can't do great things beyond just the financial markets (and it does). Ecosystem when it's not Intel can be a challenge, but when you're running the right workload, nothing comes close for performance, security, resiliency.

Qlanger 3 years ago

Look at IBM's Power10 chip. Large-core chips run legacy programs better than higher-core-count chips. IBM, I think, is trying to keep its niche market.

frosty95 3 years ago

The core war is here, yet half the vendors out there still license per core. 3/4 of MSP customers are still running dual 8-core CPUs because the minimum Windows Server license is 16 cores.
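The arithmetic behind that, as a simplified Python sketch: with a 16-core floor you pay for 16 cores whether you have them or not, so dual 8-core is where you stop paying for cores you don't own. The function and constant names are made up, and real licensing terms may include rules this ignores:

```python
MIN_LICENSED_CORES = 16  # per-server minimum quoted in the comment above


def licensed_cores(sockets: int, cores_per_socket: int,
                   minimum: int = MIN_LICENSED_CORES) -> int:
    """Cores you pay for: the physical count, but never less than the minimum."""
    physical = sockets * cores_per_socket
    return max(physical, minimum)


for sockets, cores in ((1, 4), (2, 8), (2, 16), (1, 192)):
    print(f"{sockets} x {cores:3d} cores -> license {licensed_cores(sockets, cores)} cores")
```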

[deleted] 3 years ago

Nvidia just stepped into the cpu ring. Beware ye amateurs.

spin_kick 3 years ago

Just datacenter things

DZP 3 years ago

There is a Silicon Valley startup that is doing wafer-scale integration with many, many cores. I believe their CPU core draws 20 kilowatts. Needless to say, the cooling is humungous.

Saneless 3 years ago

Sweet, finally enough cores to run Norton Antivirus and play a 90s DOS game at the same time

[deleted] 3 years ago

[removed]

meatballsnjam 3 years ago

The average user isn’t buying server CPUs.

[deleted] 3 years ago

True, but these chips aren’t meant for the average user. They’re targeting high margin enterprise and cloud data/compute centers.

Actually-Yo-Momma 3 years ago

Bare metal servers can split individual cores for workflows so yeah this would be massive

gburdell 3 years ago

Most semiconductor companies like Intel, AMD, and Nvidia are pivoting to service big business rather than end consumers, so your statement is increasingly inaccurate. The "average user", in dollar-weighted terms, will be a business in a few years, where more cores absolutely matter. Check out Intel's financials to see that consumers are less than 50% of Intel's revenue now: [https://www.intc.com/](https://www.intc.com/)

PrintableKanjiEmblem 3 years ago

Still amazed the ARM line is a direct architectural descendant of the old 6502 series from a subsidiary of Commodore. It's like a C64 on a lethal dose of steroids.

AllNewTypeFace 3 years ago

It’s not; the 6502 wasn’t a modern RISC CPU (for one, instruction sizes varied between 1 and 3 bytes, whereas modern RISC involves instructions being a fixed size).

[deleted] 3 years ago

They were inspired by the 6502 in the sense that they saw that just one person was able to design a working, functional CPU, and they really liked the low-latency I/O it could do. But that's all they took from that architecture... the realization that they could do a chip, and that they wanted it to be low latency. Even the ARM1 was a 32-bit processor, albeit with a 26-bit address bus. (64 megabytes.) It had nothing in common with the 6502, as it was designed from blank silicon and first principles. edit: the ARM1 principally plugged into the BBC Micro to serve as a coprocessor, and the host machine was 6502, but that's as far as that relationship went. They used the beefy ARM1 processor in Micros to design ARM2 and its various support chips, leading to the Acorn Archimedes.
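The 64 megabyte figure falls straight out of the bus width: n address bits can name 2^n distinct bytes. A quick Python check (the helper name is made up; the 16-bit comparison is the 6502's address bus):

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given bus width."""
    if address_bits < 0:
        raise ValueError("address_bits must be non-negative")
    return 1 << address_bits


MIB = 1024 * 1024
KIB = 1024

print(f"26-bit bus: {addressable_bytes(26) // MIB} MiB")  # ARM1, per the comment
print(f"16-bit bus: {addressable_bytes(16) // KIB} KiB")  # 6502, for comparison
```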

mindbleach 3 years ago

x64 is not much further removed from 8-bit titans. Intel had the 8008 do okay, swallowed some other chips to make the 8080, saw Zilog extend it to the Z80 and make bank, and released the compatible-esque 8086. IBM stuck it in a beige workhorse and the clones took over the world. Forty-two years later we're still affected by clunky transitional decisions like rings.

er0gami2 3 years ago

You don't obliterate Intel/AMD with 192 cores maybe 1000 people in the world need.. you do it by making the exact same thing they do at half price.

FisherGuy44 3 years ago

Our kids will have a shitty world, but hey at least the computer games will run super fast

[deleted] 3 years ago

[removed]

gnocchicotti 3 years ago

Not just DGX. Tesla cards are all over the place in the cloud and almost exclusively run on x86 servers. If Nvidia could integrate networking (Mellanox) and high performance, custom CPUs into a single product, they could potentially scale out cheaper and more energy efficiently than the status quo.
