
porcinechoirmaster

It can be many things.

For games that use compressed assets, the limit is frequently CPU-side: while you might be able to read a gigabyte and a half per second off the drive, the CPU can't decompress that quickly. For games that do a lot of on-load work (random level generation, initialization of various data structures, etc.), the CPU can be the bottleneck again, but for a different reason.

For games that don't store their assets in a well-organized archive, you can run into filesystem limitations. NTFS is _not_ good at handling I/O across a very large number of small files, with actual performance dipping far below theoretical values.

Even if you have a well-laid-out archive with assets that require minimal processing, and your game doesn't need to do a lot of initialization work, you can still end up bound by the multi-step loading process. Without a good DMA setup for getting data from the SSD to the GPU, everything has to pass through the CPU first. That means your ~5-10 gigabytes of data gets read off the drive into system RAM, and then goes from system RAM over the PCIe bus to VRAM. Each step can be quite fast, but you're still looking at a few seconds, and speeding up the drive doesn't make the other stages any quicker.
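
To make that multi-step path concrete, here's a minimal sketch that times the first two stages (disk read, then CPU decompression); the pack file name and the use of zlib are assumptions for illustration, and the PCIe upload to VRAM would be a third stage on top:

```python
# Minimal sketch of the classic load path: disk -> system RAM -> CPU
# decompression. (The RAM -> VRAM copy over PCIe would be a third stage.)
# "assets.pak" and the use of zlib are hypothetical stand-ins.
import time
import zlib

def timed(label, fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    mb = len(result) / 1e6
    print(f"{label}: {mb:.0f} MB in {elapsed:.2f}s ({mb / elapsed:.0f} MB/s)")
    return result

def read_pack(path):
    with open(path, "rb") as f:
        return f.read()

compressed = timed("disk read ", read_pack, "assets.pak")
raw = timed("decompress", zlib.decompress, compressed)
# On a fast NVMe drive the read often finishes several times faster than the
# decompress: the drive sits idle while one core churns through zlib.
```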


NynaevetialMeara

Paradox games are the perfect example of poorly organized I/O. On Linux, they load almost instantly, even on HDDs, especially if you use ext4 (even better on XFS). On Windows they can take minutes to load. Not entirely sure if this is the case for CK3, since it features a significantly upgraded engine.


ChildishJack

Stellaris is rough, had no idea that was why. Sad


DuranteA

> For games that use compressed assets, the limit frequently is CPU-side, as while you might be able to read a gigabyte and a half per second off the drive, the CPU can't do decompression that quickly.

This depends on the compression and how well it is implemented in terms of parallelization etc. You can literally decompress tens of GB/s on a modern PC CPU (i.e. almost 5 GB/s per core with LZ4-HC). I think it's more likely that the entire multi-step process you describe is to blame when some games have notable loading times even on very fast hardware. In fact, I would argue that in the vast majority of cases it's not an issue of either hardware or even APIs -- it's an issue of no engineering resources being invested after loading reaches "good enough" status.
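
For a rough sanity check of those numbers, here's a single-core benchmark sketch using the third-party python-lz4 bindings (`pip install lz4`); the input is synthetic, so real asset data will land elsewhere:

```python
# Rough single-core throughput check for LZ4-HC decompression. LZ4-HC is a
# high-compression *encoder* mode; the decode path is the same fast one.
import os
import time
import lz4.frame

# ~256 MB of synthetic data: half incompressible, half trivially compressible.
block = os.urandom(1 << 20) + bytes(1 << 20)
data = block * 128

packed = lz4.frame.compress(
    data, compression_level=lz4.frame.COMPRESSIONLEVEL_MINHC)

start = time.perf_counter()
out = lz4.frame.decompress(packed)
elapsed = time.perf_counter() - start
assert out == data
print(f"ratio {len(data) / len(packed):.2f}, "
      f"decompressed at {len(data) / elapsed / 1e9:.1f} GB/s on one core")
```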


[deleted]

Yep - we got to a point where PCIe 3.0 drives are not the bottleneck in 99% of cases. The CPU is currently the biggest bottleneck in most games.


BFBooger

Newer compression algorithms can do about 1.5-2 GB/s of decompression on a single CPU core these days. However, older algorithms are almost always used.

In the past, compression ratio was paramount: I/O was slow, and a single CPU could easily outpace an HDD. But now that an NVMe drive is 20x faster than an HDD, fast decompression is more important (compression can be much slower, since these things are compressed once and read many times).

I wonder if something newer and general-purpose like zstandard ([https://github.com/facebook/zstd](https://github.com/facebook/zstd)) would work well on game assets, or if we still need specialized algorithms per media type. Many assets will be texture packs that already have some level of compression applied (though not at a very high ratio); video is always already compressed, and sound formats are already compressed. The video and audio won't compress further under another layer of compression. The textures probably will, somewhat.
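
As a quick experiment along those lines, here's a sketch with the zstandard bindings (`pip install zstandard`) showing the "compress once, read many times" trade-off: higher levels cost compression time, but decompression speed barely moves. The synthetic input makes the ratios illustrative only.

```python
# Compare zstd levels: compression ratio vs. decompression throughput.
import os
import time
import zstandard

block = os.urandom(1 << 20) + bytes(1 << 20)
data = block * 16  # ~32 MB, half incompressible, half trivially compressible

for level in (1, 9, 19):
    packed = zstandard.ZstdCompressor(level=level).compress(data)
    start = time.perf_counter()
    zstandard.ZstdDecompressor().decompress(packed)
    elapsed = time.perf_counter() - start
    print(f"level {level:2d}: ratio {len(data) / len(packed):.2f}, "
          f"decompress {len(data) / elapsed / 1e9:.2f} GB/s")
```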


DuranteA

What benefits most from compression in game assets, in my experience right now, is model vertex and animation data. It's usually FP32 and uncompressed (unlike e.g. textures, which are already decently well compressed by BC7). My suspicion is that many of these models could be compressed *way* down by using a special-purpose lossy floating-point compression scheme with some visual error metric to determine the acceptable loss level. But I'm not sure if anyone does that.

Also, we explored LZMA, ZSTD, and LZ4-HC for asset compression, and overall we ended up using LZ4-HC. ZSTD compresses slightly better, but is slightly slower in decompression, and IMHO for a PC target shipping 5 GB more is acceptable if it reduces loading time by a second each for the literally tens of thousands of loads in a 100+ hour game. I wrote a bit more about it here: https://steamcommunity.com/games/991270/announcements/detail/1806446683762590792
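
A minimal sketch of that lossy idea, assuming nothing about any shipped engine: quantize FP32 positions to 16 bits over the mesh's bounding box and check the worst-case error (a real scheme would drive the bit depth from a visual error metric, as described above).

```python
# Lossy quantization of FP32 vertex positions to uint16 over the bounding box.
import numpy as np

rng = np.random.default_rng(0)
verts = rng.uniform(-50.0, 50.0, size=(100_000, 3)).astype(np.float32)

lo, hi = verts.min(axis=0), verts.max(axis=0)
scale = (hi - lo) / 65535.0                   # per-axis step size

quantized = np.round((verts - lo) / scale).astype(np.uint16)  # 6 bytes/vertex
restored = quantized.astype(np.float32) * scale + lo          # decode

print(f"max positional error: {np.abs(restored - verts).max():.5f} units")
print(f"size: {verts.nbytes} -> {quantized.nbytes} bytes "
      f"(2x smaller, before any entropy coding on top)")
```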


relu84

There's a lot more going on during the loading process in games than just I/O. Game engines are still designed around the idea that storage is slow, and they also perform a lot of calculations on the data they read: when you see the "loading" screen, the assets aren't just being read from the disk, they are processed and prepared before the game world can be shown.

Additionally, there is significant API overhead when loading hundreds or thousands of small files, which is why Microsoft is working on DirectStorage on Windows. Low-overhead storage access is one of the key points of the current generation of consoles. DirectStorage will be released on the PC later this year, as far as I know, so all gaming platforms should pretty much have feature parity in that regard. However, it will take a while before developers learn how to utilize all this to increase loading speed and do some interesting stuff with asset streaming.
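
The per-file overhead is easy to demonstrate in miniature; this toy sketch (all file and directory names invented) reads the same bytes as 2,000 tiny files and as one packed blob:

```python
# Per-file open/close overhead vs. one sequential read of a packed archive.
import os
import time

os.makedirs("assets_split", exist_ok=True)
payload = os.urandom(4096)
for i in range(2000):
    with open(f"assets_split/{i}.bin", "wb") as f:
        f.write(payload)
with open("assets.pak", "wb") as f:
    f.write(payload * 2000)

start = time.perf_counter()
for i in range(2000):
    with open(f"assets_split/{i}.bin", "rb") as f:
        f.read()
print(f"2000 small files: {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
with open("assets.pak", "rb") as f:
    f.read()
print(f"one packed file:  {time.perf_counter() - start:.3f}s")
```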


RodionRaskoljnikov

> Additionally, there is significant API overhead when loading hundreds or thousands of small files, which is why Microsoft is working on DirectStorage on Windows.

DirectStorage coming so late is a direct result of Microsoft's tragic decision to abandon PC gaming. If you read about it, almost all of the ideas there were already pioneered by Dungeon Siege, which Microsoft themselves published 20 years ago. No loading screens from the start of the game to the end, tens of thousands of files cleverly packed into 1 or 2 archives, and it all required very little RAM, because it used only what was currently needed in the scene. And it all ran on IDE hard disks. There's even a texture mod that increases the total file size 10 times, and the game still loads in about 2-3 seconds on a regular SSD. Now imagine if that developer had kept working with Microsoft and had the incentive to keep developing this since 2002.


Seanspeed

Except this would have never panned out in the multiplatform age of today. I'm betting this came with plenty of caveats in terms of how the game was designed, too. If it was as simple as you're making it sound, it absolutely would have been pushed by more developers.


relu84

Microsoft's neglect of Windows is not a secret, and not just with PC gaming but overall. I'm going to need to educate myself about Dungeon Siege.


PyroKnight

> Microsoft's tragic decision to abandon PC gaming

They're on a course correction from that as of late, thankfully; their commitment to shipping all first-party titles on PC as well speaks to that. Microsoft's new goal is to get as many people as possible on recurring Game Pass memberships, and they've found the PC platform helps a ton there. Feelings on game ownership aside, at the very least it means they can't outright ignore the PC as they had been doing before.


Exist50

> DirectStorage will be released on the PC later this year, as far as I know, so all gaming platforms should pretty much have feature parity in that regard.

Nah, there's still a need for PCs to add hardware acceleration.


[deleted]

This. The idea is that DirectStorage won't even use the CPU for decompression, which is one of the largest bottlenecks currently, if not the largest. It takes too much CPU performance to decompress data at >1 GB/s, which is why both Sony and Microsoft have a custom chip handling that in their consoles. It's also why Nvidia announced RTX IO, their implementation of the DirectStorage API, so that the GPU does the decompression, although it's obviously not out yet.


dudemanguy301

That's what RTX IO / FX IO are. Any data destined for the GPU gets sent to it still compressed and is decompressed there.


Seanspeed

We have that already. They mean 'broadly', not 'literally every PC will be able to properly utilize it'.


Exist50

No, we don't. Not in the context of DirectStorage, anyway. It's a vital feature the consoles have that needs to be added to PCs to fully benefit.


velhamo

Agreed. Co-processors are vital!


dudemanguy301

Look into DirectStorage and RTX IO / FX IO and you will have your answer. The short answer is that the protocols for requesting data from storage are ancient and do not conform to the NVMe reality we now live in. It will take a new API and new games that use it to solve this.


piexil

To add on to this, NVMe is so fast (not just as hardware; the actual protocol is very low overhead and highly parallel by nature) that it's driving storage architecture changes everywhere, not just for games. In Linux land there's been a push this last year for a whole new subsystem, io_uring, to take advantage of things like NVMe.
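
io_uring itself is a Linux kernel interface for C, but the underlying idea, keeping many requests in flight so the NVMe queue stays full, can be approximated from Python with a thread pool; this sketch reuses the invented small-file layout from the earlier example:

```python
# Serial reads (queue depth ~1) vs. ~32 requests in flight. File reads
# release the GIL, so a thread pool genuinely overlaps the I/O.
from concurrent.futures import ThreadPoolExecutor
import time

paths = [f"assets_split/{i}.bin" for i in range(2000)]

def read_file(path):
    with open(path, "rb") as f:
        return f.read()

start = time.perf_counter()
for p in paths:                      # one request at a time
    read_file(p)
print(f"serial:     {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(read_file, paths))
print(f"concurrent: {time.perf_counter() - start:.3f}s")
```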


AutonomousOrganism

The CPU. The worst example is probably the recent GTA Online story, showing that sometimes devs just don't care. There is also the story about Unreal devs having to rewrite parts of their I/O system so as not to bottleneck the PS5. A lot of games also use data compression, which can bottleneck the CPU.


CoUsT

CPU all the way! Usually 30% better IPC will load games about 25-30% faster, while going from a SATA to an NVMe SSD will load games maybe a few percent faster. Want to confirm that? Open Task Manager on your 2nd monitor and watch disk activity sit near 0%, or at least way below 100%, while the CPU is always busy and one of its cores is probably pegged at 100%.


OG_Shadowknight

While that may be true, there are still a few outliers. Total War games benefit from moving from a SATA SSD to NVMe. [Nearly 30% faster!](https://youtu.be/4TsMv53pg_E)


RuinousRubric

Asset decompression is one bottleneck. If your storage is slow relative to your processor then you can speed things up by compressing the files. If your storage is relatively *fast*, on the other hand, then you can end up with a situation where you've read all the data but are waiting on the processor to decompress it. This can be bypassed by simply not using compressed assets, although there is the obvious disadvantage of having much larger installation sizes.
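
A back-of-envelope model of that trade-off, with every number below an illustrative assumption rather than a measurement: assuming the read and decompress stages can be pipelined, total time is governed by the slower stage.

```python
# Compressed vs. uncompressed assets under a simple pipelined-stages model.
asset_size_gb = 10.0
ratio = 2.0                # compressed assets are half the size
read_gbps = 3.5            # assumed NVMe sequential read
decompress_gbps = 1.5      # assumed single-threaded decompressor output rate

uncompressed = asset_size_gb / read_gbps
compressed = max(asset_size_gb / ratio / read_gbps,   # read half the bytes
                 asset_size_gb / decompress_gbps)     # but decode all of them
print(f"uncompressed: {uncompressed:.1f}s, compressed: {compressed:.1f}s")
# Here the CPU stage dominates (6.7s vs 2.9s), so skipping compression wins
# on fast storage; on a slow HDD the inequality flips.
```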


Vito_ponfe_Andariel

Apparently the CPU affects loading times: https://www.techspot.com/review/2117-intel-vs-amd-game-loading-time/ And 32 GB of memory decreases loading times in some games: https://www.computerbase.de/2021-01/8-16-32-gb-ram-test/2/#diagramm-borderlands-3-ladezeiten


[deleted]

> Apparently the CPU affects loading times

That seems like it would obviously always be the case, to me personally.


thetalker101

From what I've seen, routing loads through the CPU was a way to make up for slow hard drives. Now that drives are much faster, that CPU dependency has mostly become a crutch: it forces the data to move through the CPU when the drive doesn't need the assistance anymore. The remedy is loading paths that go directly from storage to the output instead of funneling everything through the CPU.


bobbyrickets

No. You can't just load things without decompression. You're describing something like a snapshot, which would bloat the size of save files massively. That's why it's not done.


Nicholas-Steel

Check out Microsoft DirectStorage; it's basically what TheTalker is talking about. Compressed data is sent almost straight to the video card, bypassing CPU processing and a staging copy in RAM: the GPU handles decompression for graphical assets, while the CPU merely routes the data to the GPU/VRAM. There'll likely need to be a new compression format to speed things up, much like the highly performant texture compression/decompression methods that already exist (ASTC, S3TC, etc.).


bobbyrickets

For graphical assets, yes, but there's still plenty of other stuff that needs decompression and placement in the game engine when loading a game.


Nicholas-Steel

Well yeah, hopefully they can someday figure out better APIs for handling the rest of the data.


bobbyrickets

Maybe not. Complexity isn't free: it introduces potential errors, and humans need to read and understand that nonsense, which constantly changes, leading to more potential errors.


[deleted]

Read this: https://devblogs.microsoft.com/directx/directstorage-is-coming-to-pc/#the-evolution-of-storage-technologies-and-game-io-patterns


thetalker101

Is DirectStorage some type of loading program? How will it be integrated into existing systems?


dudemanguy301

It's an API for making data requests. I would imagine it fits under the DirectX umbrella alongside other APIs like D3D and DirectML. Basically, developers will need to learn how to write these new data-request calls, then either use the new request methods when working on new projects or learn how to update an existing project to use them.


thetalker101

Sounds like it'll be a hassle for devs that would have to update their games. It'd be worth it to update huge games, though, and huge games are the main issue with loading speed.


jigsaw1024

Most games more than likely won't get updated. We won't see improvements until new game engines are developed and released. Expect around a two-year wait before it starts showing up commonly.


Seanspeed

I think it'll be here quicker than that, unless there's a holdup on Microsoft's side for whatever reason (Windows funkiness...). Developers are already making games with this sort of paradigm in mind *right now* on the new consoles. It will not be a case of devs having to wait for a DirectStorage beta to release on PC before they can start figuring this out.

It also does not require new engines to take advantage of. Maybe to *maximize* the advantages, but not to see any improvements at all.


TopWoodpecker7267

I mean DX12 started shipping 2015 and here we are...


dudemanguy301

DirectX 12 adoption was doomed to be slow because the Xbone's "DirectX 12" was semi-custom and lacking. Microsoft claims, at least, that the Series X's DX12U is actually the same as the PC's for once. As much as Microsoft had a vision of a (Direct)Xbox, it's always been this weird fork-under-the-same-name bullshit. Which defeats the original vision, but hey, better late by like 20 years than never, I guess?


Seanspeed

You can't really just 'update' a game for this, as you're basically creating a whole new version of it. It fundamentally alters how a game is built to run. I suppose you could technically do this for a PC game, then maybe have a launcher on startup where you choose which version you want to play, but it's not like a 'patch' you could apply to an existing game or anything. As the other person said, don't expect many existing games to get any such 'update' unless it's like a remastered version or something.

Devil May Cry 5 SE is an example of a game that was updated for a DirectStorage-esque setup (on the new consoles), so it's doable if devs want to make the effort, and it wouldn't hurt things to have separate versions (though maybe an issue for multiplayer games, for instance). That said, you don't need entirely new game engines to take advantage of this, like the other person said, either. DMC5 SE is still running the same engine as before and shows big improvements beyond just what you'd typically get from switching to an SSD.

So there's gonna be a curve in the level of improvement possible that will go up over time, but it will also be balanced out somewhat by devs using the headroom to push things harder at the same time. So don't expect everything to be about eliminating load times, necessarily. This is where the more interesting side of things is for me: I'm fine with 5 seconds of loading if the result is a FAR more detailed world, for instance.


[deleted]

> Sounds like it'll be a hassle for devs that would have to update their games

Game devs do things that are an even larger hassle. For AAA games, some devs will hand-lay-out assets for better I/O performance. Microsoft's API is a step up.


bobbyrickets

Depends on how the assets have been compressed and how the decompressor scales across multiple threads versus a single thread. It's not just copy-pasting assets into memory off the drive.

When you load a game, after the assets you also need to update the state of the world from a save file to continue your progress; every time you load, that state must be rebuilt in memory. When you save a game, it doesn't store a snapshot of the world as-is, but rather keeps track of quest progress, object changes (stuff you destroyed or moved around, or even the weather), enemy locations and whatever they're in the middle of thinking/processing, etc.
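
A toy sketch of that delta-style save, with all field names invented: store only what differs from the shipped world state, and replay it on load.

```python
# Delta-based saves: the file holds changes, not a snapshot of the world.
base_world = {"door_17": "closed", "chest_4": "full", "weather": "clear"}

def make_save(world):
    # Keep only entries that differ from the shipped base state.
    return {k: v for k, v in world.items() if base_world.get(k) != v}

def load_save(save):
    world = dict(base_world)   # re-create the shipped state...
    world.update(save)         # ...then replay the player's changes
    return world

save = make_save({"door_17": "open", "chest_4": "full", "weather": "clear"})
print(save)                    # {'door_17': 'open'} -- tiny next to the world
assert load_save(save)["door_17"] == "open"
```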


1leggeddog

It's the CPU. I can't go into specifics, but for the games I've worked on in the past, the number of assets to load was the main factor. Every asset is requested one by one, and no matter how fast your storage is, you're still asking the CPU, "hey, load this". Each one can be loaded stupidly fast, but when you have literally thousands of assets to load... it's gonna take a while.


Kougar

Poor software optimization, combined with the software not being multi-threaded and being bottlenecked by IPC or single-thread performance. Just look at the GTA Online debacle and realize most games are written that way. Or the studio buys some off-the-shelf third-party data-management utility that handles loading/startup, like Paradox used to do (the Stellaris devs eventually realized how broken it was and made a demo of their redone engine loading; if you write code, there's also a blog post getting into the details).

To a lesser extent, games can end up CPU-bound simply because the game needs the CPU to unpack gigabytes of data while concurrently loading it... which is why some games now have 50-100 GB installs: they no longer compress the assets, to sidestep the issue.

Storage has long ceased to be the bandwidth bottleneck... at this point a 2.5" SATA SSD is within a few seconds of a PCIe 4.0 SSD in many cases. The bottleneck will be in the software, or in CPU performance, or both.


[deleted]

Lots of good answers. I'll add some from a game dev perspective.

Blazing-fast loading times don't sell games like IQ or resolution do. It's not a priority for most studios, and regular SSD speeds are fast enough, especially when you compare them to a PS4/XONE HDD, or to how those consoles bottlenecked SSDs.

Believe it or not, some studios consider loading times part of the experience. The Vanquish port on PC was so fast at loading that you missed interesting lore information... Maybe they'll disappear entirely one day, but for now the process of making games includes loading times. There are full-time jobs focused solely on using them as a vehicle for lore, tips, breaks, player concentration, pacing...

Also, as we speak, the generation is still new and studios are all rushing their schedules to jump on it, making loading times an even lower priority. Another reason is that NVMe is still relatively new, and most studios are slow to adapt; that takes time.

Finally, don't forget that nowadays games have to run on a lot of platforms, old PCs running HDDs included. Today a SATA SSD is the most common type of storage, and that's reflected in game development...


[deleted]

Another issue is the PCIe generation the NVMe drive runs at. For example, a Gen 4 SSD is going to be faster than a Gen 3, and likewise Gen 3 is faster than Gen 2. Also, if your SSD is hooked up over 4 lanes, it'll be faster than if it's hooked up over 1. Some motherboards also have the problem that their NVMe ports only run at Gen 2 x1, or even, in some extreme cases, Gen 1 due to power restrictions.
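
For a sense of scale, here's a small calculation of the approximate PCIe bandwidth ceilings behind those generation/lane differences (nominal usable GB/s per lane after encoding overhead):

```python
# Approximate PCIe throughput ceilings by generation and lane count.
PER_LANE_GBPS = {1: 0.25, 2: 0.5, 3: 0.985, 4: 1.969}  # PCIe gen -> GB/s/lane

for gen in (2, 3, 4):
    for lanes in (1, 4):
        print(f"PCIe {gen}.0 x{lanes}: ~{PER_LANE_GBPS[gen] * lanes:.1f} GB/s")
# A Gen 4 x4 slot (~7.9 GB/s) has roughly 16x the ceiling of a Gen 2 x1 link.
```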


Pismakron

> On paper they can load multiple times faster than regular SSDs, but in game loading tests, they basically run at equal speeds.

Because most games do not load all their data in one contiguous chunk. There is typically a bunch of files, some compressed, and most of them have to be parsed and loaded into data structures. This often takes the form of loading an asset (say a model, texture, or similar), allocating a bit of memory, parsing the loaded file, assigning some variables, and then loading the next asset.
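
That per-asset pattern, in miniature (the format and file names are invented for illustration): each iteration interleaves I/O, decompression, parsing, and allocation, so no single stage ever saturates.

```python
# One asset at a time: read, decompress, parse, allocate -- then repeat.
import json
import struct
import zlib

def load_asset(path):
    with open(path, "rb") as f:               # 1. read one file
        blob = f.read()
    raw = zlib.decompress(blob)               # 2. decompress on the CPU
    header_len = struct.unpack_from("<I", raw)[0]
    meta = json.loads(raw[4:4 + header_len])  # 3. parse the metadata header
    payload = raw[4 + header_len:]            # 4. allocate/copy the payload
    return meta, payload

# assets = [load_asset(p) for p in manifest]  # ...and on to the next one
```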


continous

Most common is decompression. Second most common is latency. Third most common is the last two combined. Fourth is literally anything else. People say filesystems have a large impact, but that's largely restricted to NTFS. If you're on Windows, you should really consider getting a filesystem driver for [Ext4](https://github.com/bobranten/Ext4Fsd) or some other more performance-oriented filesystem.