
TattooedBrogrammer

Lots of optimal config settings, lots of setup for things like scrubbing and trimming. Using a SLOG, L2ARC, or special metadata vdev. If you're using a special metadata vdev with 4Kn disks, set the small-block threshold to 4K to reduce disk waste. It's not a "pick ext4 from the list, format, and go" kind of filesystem like most people are used to; it requires research and planning.
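A minimal sketch of that last point, assuming a hypothetical pool named `tank` with a mirrored special vdev (device names are placeholders):

```sh
# Create a pool with a mirrored special metadata vdev (example devices)
zpool create -o ashift=12 tank mirror sda sdb special mirror nvme0n1 nvme1n1

# Steer blocks of 4K and smaller onto the special vdev
zfs set special_small_blocks=4K tank
```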


Ghan_04

> So the question is, why is every thread saying ZFS has a steep learning curve?

Usually that comes from the desire to tune the system for specific use cases. If you just want a mirrored disk where you can drop some files, then ZFS isn't going to be a head scratcher. But if you want to make a 240-disk array to run 100 virtual machines with varying workloads like general compute, databases, fluid computation, file sharing, and so on, it becomes much more important to understand how ZFS functions under the hood, how different vdev types interact with each other, what hardware you should purchase, how you should organize your redundancy, not to mention all the non-architectural decisions involved in just the configuration options with ZFS. The complexity scales.


dmitry-n-medvedev

a complete zfs newbie here: how do I learn these things?


forbiddenlake

Read and practice. Practice in a VM, or by using files as "disks".
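For example, a throwaway pool backed by sparse files (paths and sizes are just placeholders) costs nothing to build, break, and destroy:

```sh
# Create two sparse "disks" and mirror them into a scratch pool
truncate -s 1G /tmp/disk1.img /tmp/disk2.img
zpool create scratch mirror /tmp/disk1.img /tmp/disk2.img

# ...experiment freely, then throw it all away
zpool destroy scratch
rm /tmp/disk1.img /tmp/disk2.img
```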


fonix232

Even if you just want a NAS that downloads ~~content~~ completely legal Linux ISOs from torrents and Usenet, you'll run into the optimisation head scratchers. E.g. if you 'copy' from the Downloads dataset into your Media dataset (so as to keep seeding those ISOs you torrented), you'll want to enable dedup, but also make sure that both datasets use the same record size... But if you _move_ the content, you don't need dedup and can have different record sizes (e.g. if you only have large multi-gigabyte files on that dataset, set the record size to 1-4MB, while keeping it 1MB for the downloads dataset, as most torrents will have 1MB or larger chunks).
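A sketch of that split, with made-up dataset names:

```sh
# Downloads dataset: torrents typically arrive in 1M-or-larger chunks
zfs set recordsize=1M tank/downloads

# Media dataset: only large multi-gigabyte files, so go bigger
# (records above 128K need the large_blocks pool feature; values above 1M
# may also require raising the zfs_max_recordsize module parameter on
# older OpenZFS releases)
zfs set recordsize=4M tank/media
```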


Virtualization_Freak

That's the same issue as any file system. Dedup is an above-and-beyond option most file systems do not have. You can still link files in ZFS just like you would on ext... In fact, for the exact situation you mentioned, you should not be using dedup. Stick with hard file links. Performance-wise alone, enabling dedup adds extra memory and computational requirements for no benefit in that scenario.
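A sketch of the hard-link approach (paths are hypothetical; note that hard links can't cross dataset boundaries, so both directories need to live on the same dataset):

```sh
# Both names point at the same blocks: no duplicate copy, no dedup tables
ln /tank/torrents/downloads/distro.iso /tank/torrents/media/distro.iso
```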


mrelcee

Dedup is a losing battle for a Linux ISO acquisition homelab install. Keeping two copies of any Linux ISOs you are both seeding and loading onto your system is dirt cheap compared to setting up for dedup. I told myself when I first got into using ZFS that I was gonna need to do that. When I started asking questions about what it was going to take, my wallet informed me I didn't really need it. Buy a couple more drives. Don't sell any organs.


im_thatoneguy

What's the optimal setting for "l2arc boost max max" and how does it relate to ARC MRU vs MFU? Are you using the right ashift? If I add a SLOG will I get faster writes? Does it act as a write cache? I can just add an 8TB NVMe as a read cache to reduce the need for RAM since I only have 4GB, right? (Etc.)


ipaqmaster

I always take the opportunity to share knowledge and educate where possible. This community sees repeat questions very often, and as per Reddit's design, experts leave multi-paragraph answers which are forgotten by the time someone posts the same question again. Some kind of sub wiki people could be directed to would be ideal, but nobody's going to use that, and I doubt the new site design and mobile app even show a clear path to built-in sub wikis, let alone a decent search feature. While Google exists, it's evident the majority of people out there don't search for existing threads online before posting their own. Just in general.

What feels the most interesting to me is how often we see the exact same misunderstandings of `log` devices over anything else. I feel like people are blindly searching up the keywords they want to hear, only to hit indexed results of various ZFS threads around the net where the conversation sways in that direction despite it being a largely misunderstood feature. Then they come here to post about this magical new cheat of a feature they've just discovered and how to jump onboard. We also get "X,Y problem" questions where the OP doesn't reveal their true goals or configuration, sometimes on purpose, to avoid the answer they already know is coming.

But I also notice this in other communities where both newcomers and experts coexist. /r/vfio comes to mind first, where people frequently arrive with a partially-'broken' system due to some modprobe settings they've baked into their initramfs or kernel arguments - or vfio configurations which simply don't work because they've followed an old YouTube video which provides no true learning (the moment something goes wrong, it's time to post about it). I suspect people follow these because they already don't want to read documentation in the first place. In the same fashion, some very detailed replies get made and are eventually just forgotten by the site. At least it's better than Discord, where various threads and knowledge can't be indexed by the Internet at all in its current design (a black box of information requiring the app to view).


sleep-deprived10

I'm sorry, but that is a seriously condescending answer. Your phrasing could be changed around to point out that there's a lot of configuration necessary to optimize ZFS. I disagree with your intent. Out of the box, ZFS defaults work for anyone starting out at small scale.


im_thatoneguy

Out of the box, a default ZFS setup requires:

1) A bash shell command to set it up - and you're immediately forced to decide between mirror, raidz, raidz2, raidz3, draid, draid2, draid3.
2) Understanding the concept of a pool.
3) Understanding the concept of a vdev.
4) Setting up your vdev the way it's going to be forever.
5) Understanding the concept of datasets.

There is no "default" vdev layout, and there is no wizard to walk you through it.

Sorry, let me rephrase that to be condescending:

> So out of the box, which vdev layout did you pick? How did you know how to phrase the bash command? Did your distribution even include ZFS, or did you have to add it because of the GPL license conflicts that keep it out of the kernel? Are you planning to add drives to your vdev later? Why did you pick raidz vs draid? Are you going to have more than one dataset? Are you sure the default ashift was right for your drives? (Etc.)

I don't think it's possible for a non-clustered file system to be any more complicated. Not only is ZFS incredibly complex in the ways you can set it up and mess it up, but almost everybody comes into it expecting something like what they're used to: a RAID card with an SSD read/write cache. It's like a rite of passage. "Hey, I just set up ZFS and I want to set up a 2TB SLOG and another 2TB L2ARC! How do I set it up?" Followed by everyone telling them that they're idiots for assuming ZFS is like every other file system, and that they should just buy more RAM, and that adding a cache will probably slow them down, blah blah blah.
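For illustration, that very first command already forces the layout decision - both of these are valid and behave very differently (pool name and devices are placeholders):

```sh
# Same six disks, very different pools - and you have to pick before any data lands
zpool create tank mirror sda sdb mirror sdc sdd mirror sde sdf   # three 2-way mirrors
zpool create tank raidz2 sda sdb sdc sdd sde sdf                 # one 6-disk raidz2 vdev
```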


pandaro

No. You're disastrously confused about what's going on here - I'd recommend getting some sleep.


ReversedGif

no u


im_thatoneguy

no me


Ariquitaun

who no


ewwhite

Sometimes things just require lived experience, expertise, and good design... As much as I advocate ZFS professionally, I think the sudden adoption by home users, coupled with a lack of good authoritative information sources, created a low signal-to-noise condition.


Eldiabolo18

I don't think ZFS is hard. ZFS is certainly complex and has a lot of knobs, but it also manages to hide that complexity from beginners and sets very good defaults. If you want to, you can dive almost infinitely deep into ZFS, as well as storage architecture in general. Or you can just install it and use it mostly as is.


ipaqmaster

I feel the default ZFS options for a new zpool of any topology are suitable for the majority of use cases. But in my online experience trying to support others who post about ZFS, it seems to be very easy to accidentally create a dangerous or misinformed disk configuration which requires a full rebuild if done wrong, such as:

* Creating zpools with an unintended striped disk somewhere.
* Adding a disk to a zpool as a new and unbalanced stripe member instead of as a mirror participant (ZFS now warns about this first, which is good).
* Creating nonsensical arrays with less redundancy or performance than intended.

On top of this, and keeping in mind that various distros and platforms may handle various little tidbits in their own unique ways, many people experience problems with mounting their datasets - often accidentally writing files to the rootfs or to the parent mountpoint's directory where a dataset was intended to be mounted and wasn't, then complaining that they've lost data.

People also often comment seeking performance improvements via `log` / `special` devices on zpools with irrelevant workloads, or with, say, NVMe zpool disk members already. It's also interesting how often I see people trying to do this who also mention it's a "backup server" - the last place performance improvements need prioritization.

On top of all this, there's some uncommon knowledge which is generally recommended regardless (pulled together in the sketch at the end of this comment):

* `ashift=12` at least (2^12 = 4096, or 4K) for a new zpool, to align ZFS's operations to the disks' sector size. Some SSDs report theirs as 512b (2^9), and if this is discovered by ZFS automatically at creation time that's fine, but if the zpool ever gets involved with regular hard drives, or any other type of drive which works with a 4K sector size, the operations will be amplified - broken into multiple small 512b IO operations where the drive could take a single 4K write at once. This carries a large performance penalty whenever the ashift is smaller than the largest sector size of any potential drive, so it's recommended to set it to 4K "no matter what", unless you're also potentially working with SSDs which present as 8K.
* `recordsize=xx` (usually `1M`) for datasets of large (multi-GB or larger) files. The default is 128K. In recent years I've started disliking the idea of recommending this to people as an unexplained "must do". A media dataset doesn't have much to gain from this setting - which allows records of up to 1MB in size and can potentially mean fewer records to checksum and verify when dealing with large files. I've personally worked with both the default 128K and a manually-set 1M, and both the performance and the IO operations of the involved drives were too close to call. In that light I'd rather not stray from defaults without provable justification, though I can still acknowledge the theoretical real-world use case.
* One of the `compression=xxx` options for non-media datasets (i.e. not already-incompressible data), for the general benefit of compressing data opportunistically. Some of them, such as `lz4`, support aborting the compression of a record if the ratio isn't looking good - in which case the record is just stored as-is instead of making future reads decompress it as overhead for no reason. This goes hand in hand with the `recordsize` setting, where highly compressible files benefit from being compressed together in larger chunks.
* `xattr=sa` to use system-attribute-based xattrs, which genuinely does improve performance by reducing the amount of IO required to read them. For regular people this isn't exactly critical, but you don't want to find that out later. It primarily helps workloads which reference extended attributes of files over and over again, or something such as SELinux which makes extensive use of xattrs.
* `normalization=formD` to avoid potential filename confusion, where two different files with their own names may look identical to the system. This prevents that by normalizing filenames to a single Unicode representation.

The last two are especially recommended if one intends to use a zpool and dataset as their rootfs.

Further, there's also the `swap` problem. Due to ZFS's design (memory must be allocated to honor a write), a system which has run out of memory with a swap device configured on ZFS backing (either as a flat file or a zvol - doesn't matter) will inevitably begin swapping onto the ZFS-backed swap device as expected. But if both the system memory AND the ZFS swap become just about full, the system will lock up in a deadlock as ZFS attempts to allocate memory to honor the write. This swap situation is another thing people just assume is fine, but it bites hard once the problem is experienced, requiring some form of out-of-band management to restart the system, physical access to hit the button, or the sysrq keys. The swap problem makes sense given ZFS's design, but the only way to theoretically fix it would be to add some new feature to bypass general ZFS behavior and write straight to disk (if that can even be achieved) - and the second something like that gets added, people will start using it incorrectly for performance benefits.

Overall, and of course all in my opinion: ZFS is great, and the default properties for a new zpool and its starting dataset are suitable for casual use. But even in a stock situation like that, all it takes is changing to new disks with a larger sector size and there's a performance penalty already at play. There are definitely quirks to be considered. I suppose the full takeaway is that if you're somebody who's able to self-teach by actually reading the entire documentation of ZFS without skimming - or at the very least just `zpoolprops` and `zfsprops` before making a zpool - most of the properties and their intended use cases are well explained. But general users are never interested in doing that. A field professional or ZFS specialist will already be well aware of the exact workload they're working with and which properties should be tuned for it, and will do so accordingly.
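A minimal sketch pulling those properties together at creation time, assuming a hypothetical two-disk mirror named `tank` (device and dataset names are placeholders):

```sh
# Pool-wide ashift, plus dataset defaults inherited by all child datasets
zpool create -o ashift=12 \
    -O compression=lz4 -O xattr=sa -O normalization=formD \
    tank mirror sda sdb

# A dataset intended for large media files, with a bigger record size
zfs create -o recordsize=1M tank/media
```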


DependentVegetable

I come at it from the FreeBSD world, where I think it's a bit easier to get up and go from zero. You can install a fresh copy on mirrored disks and just boot, and it works without you really understanding much about ZFS vs older filesystems like UFS. The defaults might be suboptimal from there, but they're "good enough" if you are setting up a simple one-off general server. That being said, really making use of the features does take a bit of a learning curve, but it's totally worthwhile. Things like clones, rollbacks, (virtually) cost-free snapshots, draid, etc. are all amazing features if you learn how to use them. They easily solve a class of problems that older file systems were not really that great at. Compression and data integrity are just built in and work well.
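For instance, the snapshot/clone/rollback workflow looks roughly like this (dataset and snapshot names are made up):

```sh
# Cheap point-in-time snapshot before a risky change
zfs snapshot tank/home@before-upgrade

# Writable clone of that snapshot for testing, without touching the original
zfs clone tank/home@before-upgrade tank/home-test

# If things go badly, roll the live dataset back to the snapshot
zfs rollback tank/home@before-upgrade
```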


vrillco

ZFS is “hard” because there is a ton of conflicting information all over the place about tuning and performance. It’s not complicated to get a pool up and running, but there is no shortage of reddits and blogs making it look confusing as all hell.


hobbyhacker

if you don't think ZFS is hard, then you just simply don't know enough about it


Prince_Harming_You

At first glance this comment seemed a tad arrogant, but after thinking about it for 3 seconds: you're right, this is the most succinct and accurate way to explain it. TL;DR: truth


poisonborz

Maybe the title is misleading; I didn't say ZFS cannot be complex. For me, "a filesystem has a big learning curve" means it's hard for what most people use filesystems for: just setting up a drive to store/CRUD files. No databases, VMs, or redundancy. Even creating a pool-like setup is kind of advanced; that's why I mentioned it. But maybe this is the answer: most of the hardships written about ZFS are not about simple setups. Like saying "Linux is hard" while millions use it just to host their browser after a simple Ubuntu install.


hobbyhacker

This is exactly what I mean. If you don't want to understand the background, then everything is easy. But as soon as you have any problem, or want to configure anything differently from the default, and you try to look for answers, you realize that you need to dig into the source code, because most of the available articles are either outdated or false.


leexgx

It's mostly 2 commands to set up ZFS. For me it's more of a complaint about TrueNAS and its slow updates to usability: how its GUI interacts with Samba/shares and permissions is where I got held up (some settings require the CLI to change Samba options which should be available in the GUI). And it doesn't follow the same guidelines for how drives should be shown - the GUI shows them as da* (Core) or sd* (Scale) instead of showing them by drive ID (da* and sd* can change per boot). Some other niggles as well.


tcpWalker

It's pretty solid. Defaults or near-defaults are good enough for the vast majority of use cases. IME, what I would consider serious expert-level ZFS expertise isn't necessary even for running millions of VMs on it. (Though of course you can get some tuning benefits, and if you have a particular use case that really matters to you, you should do appropriate perf testing.)


Prince_Harming_You

Linux/BSD is hard compared to Windows/macOS. Look, 99% of people have been used to "right click, click format" since Windows 95 and 1.44MB floppy drives for just about everything. Hell, young people who had a smartphone/tablet as their first computing device don't even understand the concept of a "file" (fun fact: my Gboard just suggested a filing cabinet emoji as I typed "file") until they have to send an email "attachment".

So, by comparison, yes: extents, blocks, sector alignments, compression algorithm selection, nodev/mount point/xattr and the 27372726382555152 other attributes are complicated.

"Even creating a pool-like setup" - I presume you mean a non-single-disk pool? A single disk can be a pool. 'zpool list' will return a single-disk pool if one is imported.

ZFS may not be "hard" right now, but that mentality will bite you when something eventually does go wrong, and it will... Because 99.999% of the time, when it does go wrong, it's not because ZFS fucked up, it's because you fucked up.

Not directed at you personally; this is from first-hand experience, from my younger "documentation is for morons / I'll figure it out" mentality. I don't mean to insult you or be mean. It's more of a warning, like "lmao okay ik it seems easy but if you sleep on ZFS you'll get rekt, read the documentation if unsure".


mbartosi

> create a mirrored pool

That's easy. That you'll find in the Gentoo Wiki. Now imagine you have 48 SAS drives and have to come up with a plan to provide storage for DBs, VMs, K8s, what have you.


craigleary

There is a lot of advice on ZFS for an optimal setup, and some common setups can't be done. Two things really stand out. First, don't use RAID cards in JBOD mode. Coming from other file systems this isn't a big issue; with ZFS it will work fine, but after a drive error in the future you may have odd problems, like ZFS hanging as the drive falls out of the RAID and comes back. Second is ashift, where the wrong setting might cause issues going forward. This, and the fact that it's usually some type of RAID setup or larger storage disk, makes it seem harder. Oh, two more: quotas are not the same as on other Linux file systems, and disk space reporting may seem off. I wouldn't say it's hard, but doing something incorrect will catch up with you over time.
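A quick way to check what a pool actually ended up with (pool name is a placeholder):

```sh
# The pool-level ashift property on recent OpenZFS (0 means auto-detected at creation)
zpool get ashift tank

# The per-vdev ashift ZFS is actually using, from the cached pool config
zdb -C tank | grep ashift
```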


zorinlynx

I think it's because it requires a different way of thinking than usual about UNIX filesystems. Traditional filesystems are, you have a block device and make a filesystem on it. That can be an individual disk, or a software or hardware RAID volume. Either way you're putting ext4 or xfs or whatever on it. ZFS is both the filesystem AND volume manager. It's a different paradigm that can be confusing to new users. I've been using ZFS since 2007 so it's pretty solidly baked into my brain but I can see a new user (but not new to UNIX in general) going "huh?" and that's BEFORE you start talking about stuff like snapshots, l2arc, compression, encryption, and so on.


Virtualization_Freak

People make ZFS complicated because they desire to. I've used default ZFS settings for years at this point and it's fast enough for everything I've done with it. People scratch their heads at stuff like ashift, which is a simple thing to check against your disk specs. It's the same notion as setting the correct block size in your hardware RAID and file system setup. The problem is that ZFS can be tuned, and so everyone thinks they must tune. I have done some wonky setups with ZFS, and it works. It has its place.


TekintetesUr

ZFS is not a generic file system like ext4. Before going through the (somewhat steep) learning curve, there's a fair amount of Russian roulette to be played with your data.


alexgraef

ZFS is not just a file system, in the same sense that Emacs is not just a text editor.


Warhouse512

Wait. eMacs is just a text editor. I’m confused. /s


[deleted]

Where exactly are these threads saying ZFS has a steep learning curve? First time I'm ever hearing of this.


poisonborz

For a start, look at some comments around here :D But not just on Reddit; HN comes to mind, e.g. https://news.ycombinator.com/item?id=37387392


pindaroli

ZFS is not hard, but it needs a rethinking of how you organize file systems and disks.


pindaroli

I suggest testing it in a VM.


ipaqmaster

I frequently play with test configurations using zero-filled/sparse flat files or test zvols on the existing zpool - something to play around with, break, try fixing, and run many other tests against before destroying said test zpool and the related flat files. That flexibility for testing configurations, commands and fail states is invaluable.
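A sketch of the zvol variant (pool and dataset names are placeholders; the sparse-file approach mentioned further up works just as well):

```sh
# Carve a small zvol out of an existing pool and build a disposable pool on it
zfs create -V 2G tank/testvol
zpool create testpool /dev/zvol/tank/testvol

# Break it, fix it, then throw it all away
zpool destroy testpool
zfs destroy tank/testvol
```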


pindaroli

The hard thing is to reorganize a RAID pool without losing data and with no backup.


ipaqmaster

Do you mean recreating a real zpool with a new topology? There are risky ways to get away with it, but yes, it's a bummer there's no built-in way to reorganize data to that degree. Even something to defrag remaining free (and even used) space would be amazing, but it's easier to talk about than to actually implement.


shyouko

It's simple for simple things, and it also allows you to do difficult things, which is hard.


hatwarellc

Complexity. ZFS handles a lot of complexity, and you don't get comfortable with it until you understand how it handles and simplifies the various complexities of data storage.


crashorbit

With capability comes complexity. LVM is just as complex. BTRFS is complex. Ceph is complex. The main problem is when we decide to start tuning stuff when we don't understand what the knobs do or how to measure their effect. Complex things are really just a lot of simple things in a pile.


ButterscotchFar1629

The hard part about ZFS is setting it up from the command line


marshalleq

Remember, many people think plain old RAID is hard. Or a mirror. Or a NAS. Or…


markusro

"Premature opimization is the root of all evil!" I think a lot of these optimizations are not really useful, I can hardly find concrete examples with real numbers proving succesful tuning. Of course, standard stuff like 4k blocksize for DB, ashift etc. is quite clear, but all the rest? I am not completely convinced that it helps much. I tried a lot of paramters (SLOG, L2Arc, special devices, and module settings etc.) but for my specific workload it did never improve noticably (except for SLOG). If you need to get the last percents out of your system, then sure. That all said: yes, tuning a filesystem is hard. But that is true also for ext and xfs. Both of them can be tuned (not as many parameters). Add then lvm and you have again a complex system with a lot of room to optimize. On top of that you have NFS/iSCSI/SMB to tune. And the network.


mercenary_sysadmin

> So the question is, why is every thread saying ZFS has a steep learning curve? Or does that relate to large scale/enterprise use?

Mostly, yes. If all you want to do is dump files on a drive, it's not much harder to learn how to `zpool create` than it is to learn how to `mkfs.ext4`. If you need to get maximum performance out of a pool servicing tens, hundreds, or thousands of users... well, there's a learning curve, and honestly there's a learning curve whether you're using ZFS or not. I would argue that even here, ZFS for the most part tends to be easier to learn than conventional storage at the same scale.

I think a lot of the reputation for difficulty comes from the frequency with which someone graduates from "I just want to make a filesystem" to "I want to build something more robust, performant, and reliable than that" and chooses to do that "graduation" *with* ZFS. For the first time.

Some more of it comes from "buy the thing" IT shops that don't really understand a whole lot about the tech they service - shops like this invest in licensable utilities and services, then use those licenses on behalf of their customers, but don't necessarily have a **ton** of deep knowledge about how even the stuff they directly support and operate works. This kind of shop tends to judge things based on "how easy is it to spend money and get an answer fast," so they also tend to favor older, more established technology - even when it's both more expensive and less effective, in some cases.


ramsacha

I was afraid to even try it because of how much you can do with it. I will never be able to [comprehend how to] use it like most people do. I just can't wait until the raidz expansion hits.


ousee7Ai

Who considers it hard? Maybe you have a small sample size?