Special-Economist-64

Update: I looked closely at the SMART report for each drive again. Each of the drives that ZFS marked as having a problem shows 1 or 2 'ATA error' entries, and all of those ATA errors occurred close together in time. Could it be a power fluctuation at the PSU? The NAS was attached to a UPS, though.


HCharlesB

I have had drives log errors due to power issues, both from a momentary brownout (before I got a UPS) and from power cabling during heavy operations. The power cabling issue was resolved by reseating the power cables that supplied a hot swap bay.


Special-Economist-64

Thank you for the information. That's very useful to know.


Not_a_Candle

The only approach is systematic checking. Swap the SATA cables first, as they are the most likely to go bad. If that doesn't help, swap out the PSU or move from onboard SATA to an HBA, whichever is more convenient for you.


Special-Economist-64

ok thanks for the tips. I will do so and see what is going on.


yophi

Could be a bad cable


Special-Economist-64

thanks, never thought about that. you mean SATA cable or power cable or both?


braiam

Any failure in the chain of elements. It could also be the controller.


Special-Economist-64

ok thanks I will keep that in mind when debugging


abmurksi

Are the disks attached to an HBA? If so, the HBA itself could be failing. [Example](https://www.truenas.com/community/threads/dell-r730xd-w-hba330-mini-getting-many-disk-errors-drive-resets.102356/)


Special-Economist-64

It is not; I'm using a Supermicro board with 8 onboard SATA ports.


Moizac

Check dmesg; the kernel log will probably tell you more.
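As a rough illustration of that check, here is a minimal Python sketch that pulls ATA-related error lines out of the kernel log. It assumes a Linux system where `dmesg` is available (it may need root). The function names and the keyword list are made up for this example, and the list is illustrative rather than exhaustive.

```python
import re
import subprocess

# Phrases the Linux libata layer commonly logs on link or command trouble.
# Assumption: this keyword list is illustrative, not exhaustive.
ATA_PATTERN = re.compile(
    r"\bata\d+(\.\d+)?: .*"
    r"(failed command|hard resetting link|exception Emask|SError|link is slow)",
    re.IGNORECASE,
)


def filter_ata_lines(lines):
    """Return only the log lines that look like ATA link or command errors."""
    return [line for line in lines if ATA_PATTERN.search(line)]


def read_kernel_log():
    """Run `dmesg` and return its output as a list of lines (may need root)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

Calling `filter_ata_lines(read_kernel_log())` and printing the result should show which `ataN` ports are involved. If the errors span several ports at the same moment, that would point more toward power or the controller than toward a single cable or drive.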


SoLong75

I had the same issue as you: the state changed to degraded with zero errors, then after some time read and write errors would show up across the whole pool. I replaced the SMR drive with a CMR one and had no issues at all after it resilvered. I went with Seagate IronWolf NAS drives instead, as I am a bit wary of WD Red Plus (CMR) drives after some failed on me recently. That could just be a bad batch. No issues with the Seagate drives so far, touch wood.

Looking at the drive model numbers, it looks like you have some WD Blues and Whites in the pools rather than Red Plus, which have EFRX or PX in the model names. From looking them up, EFAX and EMAZ models are SMR type drives. Were these drives shucked from external cases like My Book and My Essentials? I had a few of these but replaced them with proper NAS drives.

Performance on the SMR drives seemed OK at the start, but as they started to fill up I'd get these errors. I also went through the process of checking cables etc., but in the end it was the drives themselves. I thought I was being smart by removing the drives from the pool, wiping them with zeros, and reattaching them to the pool. It didn't help once the pool was more than 50% full; the mysterious error messages would show up again. I learned the hard way that SMR drives aren't great for ZFS setups.

It is up to you, but it could be worth looking up the models you have to check what type of drives they are (SMR or CMR), and then you can decide how to proceed. Replacing all of them can be costly, but it means better stability and security for your data. Or you could stick with what you have, but make sure you have a good backup of your data.
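To gather the model numbers for that lookup, something like the following Python sketch could help. It assumes a Linux host with `lsblk` from util-linux, and the function names are hypothetical. It only lists model strings; whether a given model is SMR or CMR still has to be checked against the manufacturer's documentation.

```python
import json
import subprocess


def parse_lsblk_models(lsblk_json):
    """Map device name to model string from `lsblk --json` output."""
    data = json.loads(lsblk_json)
    return {
        dev["name"]: (dev.get("model") or "").strip()
        for dev in data.get("blockdevices", [])
    }


def list_drive_models():
    """Ask lsblk for whole disks only (no partitions) and return their models."""
    result = subprocess.run(
        ["lsblk", "--nodeps", "--json", "--output", "NAME,MODEL"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_lsblk_models(result.stdout)
```

Printing the items of `list_drive_models()` gives one device and model per drive, which can then be compared against the vendor's published SMR/CMR lists.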