Odd Storage Spaces Direct Repair Behaviour on Reboot

M.a.r.k.T. _ · Sep 27, 2018

2 nodes, both Windows Server 2016 (version 1607), fully patched.

The S2D is a single 20 TB pool (mirror) with a couple of virtual disks (6 TB and 4 TB).

This configuration works, except when rebooting a node (which takes a minute). According to "Taking a Storage Spaces Direct server offline for maintenance":

"When the server resumes, any new writes that happened while it was unavailable need to resync. This happens automatically. Using intelligent change tracking, it's not necessary for all data to be scanned or synchronized; only the changes. ... The BytesTotal shows how much storage needs to resync. The PercentComplete displays progress."

To reboot a node, we do 2 things:

1. Suspend the node

2. Enable-StorageMaintenanceMode on the StorageScaleUnit as per https://support.microsoft.com/en-us...-timeout-c00000b5-after-an-s2d-node-restart-o

Both commands work fine and take only a minute or so. We then reboot the server. However, after bringing the node back online, a repair commences. Whilst this is normal, it appears to be repairing a gargantuan amount of data! For a 1-minute reboot, for example, S2D claimed that it needed to repair 1.83 TB, and this is on a file server where the daily rate of change is a tiny fraction of this.

It would appear that either "it's not necessary for all data to be scanned or synchronized; only the changes" is incorrect and it actually resyncs a ton more than that, or maybe the document is right and that's what is supposed to happen, but for some reason S2D thinks that more has changed than actually has.

(The resync takes about 3 hours, so we know that it really is transferring 1.83 TB over the network.)
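As a rough sanity check on those figures, here is a small Python sketch. It assumes "TB" means decimal terabytes (10**12 bytes), that the node was offline for about 60 seconds, and that the resync ran for about 3 hours; the variable and function names are purely illustrative. With binary units the results would be roughly 10% higher.

```python
# Figures quoted in the post. TB is assumed to be decimal (10**12 bytes).
RESYNC_BYTES = 1.83e12
OUTAGE_SECONDS = 60           # node offline for roughly a minute
RESYNC_SECONDS = 3 * 3600     # resync takes about 3 hours


def rate_mb_per_s(num_bytes, seconds):
    """Average rate in megabytes per second (1 MB = 10**6 bytes)."""
    return num_bytes / seconds / 1e6


# Write rate the file server would have needed during the outage
# for 1.83 TB of genuinely new changes to exist.
implied_write_rate = rate_mb_per_s(RESYNC_BYTES, OUTAGE_SECONDS)

# Average network transfer rate if 1.83 TB really moves in 3 hours.
resync_rate = rate_mb_per_s(RESYNC_BYTES, RESYNC_SECONDS)

print(f"Write rate needed during the outage: {implied_write_rate:,.0f} MB/s")
print(f"Average resync rate: {resync_rate:,.0f} MB/s "
      f"({resync_rate * 8 / 1000:.2f} Gbit/s)")
```

Under these assumptions, the server would have needed to write at about 30,500 MB/s during the outage, which is not plausible for this file server. The resync itself averages about 169 MB/s (roughly 1.36 Gbit/s), which is consistent with a real 1.83 TB transfer taking around 3 hours.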
