For the latest best practice, visit https://learn.microsoft.com/en-us/azure-stack/hci/manage/maintain-servers


Taking a Storage Spaces Direct (S2D) server offline for patching or other maintenance removes not only that server's compute and memory from the cluster but also its portion of the storage pool. Care must be taken to keep your data safe and to return the cluster to production readiness quickly.


Visit Microsoft for the full description and latest information: https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/maintain-servers


Key Steps to reboot servers:

1. Open PowerShell as Admin.


2. Run Get-VirtualDisk and confirm that every virtual disk is healthy (HealthStatus is Healthy and OperationalStatus is OK) before taking the server offline.


3. Run Suspend-ClusterNode -Drain to move the VMs to another node.


4. Run the following command to cleanly put the node's storage into maintenance mode. Writes to this node's storage remain active until step 5 has been completed.

Get-StorageFaultDomain -Type StorageScaleUnit | Where-Object {$_.FriendlyName -eq "<Node Name>"} | Enable-StorageMaintenanceMode


5. Run the following command to verify that the node's disks are in maintenance mode. You should see “In Maintenance Mode, OK” under Operational Status for those disks.

foreach ($Node in (Get-ClusterNode).Name) { $Node; Get-StorageNode -Name "$Node*" | Get-PhysicalDisk -PhysicallyConnected }


6. Reboot the server.


7. Once you’re ready to put the server back into production, open PowerShell as Admin.


8. Run the following command to take the node's storage out of maintenance mode and put it back into production.

Get-StorageFaultDomain -Type StorageScaleUnit | Where-Object {$_.FriendlyName -eq "<Node Name>"} | Disable-StorageMaintenanceMode


9. A storage job will start in the background to repair and resync the data. To check its status, run Get-StorageJob (as Admin). If the command returns straight to the prompt with no output, no jobs are running. Do not reboot the next node until all of the jobs have been completed.


10. Once the storage jobs have completed, run Get-VirtualDisk to verify that the virtual disks are healthy. Wait until steps 9 and 10 have been completed before live migrating VMs back to this node, because storage jobs consume system resources and can affect the response time of your applications.


11. Run Resume-ClusterNode -Failback Immediate to put the cluster node back into production to handle VM workloads.
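
The health checks in steps 2, 9, and 10 can be scripted instead of read by eye. The following is a minimal PowerShell sketch and not part of the Microsoft or DataON procedure. The function names Test-VirtualDiskHealthy and Wait-StorageJobCompletion, their parameters, and the default polling interval and timeout are hypothetical. The sketch assumes that a healthy virtual disk reports HealthStatus as Healthy and OperationalStatus as OK, and that a finished storage job either disappears from Get-StorageJob or reports a JobState of Completed.

```powershell
# Hypothetical helper: returns $true only when every supplied virtual disk
# reports HealthStatus 'Healthy' and OperationalStatus 'OK' (assumed values).
function Test-VirtualDiskHealthy {
    param(
        [Parameter(Mandatory)]
        [object[]]$VirtualDisk
    )
    foreach ($disk in $VirtualDisk) {
        # OperationalStatus may be a single value or a list, so join it first.
        $opStatus = @($disk.OperationalStatus) -join ', '
        if ("$($disk.HealthStatus)" -ne 'Healthy' -or $opStatus -ne 'OK') {
            return $false
        }
    }
    return $true
}

# Hypothetical helper: polls until no unfinished storage jobs remain.
# Returns $true when the jobs are done, $false if the timeout expires first.
# -GetJob defaults to Get-StorageJob and can be swapped for a stub in testing.
function Wait-StorageJobCompletion {
    param(
        [scriptblock]$GetJob = { Get-StorageJob },
        [int]$PollSeconds = 30,        # assumed polling interval
        [int]$TimeoutSeconds = 14400   # assumed upper bound (4 hours)
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        # Any job whose state is not 'Completed' counts as still pending.
        $pending = @(& $GetJob | Where-Object { "$($_.JobState)" -ne 'Completed' })
        if ($pending.Count -eq 0) { return $true }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $PollSeconds
    }
}
```

For example, before step 3 you could run Test-VirtualDiskHealthy -VirtualDisk (Get-VirtualDisk) and stop if it returns False. After step 8, run Wait-StorageJobCompletion and then the same virtual disk check, and continue to step 11 or to the next node only when both return True.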


www.dataonstorage.com | 1-888-725-8588 | sales@dataonstorage.com 

Copyright © 2020 DataON. All Rights Reserved. Specifications may change without notice. DataON is not responsible for photographic or typographical errors. DataON, the DataON logo, MUST, and the MUST logo are trademarks of DataON in the United States and certain other countries. Other company, product, or services names may be trademarks or service marks of others.

09/23