NetApp
ONTAP Storage Failover and Giveback Runbook
Scope
This article covers planned storage failover testing or maintenance in which one node of an HA pair is taken over and later given back. NetApp documents the key commands as storage failover takeover and storage failover giveback.
Prechecks
system health status show
storage failover show
storage failover show-giveback
cluster show
network interface show -role data
volume show -state !online
Confirm:
| Check | Requirement |
|---|---|
| SFO state | Enabled and possible |
| Partner health | Partner node healthy enough to serve storage |
| Data LIFs | Failover targets valid |
| Workloads | Owners aware of possible pause |
| SnapMirror | No critical unexpected lag or unhealthy state |
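The checklist above can be tracked as a simple go/no-go gate. The sketch below is a minimal illustration, not a NetApp tool: the check names and the `blocking_checks` helper are hypothetical, and the operator enters each result by hand after reviewing the precheck command output.

```python
# Hypothetical go/no-go gate for the precheck table.
# Results are recorded manually by the operator; nothing here talks to ONTAP.
REQUIRED_CHECKS = (
    "sfo_enabled_and_possible",
    "partner_healthy",
    "data_lif_failover_targets_valid",
    "workload_owners_notified",
    "snapmirror_healthy",
)


def blocking_checks(results):
    """Return the required checks that are missing or not explicitly True."""
    return [name for name in REQUIRED_CHECKS if results.get(name) is not True]


if __name__ == "__main__":
    results = {name: True for name in REQUIRED_CHECKS}
    results["snapmirror_healthy"] = False  # example: lag still under review
    blockers = blocking_checks(results)
    print("GO" if not blockers else "NO-GO: " + ", ".join(blockers))
```

Treating a missing entry as a blocker means a skipped check stops the change rather than passing silently.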
CLI Takeover Process
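Use normal takeover for planned maintenance unless support directs otherwise. The -ofnode parameter names the node being taken over (node01 in this example); its HA partner (node02 here) then serves its storage: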
storage failover takeover -ofnode node01
Monitor:
storage failover show
storage aggregate show
volume show -node node02
Complete maintenance. Then check giveback readiness:
storage failover show-giveback
Give back:
storage failover giveback -ofnode node01
Monitor until complete:
storage failover show
storage failover show-giveback
system health status show
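"Monitor until complete" can be scripted as a bounded polling loop instead of rerunning the commands by hand. The sketch below is generic and makes no ONTAP calls: `check` is a hypothetical callable you supply (for example, one that inspects the output of storage failover show-giveback), and the default timeout and interval are placeholder assumptions, not NetApp guidance.

```python
import time


def wait_until(check, timeout_s=1800.0, interval_s=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout_s elapses.

    Returns True if check() succeeded, False if the deadline passed.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A False return is a signal to stop and investigate, not to retry automatically or force the operation.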
REST API Validation Process
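Use REST to collect evidence before and after the change. The -k flag disables TLS certificate verification, so omit it where the cluster certificate is trusted: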
curl -k -u admin:'<password>' \
"https://cluster.example.com/api/cluster/nodes?fields=name,state,ha,uptime"
curl -k -u admin:'<password>' \
"https://cluster.example.com/api/storage/aggregates?fields=name,node,state,space"
curl -k -u admin:'<password>' \
"https://cluster.example.com/api/network/ip/interfaces?fields=name,location,state,enabled"
For takeover and giveback execution, use the documented CLI or System Manager workflow unless your automation standard has validated the matching REST operations for your ONTAP release and platform.
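The three read-only evidence calls above can be wrapped in a small script so that the pre-change and post-change snapshots are collected identically and timestamped. This is a sketch under stated assumptions: the endpoints and field lists are copied from the curl examples, while `build_request`, `collect_evidence`, and the `verify_tls` switch are hypothetical names introduced here. It only performs GET requests and does not execute takeover or giveback.

```python
import base64
import json
import ssl
import urllib.request
from datetime import datetime, timezone

# Endpoints and field lists copied from the curl examples in this section.
EVIDENCE_ENDPOINTS = {
    "nodes": "/api/cluster/nodes?fields=name,state,ha,uptime",
    "aggregates": "/api/storage/aggregates?fields=name,node,state,space",
    "interfaces": "/api/network/ip/interfaces?fields=name,location,state,enabled",
}


def build_request(host, path, user, password):
    """Build a GET request with HTTP basic auth (the equivalent of curl -u)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(f"https://{host}{path}")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/json")
    return req


def collect_evidence(host, user, password, verify_tls=True):
    """Fetch every evidence endpoint; return parsed JSON plus a UTC timestamp."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent to curl -k: skips certificate checks, so use with care.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    evidence = {"collected_at": datetime.now(timezone.utc).isoformat()}
    for label, path in EVIDENCE_ENDPOINTS.items():
        req = build_request(host, path, user, password)
        with urllib.request.urlopen(req, context=context, timeout=30) as resp:
            evidence[label] = json.load(resp)
    return evidence


# Example use (hypothetical host and file name):
#   pre = collect_evidence("cluster.example.com", "admin", "<password>")
#   with open("pre_takeover.json", "w") as fh:
#       json.dump(pre, fh, indent=2)
```

Saving one file before takeover and one after giveback gives a record that can be compared side by side during the change review.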
Best Practices
- Avoid -force and veto overrides unless support or a documented emergency procedure requires them.
- Do not start giveback until the partner is ready and EMS reasons for vetoes are understood.
- Watch client-facing protocols, not just aggregate ownership.
- Capture the exact start and end time for both takeover and giveback.
- Keep node management access available during the whole change.
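One way to watch client-facing paths rather than only aggregate ownership is to list the LIFs that are not on their home port after giveback. The sketch below is a hypothetical helper that works on an already-fetched response from the /api/network/ip/interfaces call. It assumes the response carries a `records` list and that each record's `location` object includes `is_home`, `node.name`, and `home_node.name`; verify those field names against the REST documentation for your ONTAP release. The sample data is invented for illustration.

```python
def lifs_not_home(interfaces_response):
    """Return (lif, current_node, home_node) for LIFs reporting is_home == False.

    Field names are assumptions about the REST response shape;
    confirm them for your ONTAP release before relying on this.
    """
    away = []
    for record in interfaces_response.get("records", []):
        location = record.get("location", {})
        if location.get("is_home") is False:
            away.append((
                record.get("name"),
                location.get("node", {}).get("name"),
                location.get("home_node", {}).get("name"),
            ))
    return away


if __name__ == "__main__":
    # Invented sample shaped like the assumed response.
    sample = {"records": [
        {"name": "lif_nfs_01", "location": {
            "is_home": False,
            "node": {"name": "node02"},
            "home_node": {"name": "node01"}}},
        {"name": "lif_nfs_02", "location": {
            "is_home": True,
            "node": {"name": "node02"},
            "home_node": {"name": "node02"}}},
    ]}
    for lif, current, home in lifs_not_home(sample):
        print(f"{lif}: on {current}, home is {home}")
```

An empty result only says that every LIF reports being home; it does not replace checking that clients can actually reach their data.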
Backout
If giveback fails, do not repeatedly force it. Capture:
storage failover show-giveback
event log show -severity ERROR
system health alert show
Resolve the veto or health issue, then retry giveback. Use override options only when the operational risk is understood and approved.