Automate NetApp Volume Provisioning with Ansible
Why Automate Volume Provisioning
Manual volume creation through ONTAP System Manager or CLI is acceptable for one-off tasks, but breaks down at scale. When your storage team provisions 10-50 volumes per week, manual workflows introduce risk:
- Naming inconsistency: Different admins apply different conventions, making volumes hard to search or audit.
- Policy drift: Snapshot policies, export rules, and QoS settings vary based on who created the volume.
- No audit trail: CLI history logs don't capture the business context or requestor information.
- Slow delivery: Ticket-to-delivery time averages 2-4 hours when humans are in the loop.
Ansible automation solves these problems by codifying your standards into reusable playbooks. Once deployed, provisioning becomes a self-service workflow that maintains consistency, logs every change, and completes in under 5 minutes.
Prerequisites and Environment Setup
Before running Ansible against ONTAP, ensure these components are in place:
Software Requirements
- Ansible 2.14+ on your control host (requires Python 3.9 or later)
- NetApp Ansible Collection: Install with ansible-galaxy collection install netapp.ontap
- NetApp ONTAP Python library: Install with pip install netapp-lib
- HTTPS access from your Ansible control node to the ONTAP cluster management LIF
ONTAP Configuration
Create a dedicated service account with least-privilege access. Avoid using the admin account in automation.
Here's an example role definition that grants only the required permissions:
# Create custom role for Ansible automation
security login role create \
    -role ansible_provisioner \
    -cmddirname "volume create" \
    -access all

security login role create \
    -role ansible_provisioner \
    -cmddirname "volume modify" \
    -access all

security login role create \
    -role ansible_provisioner \
    -cmddirname "volume show" \
    -access readonly

# Read access to aggregates is required for the capacity pre-check used later
security login role create \
    -role ansible_provisioner \
    -cmddirname "storage aggregate show" \
    -access readonly

# Create service account. The ZAPI-based modules (netapp-lib) authenticate
# via the ontapi application; REST-based modules use http, so grant both.
security login create \
    -user-or-group-name ansible_svc \
    -application ontapi \
    -authentication-method password \
    -role ansible_provisioner

security login create \
    -user-or-group-name ansible_svc \
    -application http \
    -authentication-method password \
    -role ansible_provisioner
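With the role and account in place, it is worth confirming API access before writing the full playbook. Here is a minimal smoke test; it assumes the cluster hostname used throughout this guide and the vaulted password variable introduced below, and cluster_identity_info is one of the module's gather subsets:

---
- name: Smoke-test ONTAP API access
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Pull cluster identity to confirm connectivity and credentials
      netapp.ontap.na_ontap_info:
        hostname: cluster1.example.com  # assumption: your cluster management LIF
        username: ansible_svc
        password: "{{ vault_ontap_password }}"
        https: true
        validate_certs: false  # use true with a trusted certificate
        gather_subset:
          - cluster_identity_info
      register: smoke

    - name: Show cluster identity
      debug:
        var: smoke.ontap_info.cluster_identity_info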
Define Your Standards
Document these defaults before automating. Inconsistent standards create more problems than they solve. A sketch that codifies them in a shared vars file follows this list:
- Naming convention: {app}_{env}_{purpose} (e.g., oracle_prod_data)
- Aggregate selection: Map workload types to specific aggregates (SSD vs. HDD)
- Snapshot policy: Default to hourly for production, daily for dev/test
- Tiering policy: auto for cold data, none for hot workloads
- Space guarantee: none for thin provisioning (recommended)
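A natural way to keep these defaults out of individual playbooks is a shared vars file. The sketch below is illustrative only; the file path, map names, and workload categories are assumptions to adapt to your environment:

# vars/storage_standards.yml (hypothetical path)
naming_pattern: "^[a-z0-9]+_(prod|stage|dev|test)_[a-z0-9_]+$"

aggregate_map:            # workload type -> aggregate
  database: aggr1_ssd
  fileshare: aggr2_hdd

snapshot_policy_map:      # environment -> snapshot policy
  prod: hourly
  dev: daily
  test: daily

tiering_policy_default: auto
space_guarantee_default: none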
Production-Ready Playbook
This playbook goes beyond the basic module call. It includes validation, logging, and error handling that you need in production environments.
---
- name: NetApp ONTAP Volume Provisioning
  hosts: localhost
  gather_facts: false

  vars:
    ontap_hostname: "cluster1.example.com"
    ontap_username: "ansible_svc"
    ontap_password: "{{ vault_ontap_password }}"  # Use ansible-vault in prod
    svm: "svm_data01"
    # app_name and env are supplied at runtime, e.g. -e "app_name=oracle env=prod"
    # ('environment' is a reserved keyword in Ansible, so the variable is named env)
    vol_name: "{{ app_name }}_{{ env }}_vol"
    size_gb: 500
    aggregate: "aggr1_ssd"
    snapshot_policy: "hourly"
    tiering_policy: "auto"

  tasks:
    - name: Validate volume naming convention
      assert:
        that:
          - vol_name | length <= 64
          - vol_name is match('^[a-z0-9_]+$')
        fail_msg: "Volume name must be lowercase alphanumeric with underscores, max 64 chars"

    - name: Check aggregate free space
      netapp.ontap.na_ontap_info:
        hostname: "{{ ontap_hostname }}"
        username: "{{ ontap_username }}"
        password: "{{ ontap_password }}"
        https: true
        validate_certs: false  # switch to true once the cluster has a trusted certificate
        gather_subset:
          - aggregate_info
      register: ontap_info

    - name: Verify sufficient capacity (requested size plus 20% headroom)
      assert:
        that:
          - ontap_info.ontap_info.aggregate_info[aggregate].aggr_space_attributes.size_available | int > (size_gb * 1024 * 1024 * 1024 * 1.2)
        fail_msg: "Insufficient space on aggregate {{ aggregate }}. Provision failed."

    - name: Create volume with policy defaults
      netapp.ontap.na_ontap_volume:
        state: present
        name: "{{ vol_name }}"
        vserver: "{{ svm }}"
        aggregate_name: "{{ aggregate }}"
        size: "{{ size_gb }}"
        size_unit: gb
        space_guarantee: none
        percent_snapshot_space: 5
        snapshot_policy: "{{ snapshot_policy }}"
        tiering_policy: "{{ tiering_policy }}"
        hostname: "{{ ontap_hostname }}"
        username: "{{ ontap_username }}"
        password: "{{ ontap_password }}"
        https: true
        validate_certs: false
      register: volume_result

    - name: Log provisioning event
      lineinfile:
        path: /var/log/ansible/storage_provisioning.log
        # now() works with gather_facts: false; ansible_date_time would require fact gathering
        line: "{{ now(utc=true, fmt='%Y-%m-%dT%H:%M:%SZ') }} | CREATED | {{ vol_name }} | {{ size_gb }}GB | {{ svm }} | {{ aggregate }}"
        create: yes

    - name: Send notification
      debug:
        msg: "Volume {{ vol_name }} successfully provisioned on {{ svm }} ({{ size_gb }}GB)"
Essential Guardrails and Validation
These guardrails prevent common mistakes that lead to production incidents:
1. Pre-Flight Capacity Checks
Always verify aggregate capacity before provisioning. The playbook above checks that available space exceeds the requested size by 20% to account for metadata and growth; for the 500GB default, that means at least 600GB free on the aggregate. Failing fast prevents partial provisioning failures.
2. Naming Convention Enforcement
The regex validation (^[a-z0-9_]+$) ensures volume names are CLI-safe and sortable. Reject uppercase, spaces, and special characters at automation time rather than discovering issues during mount operations.
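If you also want to enforce the full {app}_{env}_{purpose} structure from the standards above, not just the character set, a stricter pattern can be asserted. A sketch, where the allowed environment tokens are assumptions to adapt:

- name: Enforce three-token naming structure
  assert:
    that:
      - vol_name is match('^[a-z0-9]+_(prod|stage|dev|test)_[a-z0-9_]+$')
    fail_msg: "Volume name must follow {app}_{env}_{purpose}, e.g. oracle_prod_data"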
3. Change Logging
Append every provisioning event to a timestamped log file. When troubleshooting performance issues weeks later, this log correlates volume creation timing with application deployments. Integrate with Splunk or ELK for centralized visibility.
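The same event can be shipped to a central platform at provisioning time. Below is a sketch using Splunk's HTTP Event Collector; the endpoint URL and token variable are placeholders for your environment:

- name: Forward provisioning event to Splunk HEC (hypothetical endpoint)
  uri:
    url: "https://splunk.example.com:8088/services/collector/event"
    method: POST
    headers:
      Authorization: "Splunk {{ splunk_hec_token }}"  # assumed vaulted variable
    body_format: json
    body:
      event:
        action: created
        volume: "{{ vol_name }}"
        size_gb: "{{ size_gb }}"
        svm: "{{ svm }}"
        aggregate: "{{ aggregate }}"
    validate_certs: false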
4. Credential Management
Never hardcode passwords in playbooks. Use Ansible Vault to encrypt sensitive variables:
# Create encrypted vault file
ansible-vault create vars/ontap_secrets.yml

# Content of vault file:
vault_ontap_password: "SecurePassword123!"

# Reference in playbook
vars_files:
  - vars/ontap_secrets.yml
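Related to credential hygiene: instead of repeating the connection parameters in every task, recent releases of the netapp.ontap collection expose a module_defaults action group. Assuming your collection version supports it, the credentials can be declared once at play level:

module_defaults:
  group/netapp.ontap.netapp_ontap:
    hostname: "{{ ontap_hostname }}"
    username: "{{ ontap_username }}"
    password: "{{ ontap_password }}"
    https: true
    validate_certs: false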
5. Idempotency Testing
Run the playbook twice against the same volume name. The second execution should report ok rather than changed. This idempotent behavior prevents accidental volume overwrites and makes automation safe to retry.
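The check can also be encoded in the playbook itself. A minimal sketch that re-applies the volume definition and asserts the module reported no change (it assumes the same variables as the main playbook):

- name: Re-apply volume definition (should be a no-op)
  netapp.ontap.na_ontap_volume:
    state: present
    name: "{{ vol_name }}"
    vserver: "{{ svm }}"
    aggregate_name: "{{ aggregate }}"
    size: "{{ size_gb }}"
    size_unit: gb
    hostname: "{{ ontap_hostname }}"
    username: "{{ ontap_username }}"
    password: "{{ ontap_password }}"
    https: true
    validate_certs: false
  register: recheck

- name: Assert the second run reported no change
  assert:
    that:
      - recheck is not changed
    fail_msg: "Volume definition is not idempotent; check for drifting parameters"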
Troubleshooting Common Issues
Error: "Insufficient privileges for user"
Cause: Service account lacks required role permissions.
Solution: Verify role assignment with security login show -user-or-group-name ansible_svc.
Grant missing command directory access using security login role create.
Error: "Aggregate does not exist"
Cause: Typo in aggregate name or aggregate not accessible from SVM.
Solution: List available aggregates for the SVM with vserver show-aggregates -vserver svm_data01.
Update playbook with correct aggregate name.
Playbook Hangs on ONTAP Connection
Cause: Network firewall blocking HTTPS (port 443) to cluster management LIF.
Solution: Test connectivity with curl -k https://cluster1.example.com/api/cluster.
If timeout occurs, work with network team to allow Ansible control host to reach ONTAP API.
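The same test can run as a preflight task so that a network problem fails fast with a clear message. A sketch, assuming ONTAP 9.6 or later where the /api/cluster REST endpoint exists:

- name: Preflight - confirm the ONTAP REST API answers
  uri:
    url: "https://{{ ontap_hostname }}/api/cluster"
    method: GET
    user: "{{ ontap_username }}"
    password: "{{ ontap_password }}"
    force_basic_auth: true
    validate_certs: false
    timeout: 10
  register: api_check
  failed_when: api_check.status != 200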
Next Iteration: Full Stack Automation
This playbook handles the core volume provisioning, but production workflows require additional steps:
- NFS Export Policy: Use netapp.ontap.na_ontap_export_policy_rule to grant host access (sketched at the end of this section)
- SMB Share Creation: Integrate netapp.ontap.na_ontap_cifs for Windows workloads
- LUN Provisioning: Extend the playbook with iSCSI/FC LUN creation for block storage
- Ticketing Integration: POST completion status to ServiceNow or Jira via REST API
- DNS Registration: Automatically create A records for NFS mount points
Combine these modules into a unified "request-to-delivery" playbook that transforms a storage ticket into a fully configured, production-ready volume in under 5 minutes with zero human intervention.
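As a starting point for the NFS export step, here is a sketch using netapp.ontap.na_ontap_export_policy_rule; the policy name, client subnet, and rule index are placeholders, not values the module requires:

- name: Allow the app subnet read/write access on the volume's export policy
  netapp.ontap.na_ontap_export_policy_rule:
    state: present
    vserver: "{{ svm }}"
    name: "{{ vol_name }}_policy"  # assumed per-volume policy name
    client_match: "10.20.30.0/24"  # placeholder application subnet
    protocol: nfs
    ro_rule: sys
    rw_rule: sys
    super_user_security: none
    rule_index: 1
    hostname: "{{ ontap_hostname }}"
    username: "{{ ontap_username }}"
    password: "{{ ontap_password }}"
    https: true
    validate_certs: false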