
Automate NetApp Volume Provisioning with Ansible

Why Automate Volume Provisioning

Manual volume creation through ONTAP System Manager or the CLI is acceptable for one-off tasks, but it breaks down at scale. When your storage team provisions 10-50 volumes per week, manual workflows introduce risk: settings drift from your standards, naming conventions slip, and there is no audit trail of what was created, when, or why.

Ansible automation solves these problems by codifying your standards into reusable playbooks. Once deployed, provisioning becomes a self-service workflow that maintains consistency, logs every change, and completes in under 5 minutes.

Prerequisites and Environment Setup

Before running Ansible against ONTAP, ensure these components are in place:

Software Requirements
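
A typical control node needs the netapp.ontap collection and its Python dependency. A minimal setup looks like this (exact versions vary by environment; the commands assume a standard Ansible and pip installation):

# Install the NetApp ONTAP collection from Ansible Galaxy
ansible-galaxy collection install netapp.ontap

# Python library used by the ZAPI-based modules such as na_ontap_info
pip install netapp-lib

# Confirm the collection is visible to Ansible
ansible-galaxy collection list | grep netapp.ontap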

ONTAP Configuration

Create a dedicated service account with least-privilege access. Avoid using the admin account in automation. Here's an example role definition that grants only the required permissions:

# Create custom role for Ansible automation
security login role create \
  -role ansible_provisioner \
  -cmddirname "volume create" \
  -access all

security login role create \
  -role ansible_provisioner \
  -cmddirname "volume modify" \
  -access all

security login role create \
  -role ansible_provisioner \
  -cmddirname "volume show" \
  -access readonly

# Create service account
security login create \
  -user-or-group-name ansible_svc \
  -application http \
  -authentication-method password \
  -role ansible_provisioner
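
Before wiring the account into automation, it is worth confirming the role contains exactly the entries you expect:

# Verify the role's command directory entries
security login role show -role ansible_provisioner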

Define Your Standards

Document these defaults before automating: naming convention, default volume size, target aggregate, snapshot policy, snapshot reserve, and tiering policy. Inconsistent standards create more problems than they solve.
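
One way to keep these standards in a single place is a shared vars file that playbooks import; the file name and values below are illustrative, not part of the playbook that follows:

# group_vars/storage_defaults.yml -- illustrative defaults file
default_size_gb: 500
default_aggregate: "aggr1_ssd"
default_snapshot_policy: "hourly"
default_tiering_policy: "auto"
default_percent_snapshot_space: 5
volume_name_regex: "^[a-z0-9_]+$"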

Production-Ready Playbook

This playbook goes beyond a basic module call: it adds the input validation, pre-flight capacity checks, and change logging you need in a production environment.

---
- name: NetApp ONTAP Volume Provisioning
  hosts: localhost
  gather_facts: false
  
  vars:
    ontap_hostname: "cluster1.example.com"
    ontap_username: "ansible_svc"
    ontap_password: "{{ vault_ontap_password }}"  # Use ansible-vault in prod
    
    svm: "svm_data01"
    # app_name and environment are supplied at run time, e.g. -e "app_name=billing environment=prod"
    # ("environment" is also an Ansible play keyword; rename it, e.g. to env, if it triggers warnings)
    vol_name: "{{ app_name }}_{{ environment }}_vol"
    size_gb: 500
    aggregate: "aggr1_ssd"
    snapshot_policy: "hourly"
    tiering_policy: "auto"
    
  tasks:
    - name: Validate volume naming convention
      assert:
        that:
          - vol_name | length <= 64
          - vol_name is match('^[a-z0-9_]+$')
        fail_msg: "Volume name must be lowercase alphanumeric with underscores, max 64 chars"
    
    - name: Check aggregate free space
      netapp.ontap.na_ontap_info:
        hostname: "{{ ontap_hostname }}"
        username: "{{ ontap_username }}"
        password: "{{ ontap_password }}"
        https: true
        validate_certs: false  # convenient for labs; use true with a trusted certificate in production
        gather_subset:
          - aggregate_info
      register: ontap_info
    
    - name: Verify sufficient capacity
      assert:
        that:
          - ontap_info.ontap_info.aggregate_info[aggregate].aggr_space_attributes.size_available | int > (size_gb * 1024 * 1024 * 1024 * 1.2)
        fail_msg: "Insufficient space on aggregate {{ aggregate }}. Provision failed."
    
    - name: Create volume with policy defaults
      netapp.ontap.na_ontap_volume:
        state: present
        name: "{{ vol_name }}"
        vserver: "{{ svm }}"
        aggregate_name: "{{ aggregate }}"
        size: "{{ size_gb }}"
        size_unit: gb
        space_guarantee: none
        percent_snapshot_space: 5
        snapshot_policy: "{{ snapshot_policy }}"
        tiering_policy: "{{ tiering_policy }}"
        hostname: "{{ ontap_hostname }}"
        username: "{{ ontap_username }}"
        password: "{{ ontap_password }}"
        https: true
        validate_certs: false
      register: volume_result
    
    - name: Log provisioning event
      lineinfile:
        path: /var/log/ansible/storage_provisioning.log
        line: "{{ ansible_date_time.iso8601 }} | CREATED | {{ vol_name }} | {{ size_gb }}GB | {{ svm }} | {{ aggregate }}"
        create: yes
    
    - name: Send notification
      # Placeholder: replace debug with a real notification mechanism (mail, chat webhook, ticket update)
      debug:
        msg: "Volume {{ vol_name }} successfully provisioned on {{ svm }} ({{ size_gb }}GB)"

Essential Guardrails and Validation

These guardrails prevent common mistakes that lead to production incidents:

1. Pre-Flight Capacity Checks

Always verify aggregate capacity before provisioning. The playbook above checks that available space exceeds the requested size by 20% to account for metadata and growth; for the default 500 GB request, the aggregate must therefore have at least 600 GB free. Failing fast here prevents half-finished provisioning runs.

2. Naming Convention Enforcement

The regex validation (^[a-z0-9_]+$) ensures volume names are CLI-safe and sort predictably. For example, billing_prod_vol passes, while Billing-Prod Vol 01 is rejected for its uppercase letters, hyphen, and spaces. Reject such names at automation time rather than discovering the problem during mount operations.

3. Change Logging

Append every provisioning event to a timestamped log file. When troubleshooting performance issues weeks later, this log correlates volume creation timing with application deployments. Integrate with Splunk or ELK for centralized visibility.
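
If you also want the event in a central platform, one option is to post it from the same playbook with the uri module. This is a sketch only; the Splunk HEC URL and token variable are assumptions:

    - name: Forward provisioning event to Splunk HEC (illustrative)
      uri:
        url: "https://splunk.example.com:8088/services/collector/event"
        method: POST
        headers:
          Authorization: "Splunk {{ vault_splunk_hec_token }}"
        body_format: json
        body:
          event:
            action: volume_created
            volume: "{{ vol_name }}"
            size_gb: "{{ size_gb }}"
            svm: "{{ svm }}"
        validate_certs: false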

4. Credential Management

Never hardcode passwords in playbooks. Use Ansible Vault to encrypt sensitive variables:

# Create the encrypted vault file (opens your editor)
ansible-vault create vars/ontap_secrets.yml

# Content of the vault file:
vault_ontap_password: "SecurePassword123!"

# Reference it in the playbook:
vars_files:
  - vars/ontap_secrets.yml

# Supply the vault password at run time with --ask-vault-pass or --vault-password-file

5. Idempotency Testing

Run the playbook twice against the same volume name. The second execution should report ok rather than changed. This idempotent behavior prevents accidental volume overwrites and makes automation safe to retry.

Troubleshooting Common Issues

Error: "Insufficient privileges for user"

Cause: Service account lacks required role permissions.
Solution: Verify the role assignment with security login show -user-or-group-name ansible_svc. Grant any missing command directory access with additional security login role create entries.

Error: "Aggregate does not exist"

Cause: Typo in aggregate name or aggregate not accessible from SVM.
Solution: List the aggregates available to the SVM with vserver show-aggregates -vserver svm_data01, then update the playbook with the correct aggregate name.

Playbook Hangs on ONTAP Connection

Cause: Network firewall blocking HTTPS (port 443) to cluster management LIF.
Solution: Test connectivity with curl -k https://cluster1.example.com/api/cluster. If the request times out, work with the network team to allow the Ansible control host to reach the ONTAP API on the cluster management LIF.

Next Iteration: Full Stack Automation

This playbook handles core volume provisioning, but production workflows usually require additional steps, such as exporting the volume over NFS or SMB, mounting it at a junction path, applying quotas, and adding SnapMirror protection. The netapp.ontap collection includes modules for each of these.

Combine these modules into a unified "request-to-delivery" playbook that transforms a storage ticket into a fully configured, production-ready volume in under 5 minutes with zero human intervention.
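
As a sketch of one such follow-on step, the tasks below export the new volume for NFS clients using the collection's export policy modules; the policy name and client subnet are illustrative assumptions:

    # Sketch only -- policy name and client subnet are assumptions
    - name: Create export policy for the new volume
      netapp.ontap.na_ontap_export_policy:
        state: present
        name: "{{ vol_name }}_export"
        vserver: "{{ svm }}"
        hostname: "{{ ontap_hostname }}"
        username: "{{ ontap_username }}"
        password: "{{ ontap_password }}"
        https: true
        validate_certs: false

    - name: Allow NFS access from the application subnet
      netapp.ontap.na_ontap_export_policy_rule:
        state: present
        name: "{{ vol_name }}_export"
        vserver: "{{ svm }}"
        client_match: "10.0.10.0/24"
        ro_rule: sys
        rw_rule: sys
        protocol: nfs
        hostname: "{{ ontap_hostname }}"
        username: "{{ ontap_username }}"
        password: "{{ ontap_password }}"
        https: true
        validate_certs: false

Remember to attach the policy to the volume as well (the na_ontap_volume module's export_policy option) before clients can mount it.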


💬 Discussion

Have questions or feedback about this guide? Found a better approach?

Join the discussion on GitHub or contact us directly.