Snapshot Policy Strategy for Pure FlashArray
Define Business Objectives First
Snapshot policies should never be designed in isolation from business requirements. Before opening the Pure UI or running a CLI script, document these three critical inputs:
- Recovery Point Objective (RPO): Maximum tolerable data loss measured in time. A 15-minute RPO means you can afford to lose up to 15 minutes of transactions.
- Recovery Time Objective (RTO): How quickly you need to restore from snapshot. Snapshots enable faster recovery than tape or replication in most cases.
- Retention Requirements: Compliance, audit, or operational needs that dictate how long snapshots must be retained (e.g., 30 days for general operations, 7 years for financial records).
Map these requirements to service tiers rather than creating per-application policies. Typical organizations need 3-5 tiers (e.g., Critical, Standard, Low, Archive). Each tier gets a reusable snapshot policy that standardizes RPO/RTO across similar workloads.
Service Tier Snapshot Matrix
Here's a production-tested four-tier model that balances protection with capacity efficiency:
Tier 1 - Mission Critical (RPO: 15 min)
Frequency: Every 15 minutes
Retention: 96 snapshots (24 hours)
Daily consolidation: 1 snapshot at midnight, keep 7 days
Weekly: 1 snapshot Sunday midnight, keep 4 weeks
Use case: Transactional databases, ERP systems, production VMs
Tier 2 - Production Standard (RPO: 1 hour)
Frequency: Every hour
Retention: 48 snapshots (2 days)
Daily consolidation: 1 snapshot at midnight, keep 14 days
Weekly: 1 snapshot Sunday midnight, keep 8 weeks
Use case: Application servers, file shares, general production
Tier 3 - Dev/Test (RPO: 24 hours)
Frequency: Daily at midnight
Retention: 7 snapshots (1 week)
No long-term retention
Use case: Development environments, test instances, temp data
Tier 4 - Archive/Compliance (RPO: 24 hours)
Frequency: Daily at midnight
Retention: 30 daily snapshots
Monthly: 1 snapshot on 1st of month, keep 12 months (or longer based on compliance)
Use case: Audit logs, financial records, regulatory data
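This matrix translates directly into policy-as-code. Here's a minimal sketch encoding the tiers as a Python dictionary (the key names are illustrative, not a Pure API) that the automation examples later in this guide could consume instead of hand-typed values:

# Tier definitions as data; values mirror the matrix above (all in seconds)
TIER_POLICIES = {
    "tier1-critical": {"snap_every_sec": 900,   "keep_all_sec": 86400,
                       "daily_keep_days": 7,    "weekly_keep_weeks": 4},
    "tier2-standard": {"snap_every_sec": 3600,  "keep_all_sec": 172800,
                       "daily_keep_days": 14,   "weekly_keep_weeks": 8},
    "tier3-devtest":  {"snap_every_sec": 86400, "keep_all_sec": 604800},
    "tier4-archive":  {"snap_every_sec": 86400, "keep_all_sec": 2592000,
                       "monthly_keep_months": 12},
}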
Capacity Planning Considerations
Pure Storage snapshots are space-efficient due to redirect-on-write architecture—only changed blocks consume capacity. However, high-change workloads (databases with heavy write activity) can accumulate significant snapshot overhead.
Rule of thumb: Estimate 10-20% additional capacity for snapshot overhead on active workloads. Monitor actual consumed space using purearray list --space and adjust retention if snapshot growth exceeds projections.
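To sanity-check that rule of thumb against a specific workload, multiply the daily change rate by the retention window. A quick illustration (the figures are assumptions; substitute your own measured change rates):

def snapshot_overhead_gb(volume_gb, daily_change_pct, retention_days):
    """Estimate capacity retained by snapshots, before data reduction."""
    return volume_gb * (daily_change_pct / 100) * retention_days

# A 2 TB database changing 3% of its blocks per day, kept for 7 days:
print(snapshot_overhead_gb(2048, 3, 7))  # ~430 GB, i.e. ~21% overhead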
Implementation with Pure CLI
Pure FlashArray snapshot scheduling uses protection groups and schedules. Here's how to implement the Tier 1 policy from the matrix above:
# Create the Tier 1 protection group and add its member volumes
purepgroup create --vollist vol-db-prod-01,vol-db-prod-02 protgrp-tier1-critical

# 15-minute snapshots, retained for 24 hours
# (flag syntax per the Purity//FA CLI; confirm against your Purity version)
purepgroup schedule --snap-frequency 15m protgrp-tier1-critical
purepgroup retain --all-for 1d protgrp-tier1-critical
purepgroup enable --snap protgrp-tier1-critical

# Daily consolidation snapshot at midnight, kept 7 days. Each protection
# group carries a single snapshot schedule, so each cadence gets its own group.
purepgroup create --vollist vol-db-prod-01,vol-db-prod-02 protgrp-tier1-critical-daily
purepgroup schedule --snap-frequency 1d --snap-at 12am protgrp-tier1-critical-daily
purepgroup retain --all-for 7d protgrp-tier1-critical-daily
purepgroup enable --snap protgrp-tier1-critical-daily

# Weekly snapshot, kept 4 weeks (alignment to Sunday depends on when the
# schedule is enabled; verify the first run lands where you expect)
purepgroup create --vollist vol-db-prod-01,vol-db-prod-02 protgrp-tier1-critical-weekly
purepgroup schedule --snap-frequency 7d --snap-at 12am protgrp-tier1-critical-weekly
purepgroup retain --all-for 28d protgrp-tier1-critical-weekly
purepgroup enable --snap protgrp-tier1-critical-weekly
REST API Automation
For Infrastructure-as-Code workflows, use the Pure Storage REST API. This Python example creates a protection group with scheduled snapshots:
import requests

ARRAY = "https://flasharray.example.com"
API_TOKEN = "your-api-token"

# FlashArray REST 2.x: exchange the API token for a session token, which the
# array returns in the x-auth-token response header
login = requests.post(f"{ARRAY}/api/2.0/login", headers={"api-token": API_TOKEN})
login.raise_for_status()
headers = {"x-auth-token": login.headers["x-auth-token"]}

# Create the protection group, then add its member volumes
requests.post(f"{ARRAY}/api/2.0/protection-groups", headers=headers,
              params={"names": "protgrp-tier2-standard"}).raise_for_status()
requests.post(f"{ARRAY}/api/2.0/protection-groups/volumes", headers=headers,
              params={"group_names": "protgrp-tier2-standard",
                      "member_names": "vol-app-prod-01,vol-app-prod-02"}).raise_for_status()

# Hourly snapshots (Tier 2), retained for 48 hours. Frequency is expressed in
# milliseconds and retention in seconds in the REST 2.x schema; verify the
# field names against your Purity version.
schedule_data = {
    "snapshot_schedule": {"enabled": True, "frequency": 3600 * 1000},
    "source_retention": {"all_for_sec": 48 * 3600},
}
requests.patch(f"{ARRAY}/api/2.0/protection-groups", headers=headers,
               params={"names": "protgrp-tier2-standard"},
               json=schedule_data).raise_for_status()
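A read-back confirms the schedule and retention landed as intended. Continuing the session from the example above (field names again per the REST 2.x schema):

check = requests.get(f"{ARRAY}/api/2.0/protection-groups", headers=headers,
                     params={"names": "protgrp-tier2-standard"})
pg = check.json()["items"][0]
print(pg["snapshot_schedule"], pg["source_retention"])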
Operational Best Practices
1. Naming Conventions
Use descriptive, sortable names that include the tier level: protgrp-tier1-db-prod, protgrp-tier2-filesvr. This makes auditing easier when you have 50+ protection groups.
2. Test Restore Procedures
Snapshots are not backups until you've proven you can restore from them. Schedule quarterly DR drills where you do the following (a code sketch of the first steps appears after the list):
- Instantiate a volume from a random snapshot
- Mount it to a test host
- Validate application-level consistency (can the app start? Is data intact?)
- Measure time-to-restore and document findings
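As a hedged sketch of the first two steps: REST 2.x can instantiate a new volume directly from a protection-group snapshot by naming the snapshot as the volume's source. The snapshot suffix and volume names below are hypothetical, and the sketch reuses the ARRAY/headers session from the REST example earlier:

import time
import requests  # assumes the ARRAY and headers session set up above

start = time.monotonic()
requests.post(f"{ARRAY}/api/2.0/volumes", headers=headers,
              params={"names": "vol-db-restore-test"},
              json={"source": {"name": "protgrp-tier1-critical.42.vol-db-prod-01"}}
              ).raise_for_status()
print(f"Volume instantiated in {time.monotonic() - start:.2f}s")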
3. Monitor Snapshot Growth Trends
Set alerts when snapshot-consumed space exceeds thresholds. In the Pure GUI, navigate to Storage → Volumes and sort by the Snapshots column. Investigate volumes where snapshots consume more than 30% of total volume capacity; this may indicate an excessive change rate or a retention period misaligned with the workload.
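The same check can be scripted instead of eyeballed in the GUI. A sketch using the REST session from earlier (the space field names follow the 2.x schema; verify them on your Purity version):

# Flag volumes whose snapshots exceed 30% of provisioned capacity
vols = requests.get(f"{ARRAY}/api/2.0/volumes", headers=headers).json()["items"]
for v in vols:
    snap_bytes, provisioned = v["space"]["snapshots"], v["provisioned"]
    if provisioned and snap_bytes / provisioned > 0.30:
        print(f"{v['name']}: snapshots at {snap_bytes / provisioned:.0%} of capacity")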
4. Application-Consistent Snapshots
For databases (Oracle, SQL Server, PostgreSQL), coordinate snapshots with application quiesce commands (an Oracle-flavored sketch follows this list):
- SQL Server: Use VSS integration with Pure Storage VSS Provider
- Oracle: Put tablespaces in hot backup mode before snapshot, then end backup mode
- VMware: Leverage vSphere snapshot integration with Pure Storage plugin
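For Oracle, that coordination is small enough to script. A hedged sketch that brackets an on-demand protection-group snapshot with hot backup mode (cursor is any DB-API cursor with SYSDBA privileges; the group name is illustrative, and the REST session is the one from earlier):

def consistent_oracle_snapshot(cursor, pgroup="protgrp-tier1-critical"):
    cursor.execute("ALTER DATABASE BEGIN BACKUP")  # quiesce datafile writes
    try:
        # Trigger an on-demand snapshot of the whole protection group
        requests.post(f"{ARRAY}/api/2.0/protection-group-snapshots",
                      headers=headers,
                      params={"source_names": pgroup}).raise_for_status()
    finally:
        cursor.execute("ALTER DATABASE END BACKUP")  # always release backup mode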
Common Mistakes to Avoid
Over-Snapshotting Low-Change Workloads
Taking 15-minute snapshots of a file share that changes once per day wastes retention slots and adds administrative overhead. Match snapshot frequency to workload change rate and business RPO, not arbitrary "more is better" thinking.
Treating Snapshots as Backups
Snapshots protect against logical corruption (accidental deletes, ransomware, bad updates) but not against array-level failures or site disasters. Always pair snapshot policies with array replication (ActiveCluster) or external backups to tape/cloud for comprehensive data protection.
Ignoring Compliance Retention
Regulatory requirements (HIPAA, SOX, GDPR) may mandate specific retention periods. Document these requirements in your policy matrix and implement automated retention enforcement. Never rely on manual snapshot management for compliance-critical data—automation prevents gaps.
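Automated enforcement can start as a scheduled audit comparing configured retention against the policy matrix. A minimal sketch (REST session as before; the group name and 30-day floor are illustrative):

REQUIRED_ALL_FOR_SEC = 30 * 86400  # policy floor for the archive tier

pg = requests.get(f"{ARRAY}/api/2.0/protection-groups", headers=headers,
                  params={"names": "protgrp-tier4-archive"}).json()["items"][0]
actual = pg["source_retention"]["all_for_sec"]
if actual < REQUIRED_ALL_FOR_SEC:
    print(f"COMPLIANCE GAP: retention {actual}s < required {REQUIRED_ALL_FOR_SEC}s")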
No Snapshot Naming Standards
Manual snapshots without standardized naming become unmanageable quickly. Enforce a format like {volumename}.{timestamp}.{purpose} (e.g., vol-db-prod.20260214-1500.pre-patch).
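A pre-flight check in your snapshot tooling keeps the convention honest. An illustrative validator for that format:

import re

# Matches {volumename}.{YYYYMMDD-HHMM}.{purpose}
SNAP_NAME = re.compile(r"^[\w-]+\.\d{8}-\d{4}\.[\w-]+$")

assert SNAP_NAME.match("vol-db-prod.20260214-1500.pre-patch")
assert not SNAP_NAME.match("quick-test-snap")  # rejected: no timestamp/purpose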
Monitoring and Alerting
Integrate snapshot health monitoring into your observability stack:
- Alert on failed snapshot schedules: Use Pure1 or Prometheus exporters to detect protection group failures
- Track snapshot age: Trigger warnings if most recent snapshot is older than expected RPO
- Capacity trending: Dashboard snapshot-consumed space over time to forecast capacity needs
# Prometheus query example: snapshot age threshold
time() - pure_volume_snapshot_created_timestamp > 3600 # Alert if no snapshot in past hour
Next Steps: Automation and Integration
Once base policies are established, enhance with these advanced capabilities:
- Policy-as-Code: Store protection group definitions in Git, deploy via CI/CD pipelines
- Self-Service Portals: Let developers create volumes with auto-assigned snapshot policies based on tags
- Automated Testing: Script restore validation tests that run monthly and report success/fail metrics
- Replication Orchestration: Pair snapshot policies with async replication schedules for DR automation
A mature snapshot strategy isn't just about taking snapshots—it's about integrating data protection into your operational workflows, monitoring effectiveness, and continuously optimizing based on real-world restore needs.