flowchart LR
subgraph HA["High Availability"]
direction TB
AZ["Availability Zones\n(zone failure)"]
AS["Availability Sets\n(rack failure)"]
LB["Load Balancer\n(instance failure)"]
FD["Front Door / TM\n(region latency/failure)"]
end
subgraph DR["Disaster Recovery"]
direction TB
ASR["Azure Site Recovery\n(regional outage — running workloads)"]
BAK["Azure Backup\n(data loss / corruption / deletion)"]
GEO["Geo-Replication\n(DB data in secondary region)"]
end
FAIL["Failure Event"] --> SEV{"Severity"}
SEV -->|Single instance / zone fails| HA
SEV -->|Full region / data corruption| DR

| Dimension | HA | DR |
|---|---|---|
| Trigger | Hardware fault, zone outage | Full regional outage, ransomware, deletion |
| RTO | Seconds (auto-failover) | Minutes to hours |
| RPO | Near-zero (sync replication) | Seconds to hours |
| Cost | Higher (always-on replicas) | Lower (standby / storage) |
| Manual intervention | ❌ None | Usually ✅ (trigger failover) |
| Services | AZs, Load Balancer, VMSS, SQL AG | ASR, Backup, Geo-replication, Traffic Manager |

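
The HA-versus-DR split can be sketched as a small lookup. This is only an illustration of the routing in the flowchart: the `Scope` enum, the two dictionaries, and the `classify` function are hypothetical names, and the mapping covers only the failure scopes listed in the diagram.

```python
from enum import Enum


class Scope(Enum):
    """Failure scopes named in the flowchart (illustrative labels)."""
    INSTANCE = "instance"
    RACK = "rack"
    ZONE = "zone"
    REGION = "region"
    DATA_LOSS = "data_loss"


# Failures absorbed automatically inside a running deployment (HA).
HA_MECHANISMS = {
    Scope.INSTANCE: "Load Balancer",
    Scope.RACK: "Availability Sets",
    Scope.ZONE: "Availability Zones",
}

# Failures that need recovery somewhere else or from a stored copy (DR).
DR_MECHANISMS = {
    Scope.REGION: "Azure Site Recovery",
    Scope.DATA_LOSS: "Azure Backup",
}


def classify(scope: Scope) -> tuple[str, str]:
    """Return (category, mechanism) for a failure scope."""
    if scope in HA_MECHANISMS:
        return ("HA", HA_MECHANISMS[scope])
    return ("DR", DR_MECHANISMS[scope])


print(classify(Scope.ZONE))
print(classify(Scope.DATA_LOSS))
```

A real design would also weigh RTO, RPO, and cost from the comparison above; the lookup only captures the first cut by failure scope.
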
Backup vs ASR

| Decision Factor | Azure Backup | Azure Site Recovery |
|---|---|---|
| Primary purpose | Point-in-time data recovery | Continuous replication for DR failover |
| Protects against | Accidental deletion, corruption, ransomware | Regional outage, site failure |
| RTO | Hours (restore from vault) | Minutes (replicated VMs pre-staged) |
| RPO | Hours (last backup) | Seconds (continuous replication) |
| Running during DR? | ❌ (restore to new resource) | ✅ (failed-over VMs are live) |
| Replication frequency | Scheduled (daily/hourly/weekly) | Continuous |
| Data stored | Recovery Services or Backup Vault | Recovery Services Vault (metadata only) |
| Soft delete / immutability | ✅ | ❌ |
| Cross-region | ✅ (GRS vault + manual CRR) | ✅ (built-in to ASR) |
| On-premises support | ✅ (MARS, MABS) | ✅ (VMware, Hyper-V, Physical) |
| Cost | Lower (backup storage) | Higher (compute pre-staged + storage) |

⚠️ Key Rule: If the scenario says “recover deleted files” or “restore to a point in time” → Backup. If it says “keep running in a secondary region during an outage” → ASR.
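
As a study aid, the key rule can be approximated with a keyword check. This is a rough sketch, not an official decision procedure: the cue lists and the `pick_service` function are made up for illustration, drawing on the phrases in the rule and the "Protects against" row.

```python
# Phrases that point at point-in-time data recovery (assumed cue list).
BACKUP_CUES = (
    "recover deleted",
    "point in time",
    "point-in-time",
    "corruption",
    "ransomware",
    "accidental deletion",
)

# Phrases that point at keeping workloads running elsewhere (assumed cue list).
ASR_CUES = (
    "keep running",
    "secondary region",
    "regional outage",
    "site failure",
)


def pick_service(scenario: str) -> str:
    """Guess Backup vs ASR by counting which cue list matches more often."""
    text = scenario.lower()
    backup_hits = sum(cue in text for cue in BACKUP_CUES)
    asr_hits = sum(cue in text for cue in ASR_CUES)
    if backup_hits > asr_hits:
        return "Azure Backup"
    if asr_hits > backup_hits:
        return "Azure Site Recovery"
    return "unclear"


print(pick_service("Users need to recover deleted files from last week."))
print(pick_service("The app must keep running in a secondary region during an outage."))
```

Scenarios that mix both kinds of cue, or use neither, come back as "unclear"; those usually call for both services or for a closer read of the requirements.
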
Recovery Objectives Spectrum

| Service / Config | Typical RPO | Typical RTO | Notes |
|---|---|---|---|
| AZs + Zone-redundant SQL | ~0 seconds | ~0 seconds | Synchronous HA; no failover to a DR site needed for a zone failure |
| SQL Active Geo-Replication | < 5 seconds | < 30 seconds | Async replication, manual failover |
| SQL Auto-Failover Group | < 5 seconds | < 30 seconds | Listener survives failover |
| Azure Site Recovery (Azure-to-Azure VM) | Seconds | Minutes | Replicated disks pre-staged |
| ASR (VMware to Azure) | ~15 seconds | < 2 hours | Process server + mobility service |
| Azure Backup (VM, daily) | Up to 24 hours | Hours | Restore from vault |
| Azure Backup (SQL, log backup 15 min) | 15 minutes | Hours | Transaction log backup |
| Azure Backup (Blob, operational) | Seconds (point-in-time) | Minutes | Continuous protection |
| Cross-Region Restore (manual) | 48 hours data lag | Hours | Manual trigger required |

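
One way to use the spectrum is to filter options by the RPO and RTO a scenario demands. The sketch below is illustrative only: `Option`, `OPTIONS`, and `candidates` are hypothetical names, only rows with explicit numbers are included, and the bare "Hours" RTO for backups is assumed to mean 4 hours purely so there is a number to compare against.

```python
from dataclasses import dataclass

MIN, HOUR = 60, 3600  # seconds


@dataclass(frozen=True)
class Option:
    name: str
    rpo_s: float  # typical RPO upper bound, in seconds
    rto_s: float  # typical RTO upper bound, in seconds


# Values copied from the table; "Hours" is ASSUMED to be 4 hours here.
OPTIONS = [
    Option("SQL Active Geo-Replication", 5, 30),
    Option("SQL Auto-Failover Group", 5, 30),
    Option("ASR (VMware to Azure)", 15, 2 * HOUR),
    Option("Azure Backup (VM, daily)", 24 * HOUR, 4 * HOUR),
    Option("Azure Backup (SQL, log backup 15 min)", 15 * MIN, 4 * HOUR),
]


def candidates(max_rpo_s: float, max_rto_s: float) -> list[str]:
    """Names of options whose typical RPO and RTO both fit the targets."""
    return [
        o.name
        for o in OPTIONS
        if o.rpo_s <= max_rpo_s and o.rto_s <= max_rto_s
    ]


# Example: at most 1 minute of data loss and 5 minutes of downtime.
print(candidates(1 * MIN, 5 * MIN))
```

Meeting the numbers is necessary but not sufficient; the threat being protected against (deletion versus regional outage) still decides between Backup and replication-based options.
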
Load Balancing Selection
Scenario
Answer
Distribute HTTP traffic across VMs in one region
Application Gateway (L7)
Distribute TCP/UDP traffic across VMs in one region