Operations Runbook

| Item | Value |
| --- | --- |
| Management Group | smb-rf (SMB Ready Foundation) |
| Primary Region | swedencentral |
| Resource Groups | rg-hub, rg-spoke, rg-monitor, rg-backup, rg-migrate, rg-security |
| MG Policy Count | See Policy Catalog |
| Sub Policy Count | Subscription-scoped policies (backup, defender, budget) |

| Resource | Name | Resource Group | Criticality |
| --- | --- | --- | --- |
| Azure Firewall | fw-hub-smb-swc | rg-hub-smb-swc | High |
| VPN Gateway | vpng-hub-smb-swc | rg-hub-smb-swc | High |
| Key Vault | kv-smbrf-swc-* | rg-security-smb-swc | High |
| Log Analytics | log-smbrf-smb-swc | rg-monitor-smb-swc | Medium |
| Recovery Vault | rsv-smbrf-smb-swc | rg-backup-smb-swc | Medium |

Morning check (~5 minutes):

  1. Verify Azure Service Health — no active incidents in swedencentral
  2. Check Firewall status — running and healthy (if deployed)
  3. Check VPN Gateway — connected (if deployed)
  4. Review Log Analytics ingestion — data flowing
  5. Check budget alerts — no overspend notifications
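
Steps 2 and 3 of the morning check can be scripted. The following is a minimal sketch, not a definitive implementation: the function names (report_state, firewall_state, gateway_state, morning_check) are hypothetical, the resource names are the ones listed in this runbook, and the az network firewall commands assume the azure-firewall CLI extension is installed. It reports provisioning state only.

```shell
#!/usr/bin/env bash
# Sketch of morning-check steps 2 and 3 (hypothetical helper names).
RG_HUB="rg-hub-smb-swc"

# Map a provisioning state to a one-line verdict.
report_state() {
  local label="$1" state="$2"
  case "$state" in
    Succeeded) echo "OK: $label ($state)" ;;
    "")        echo "SKIP: $label (not deployed or not readable)" ;;
    *)         echo "ATTENTION: $label ($state)" ;;
  esac
}

# Query provisioning state via the Azure CLI; empty output if the resource is absent.
firewall_state() {
  az network firewall show -g "$RG_HUB" -n fw-hub-smb-swc \
    --query provisioningState -o tsv 2>/dev/null || true
}

gateway_state() {
  az network vnet-gateway show -g "$RG_HUB" -n vpng-hub-smb-swc \
    --query provisioningState -o tsv 2>/dev/null || true
}

# Print one verdict line per resource.
morning_check() {
  report_state "Azure Firewall" "$(firewall_state)"
  report_state "VPN Gateway" "$(gateway_state)"
}
```

Running morning_check after az login prints one line per resource. A Succeeded provisioning state does not prove that a VPN tunnel is connected, so the connection check in step 3 is still needed.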

KQL — System Health Overview:

AzureDiagnostics
| where TimeGenerated > ago(24h)
| where Level == "Error"
| summarize ErrorCount = count() by ResourceType, Resource
| order by ErrorCount desc
| take 10

| Severity | Definition | Response Time | Escalation |
| --- | --- | --- | --- |
| P1 | Complete service outage, no workaround | 15 minutes | Immediate |
| P2 | Major feature unavailable, workaround exists | 1 hour | Within 2 hours |
| P3 | Minor issue, service functional | 4 hours | Next day |
| P4 | Cosmetic/documentation issue | Best effort | None |

| Error Code | Meaning | Resolution |
| --- | --- | --- |
| AnotherOperationInProgress | Resource locked by concurrent operation | Wait 5–10 min, retry |
| InternalServerError | Azure platform issue | Check Service Health, retry |
| QuotaExceeded | Subscription limit reached | Request quota increase |
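
The wait-and-retry resolution for AnotherOperationInProgress can be wrapped in a small helper. This is a sketch under stated assumptions: retry_on_lock, MAX_ATTEMPTS, and RETRY_WAIT_SECONDS are hypothetical names, the default wait of 300 seconds matches the low end of the 5–10 minute guidance, and the helper detects the lock by looking for the error code in the command's output. Any other failure is returned immediately without retrying.

```shell
#!/usr/bin/env bash
# Retry a command only while it fails with AnotherOperationInProgress.
# MAX_ATTEMPTS and RETRY_WAIT_SECONDS are hypothetical knobs for this sketch.
retry_on_lock() {
  local max="${MAX_ATTEMPTS:-3}" wait="${RETRY_WAIT_SECONDS:-300}"
  local attempt=1 output status
  while true; do
    # Capture stdout and stderr together so the error code can be inspected.
    output="$("$@" 2>&1)" && status=0 || status=$?
    if [ "$status" -eq 0 ]; then
      printf '%s\n' "$output"
      return 0
    fi
    case "$output" in
      *AnotherOperationInProgress*)
        if [ "$attempt" -ge "$max" ]; then
          printf '%s\n' "$output" >&2
          return "$status"
        fi
        echo "attempt $attempt: resource locked, waiting ${wait}s" >&2
        sleep "$wait"
        attempt=$((attempt + 1))
        ;;
      *)
        # Not a lock conflict: surface the error and stop.
        printf '%s\n' "$output" >&2
        return "$status"
        ;;
    esac
  done
}
```

For example, retry_on_lock az network vnet-gateway show -g rg-hub-smb-swc -n vpng-hub-smb-swc would rerun the command only if Azure reports the lock. InternalServerError and QuotaExceeded are deliberately not retried here, since the table calls for a Service Health check or a quota request first.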

Azure Firewall cannot be restarted directly. To recover:

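# Option 1: Force re-provisioning (5–10 min downtime)
$fw = Get-AzFirewall -Name "fw-hub-smb-swc" -ResourceGroupName "rg-hub-smb-swc"
Set-AzFirewall -AzureFirewall $fw

# Option 2: Stop and start (10–15 min downtime)
$fw = Get-AzFirewall -Name "fw-hub-smb-swc" -ResourceGroupName "rg-hub-smb-swc"
$fw.Deallocate()
Set-AzFirewall -AzureFirewall $fw

# Wait for deallocation, then re-allocate.
# The VNet and public IP names are placeholders (assumed to live in rg-hub-smb-swc);
# substitute the names used in your deployment.
$vnet    = Get-AzVirtualNetwork -Name "<hub-vnet-name>" -ResourceGroupName "rg-hub-smb-swc"
$pip     = Get-AzPublicIpAddress -Name "<firewall-pip-name>" -ResourceGroupName "rg-hub-smb-swc"
$mgmtPip = Get-AzPublicIpAddress -Name "<firewall-mgmt-pip-name>" -ResourceGroupName "rg-hub-smb-swc"
$fw.Allocate($vnet, $pip, $mgmtPip)
Set-AzFirewall -AzureFirewall $fw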

VPN Gateway status and reset:

# Check status
$gw = Get-AzVirtualNetworkGateway -Name "vpng-hub-smb-swc" -ResourceGroupName "rg-hub-smb-swc"
$gw | Select-Object Name, ProvisioningState
# Check connections
Get-AzVirtualNetworkGatewayConnection -ResourceGroupName "rg-hub-smb-swc"
# Reset gateway (15–30 min recovery); $gw is the gateway object fetched above
Reset-AzVirtualNetworkGateway -VirtualNetworkGateway $gw

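If Log Analytics ingestion keeps hitting the 500 MB/day cap, raise the cap. The --quota value is in GB per day, so this sets it to 1 GB/day: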

az monitor log-analytics workspace update \
--resource-group rg-monitor-smb-swc \
--workspace-name log-smbrf-smb-swc \
--quota 1

When a budget alert fires:

  1. Check current spend: Cost Management → Cost analysis in the Azure Portal
  2. Identify the top cost contributor (usually Firewall or VPN Gateway)
  3. If approaching the $500 cap, consider downgrading the scenario (e.g., full → firewall)
  4. For VM workload costs, review VM SKUs against the allowed list (B, D/E v5/v6)
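
Whether spend is "approaching" the cap can be made explicit with a quick calculation. This is a sketch only: budget_used_pct and budget_verdict are hypothetical helpers, the 80% warning threshold is an assumption rather than a value from this runbook, and the spend figure is expected to be read manually from Cost analysis.

```shell
#!/usr/bin/env bash
# Whole-number percentage of the cap already spent (cap defaults to the $500 figure).
budget_used_pct() {
  local spend="$1" cap="${2:-500}"
  awk -v s="$spend" -v c="$cap" 'BEGIN { printf "%d\n", int(s * 100 / c) }'
}

# Hypothetical thresholds: warn at 80% of the cap, act at 100%.
budget_verdict() {
  local pct
  pct="$(budget_used_pct "$1" "${2:-500}")"
  if [ "$pct" -ge 100 ]; then
    echo "over cap: consider downgrading the scenario"
  elif [ "$pct" -ge 80 ]; then
    echo "approaching cap: review top cost contributors"
  else
    echo "within budget"
  fi
}
```

For example, budget_verdict 425 evaluates a month-to-date spend of $425 against the default $500 cap; pass a second argument to use a different cap.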

| Task | Frequency | Impact | Duration |
| --- | --- | --- | --- |
| Policy compliance review | Monthly | None | 30 min |
| Backup verification | Weekly | None | 15 min |
| AVM module updates | Quarterly | Redeploy | 1–2 hours |
| Firewall rule review | Monthly | None | 30 min |