Patch management has a gap problem. The average enterprise takes 60+ days to patch a critical vulnerability after disclosure. Attackers are exploiting many of those same vulnerabilities within days. The WannaCry ransomware attack in 2017 exploited EternalBlue — a vulnerability patched by Microsoft two months earlier. The target organisations hadn’t patched.
The solution isn’t a manual patching schedule — it’s an automated, risk-tiered patching pipeline that applies patches to production faster than attackers can reliably exploit disclosure timelines.
This guide covers the architecture, tools, and operations of a modern patch management programme for organisations running a mix of cloud, on-premises, and containerised infrastructure. Our managed vulnerability management service handles this process end-to-end.
Why Patch Management Fails
Before building a programme, understand the failure modes:
No asset inventory: You can’t patch systems you don’t know about. Shadow IT, cloud sprawl, and acquired companies all contribute to unmanaged assets that never get patched.
Manual processes don’t scale: A team manually patching 500 servers on a monthly schedule can’t keep up with the velocity of CVE disclosures. Monthly patch cycles leave weeks of exposure.
No risk prioritisation: Treating all patches equally means teams are overwhelmed by volume and patch the wrong things first. A non-critical cosmetic update gets the same treatment as a CISA KEV critical.
Fear of breaking production: Patching breaks things sometimes. Without a testing pipeline (dev → staging → production), patching becomes a manual, risky event that people avoid.
No measurement: Without metrics on patch compliance rates, mean time to patch, and outstanding vulnerabilities, there’s no accountability and no way to demonstrate improvement.
The Patching Hierarchy: What Gets Patched First
Not all patches are equal. Tier your response:
Tier 0: Emergency (24-hour response)
- CISA KEV (Known Exploited Vulnerabilities): Being actively exploited in the wild. Patch immediately — the disruption of an out-of-cycle change is small compared to the risk of active exploitation.
- Zero-day in widely deployed, internet-facing technology: Log4Shell-class vulnerabilities.
- Vendor emergency patches for critical, internet-facing services (RCE in your web server, authentication bypass in your VPN)
Process: Out-of-cycle emergency patching, war-room coordination, compensating controls (WAF rules, network blocks) until patches are applied.
Tier 1: Critical (7 days)
- CVSS 9.0+ vulnerabilities in internet-facing or production systems
- Vulnerabilities with public PoC exploit in internet-facing systems
Process: Expedited patching through dev → staging → production on a compressed timeline.
Tier 2: High (30 days)
- CVSS 7.0–8.9, internet-facing or production
- CVSS 9.0+ on internal systems with no public exploit
Process: Normal patch cycle with higher priority queue.
Tier 3: Medium (90 days)
- CVSS 4.0–6.9, production systems
Process: Monthly patch cycles.
Tier 4: Low (next cycle, or accept)
- CVSS < 4.0
- Vulnerabilities with no practical exploitation path
Process: Patch when convenient, or formally accept risk.
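The tiering rules above can be encoded so scanner output is triaged automatically rather than debated case by case. A minimal sketch in Python — the field names and `Finding` type are illustrative, not tied to any particular scanner's output format:

```python
from dataclasses import dataclass

# SLA in days per tier (Tier 0's 24-hour response expressed as 1 day;
# None means next cycle or formal risk acceptance)
TIER_SLA_DAYS = {0: 1, 1: 7, 2: 30, 3: 90, 4: None}

@dataclass
class Finding:
    cvss: float
    on_kev: bool          # listed in the CISA KEV catalogue
    internet_facing: bool
    production: bool
    public_poc: bool      # public proof-of-concept exploit exists

def tier(f: Finding) -> int:
    """Map a vulnerability finding to a response tier per the hierarchy above."""
    if f.on_kev:
        return 0  # actively exploited: emergency
    if (f.cvss >= 9.0 and (f.internet_facing or f.production)) or \
       (f.public_poc and f.internet_facing):
        return 1
    if (7.0 <= f.cvss < 9.0 and (f.internet_facing or f.production)) or \
       (f.cvss >= 9.0 and not f.public_poc):
        return 2  # includes CVSS 9.0+ on internal systems with no public exploit
    if 4.0 <= f.cvss < 7.0 and f.production:
        return 3
    return 4

print(tier(Finding(cvss=9.8, on_kev=False, internet_facing=True,
                   production=True, public_poc=False)))  # 1
```

Feeding every new finding through a function like this turns the hierarchy into a queue with deadlines instead of a judgment call per CVE.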
Asset Inventory: The Foundation
Discovery tools:
| Tool | Best For |
|---|---|
| Nmap | Network host and service discovery |
| Shodan + Censys | Internet-exposed assets |
| AWS Config, Azure Resource Graph | Cloud asset inventory |
| Microsoft Intune / Jamf | Managed endpoints |
| Kubernetes API | Container workloads |
| Qualys / Tenable | Combined discovery + scanning |
Cloud asset inventory (AWS):
# List all EC2 instances
aws ec2 describe-instances \
    --query 'Reservations[].Instances[].[InstanceId,State.Name,Tags[?Key==`Name`].Value|[0],Platform]' \
    --output table
# All RDS instances
aws rds describe-db-instances \
    --query 'DBInstances[].[DBInstanceIdentifier,EngineVersion,Engine]' \
    --output table
# All Lambda functions and their runtimes (often neglected for patching)
aws lambda list-functions \
    --query 'Functions[].[FunctionName,Runtime]' \
    --output table
Maintain a CMDB (Configuration Management Database) — ServiceNow, Jira Assets, or even a well-maintained spreadsheet for smaller organisations. Every asset in your environment should be recorded in it with:
- Owner (team responsible for patching)
- Asset tier (production, staging, dev)
- OS and version
- Last patched date
- Outstanding vulnerabilities
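Those fields are enough to drive a basic hygiene report. A sketch that flags assets whose last patch date has drifted past a per-tier threshold — the thresholds, field names, and sample rows are all illustrative:

```python
from datetime import date

# Staleness thresholds (days since last patch) per asset tier.
# Illustrative values, not a standard — tune to your patch cycles.
MAX_AGE_DAYS = {"production": 30, "staging": 60, "dev": 90}

assets = [  # rows as they might come out of a CMDB export
    {"name": "web-01", "owner": "platform", "tier": "production",
     "last_patched": date(2026, 1, 2)},
    {"name": "build-07", "owner": "ci", "tier": "dev",
     "last_patched": date(2025, 9, 1)},
]

def stale_assets(assets, today):
    """Return assets whose last patch date exceeds their tier's threshold."""
    return [a for a in assets
            if (today - a["last_patched"]).days > MAX_AGE_DAYS[a["tier"]]]

for a in stale_assets(assets, date(2026, 2, 20)):
    print(f"{a['name']} ({a['tier']}, owner {a['owner']}) is overdue for patching")
```

Because every row carries an owner, the report routes itself: overdue assets become tickets for the responsible team rather than an anonymous backlog.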
Windows Patching
WSUS / SCCM (On-Premises)
For Windows endpoints and servers, WSUS (Windows Server Update Services) or SCCM/MECM (Microsoft Configuration Manager) remain the standard for on-premises environments:
- WSUS downloads patches from Microsoft and distributes to managed systems
- SCCM adds deployment targeting, compliance reporting, and software distribution
- Deploy patches first to a test ring (non-production), validate for 3–5 days, then deploy to production
WSUS deployment rings:
Ring 1 (Pilot) → 5% of systems (volunteer/IT systems)
Ring 2 (Early) → 15% of systems
Ring 3 (Standard) → 80% of systems
Emergency → All systems (CISA KEV)
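Ring membership works best when it is deterministic, so the same machines absorb first-ring risk every cycle. One way to get that without central state is to hash the hostname into a percentage bucket — a hypothetical sketch, with the 5/15/80 split taken from the rings above:

```python
import hashlib

# (ring name, percent of fleet) — matches the deployment rings above
RINGS = [("Pilot", 5), ("Early", 15), ("Standard", 80)]

def ring_for(hostname: str) -> str:
    """Deterministically assign a host to a deployment ring.

    Hashing the hostname means the same host lands in the same ring on
    every patch cycle, with no ring-membership database to maintain.
    """
    bucket = int(hashlib.sha256(hostname.encode()).hexdigest(), 16) % 100
    cumulative = 0
    for name, percent in RINGS:
        cumulative += percent
        if bucket < cumulative:
            return name
    return RINGS[-1][0]
```

In practice you would override the hash for known-sensitive systems (domain controllers, payment services) so they never land in the pilot ring by accident.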
Automate approval using WSUS scripting:
# Auto-approve Critical and Security updates for the Production group (run on the WSUS server)
$wsus = Get-WsusServer
$rule = $wsus.CreateInstallApprovalRule("AutoApprove-Critical")
$classifications = New-Object Microsoft.UpdateServices.Administration.UpdateClassificationCollection
$classifications.AddRange(($wsus.GetUpdateClassifications() | Where-Object {$_.Title -in "Critical Updates","Security Updates"}))
$rule.SetUpdateClassifications($classifications)
$deadline = New-Object Microsoft.UpdateServices.Administration.AutomaticUpdateApprovalDeadline
$deadline.DayOffset = 7; $deadline.MinutesAfterMidnight = 180  # install deadline: 7 days out, 03:00
$rule.Deadline = $deadline
$groups = New-Object Microsoft.UpdateServices.Administration.ComputerTargetGroupCollection
$groups.AddRange(($wsus.GetComputerTargetGroups() | Where-Object {$_.Name -eq "Production"}))
$rule.SetComputerTargetGroups($groups)
$rule.Enabled = $true
$rule.Save()
Microsoft Intune (Cloud/Hybrid)
For cloud-managed Windows endpoints (Azure AD joined), use Intune Update Rings:
- Configure update rings with deferral periods per group
- Automatic approval of quality updates (security) with a shorter deferral than feature updates
- Compliance policies report which devices are patched
Linux Patching
On-Premises Linux
Ansible is the most widely adopted tool for Linux patch automation:
# Ansible playbook — patch all Linux systems
- name: Patch all Linux servers
  hosts: all_linux
  become: yes
  tasks:
    - name: Update package cache (Debian/Ubuntu)
      apt:
        update_cache: yes
        cache_valid_time: 3600
      when: ansible_os_family == "Debian"

    - name: Upgrade all packages (Debian/Ubuntu)
      apt:
        upgrade: dist
        autoremove: yes
      when: ansible_os_family == "Debian"
      register: apt_upgrade_result

    - name: Update all packages (RHEL/CentOS/Amazon Linux)
      yum:
        name: "*"
        state: latest
        security: yes  # security patches only; drop this line to apply all updates
      when: ansible_os_family == "RedHat"
      register: yum_upgrade_result

    - name: Check if reboot is required (Debian; on RHEL use `needs-restarting -r` instead)
      stat:
        path: /var/run/reboot-required
      register: reboot_required

    - name: Reboot if required
      reboot:
        msg: "Rebooting for patch application"
        connect_timeout: 5
        reboot_timeout: 300
      when: reboot_required.stat.exists
Schedule with AWX/Ansible Tower for centralised management, role-based targeting, and audit trails.
AWS EC2 Linux Patching
AWS Systems Manager Patch Manager is the native tool for EC2 instances:
# Create a patch baseline (AWS console or CLI)
aws ssm create-patch-baseline \
    --name "AmazonLinux2-Security-Baseline" \
    --operating-system "AMAZON_LINUX_2" \
    --approval-rules '{"PatchRules":[{"PatchFilterGroup":{"PatchFilters":[{"Key":"SEVERITY","Values":["Critical","Important"]}]},"ApproveAfterDays":3}]}' \
    --description "Auto-approve Critical and Important patches after 3 days"
# Associate baseline to instances via maintenance window
aws ssm create-maintenance-window \
    --name "Weekly-Patching-Sunday-2AM" \
    --schedule "cron(0 2 ? * SUN *)" \
    --duration 4 \
    --cutoff 1 \
    --no-allow-unassociated-targets
# Run patch immediately (emergency)
aws ssm send-command \
    --document-name "AWS-RunPatchBaseline" \
    --targets "Key=tag:Environment,Values=production" \
    --parameters "Operation=Install" \
    --timeout-seconds 600
SSM Patch Manager integrates with AWS Security Hub to surface compliance status and with CloudWatch for patching operation logging.
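Compliance data from SSM can also be post-processed for your own reporting. The sketch below filters for instances with missing or failed patches; the payload mirrors the shape of `aws ssm describe-instance-patch-states` output, but the instance IDs and counts are invented:

```python
import json

# Sample payload shaped like `aws ssm describe-instance-patch-states`
# output (instance IDs and counts are made up for illustration)
payload = json.loads("""
{"InstancePatchStates": [
  {"InstanceId": "i-0aaa111", "PatchGroup": "production",
   "MissingCount": 0, "FailedCount": 0},
  {"InstanceId": "i-0bbb222", "PatchGroup": "production",
   "MissingCount": 4, "FailedCount": 1}
]}
""")

def non_compliant(states):
    """Instances with missing or failed patches — the ones to chase first."""
    return [s["InstanceId"] for s in states
            if s["MissingCount"] > 0 or s["FailedCount"] > 0]

print(non_compliant(payload["InstancePatchStates"]))  # ['i-0bbb222']
```

A nightly job that feeds this list into your ticketing system closes the loop between "patch job ran" and "fleet is actually patched".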
Container and Kubernetes Patching
Containers require a different patching model — you patch the image, not the running container:
Image Scanning and Rebuild Pipeline
# GitHub Actions — automated image rebuild on CVE detection
name: Weekly Image Rebuild
on:
  schedule:
    - cron: '0 2 * * 1'  # Every Monday at 2 AM
jobs:
  scan-and-rebuild:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Scan with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          severity: CRITICAL,HIGH
          format: json   # the jq step below expects JSON output
          exit-code: 0   # Don't fail — report findings
          output: trivy-results.json
      - name: Update base image if CVEs found
        run: |
          # Count critical CVEs in the Trivy results
          CVE_COUNT=$(jq '[.Results[].Vulnerabilities // [] | .[] | select(.Severity == "CRITICAL")] | length' trivy-results.json)
          if [ "$CVE_COUNT" -gt "0" ]; then
            echo "Found $CVE_COUNT critical CVEs — triggering rebuild with latest base image"
            # Normalise the FROM line (drop any digest pin), then --pull
            # forces a fresh node:20-alpine base rather than the cached layer
            sed -i 's|FROM node:20-alpine.*|FROM node:20-alpine|' Dockerfile
            docker build --pull -t myapp:patched .
            docker push myapp:patched
            # Trigger deployment update
          fi
Kubernetes Node Patching
For managed Kubernetes (EKS, AKS, GKE):
EKS:
# Update node group AMI (rolling update — cordons old nodes, drains, replaces)
aws eks update-nodegroup-version \
    --cluster-name production \
    --nodegroup-name standard-workers \
    --release-version "1.31.x-20260115"
# Check update status (--name is the cluster; the ID comes from the previous command)
aws eks describe-update \
    --name production \
    --update-id <update-id>
Enable EKS Auto Mode or managed node group auto-updates to automate node AMI updates on a defined schedule.
For self-managed Kubernetes, use kured (KUbernetes REboot Daemon) to automatically drain and reboot nodes requiring kernel updates:
# kured DaemonSet — auto-reboots nodes when /var/run/reboot-required exists
helm repo add kubereboot https://kubereboot.github.io/charts
helm upgrade --install kured kubereboot/kured \
    --namespace kube-system \
    --set configuration.rebootSentinel=/var/run/reboot-required \
    --set configuration.startTime=02:00 \
    --set configuration.endTime=05:00  # only reboot inside the 2-5 AM window
Third-Party Application Patching
OS patches are straightforward. Third-party applications are harder:
Browsers (critical attack surface):
- Deploy Chrome/Edge/Firefox updates via group policy or MDM
- Target: within 24 hours of browser security patch release
Java, Python, Node.js runtimes:
- Container images: update base image, rebuild
- On-prem: Ansible playbook targeting runtime versions
- Lambda: update runtime version in function configuration
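The Lambda inventory from the asset section can be cross-checked against runtimes you consider end-of-life. A sketch — the deprecated set and function rows below are illustrative; AWS's runtime deprecation schedule is the authoritative source:

```python
# Flag functions on runtimes past (or approaching) end of support.
# EOL_RUNTIMES and the sample data are illustrative — check AWS's
# Lambda runtime deprecation schedule for the current list.
EOL_RUNTIMES = {"python3.7", "nodejs14.x", "dotnetcore2.1", "ruby2.5"}

functions = [  # shaped like filtered `aws lambda list-functions` output
    {"FunctionName": "report-gen", "Runtime": "python3.7"},
    {"FunctionName": "webhook", "Runtime": "nodejs20.x"},
]

needs_upgrade = [f["FunctionName"] for f in functions
                 if f["Runtime"] in EOL_RUNTIMES]
print(needs_upgrade)  # ['report-gen']
```

Running this as a scheduled check catches the serverless estate that OS-level patching tools never see.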
Database engines (PostgreSQL, MySQL, SQL Server):
- Cloud managed (RDS): apply minor version updates automatically, major version upgrades as project
- On-prem: database change management process with tested rollback
Network appliances (firewalls, routers, VPN concentrators):
- Often neglected, yet edge appliances are among attackers' favourite entry points, and audits routinely find firewall OS versions years out of date
- Monthly review of vendor security bulletins
- Dedicated patching schedule with maintenance windows
Metrics and Reporting
Track patching effectiveness with these metrics:
| Metric | Target | Measure |
|---|---|---|
| Critical patch SLA compliance | 95% within 24 hours | % CISA KEV patched within SLA |
| High patch SLA compliance | 90% within 30 days | % High CVEs patched within SLA |
| Mean time to patch (MTTP) | < 7 days for Critical | Average days from disclosure to patch applied |
| Patch coverage | 95% of known assets | % of inventory with active patching |
| Vulnerability backlog | Trending down QoQ | Open vulnerabilities by severity |
| Unpatched CISA KEV | Zero | Any CISA KEV older than 7 days = incident |
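Two of these metrics, MTTP and SLA compliance, fall straight out of disclosure and patch dates. A sketch of the arithmetic — the record format and sample dates are invented, and note that open findings count as SLA misses rather than being excluded:

```python
from datetime import date

# Illustrative vulnerability records: disclosure date, patch date (None = still open)
records = [
    {"cve": "CVE-2026-0001", "severity": "critical",
     "disclosed": date(2026, 1, 5), "patched": date(2026, 1, 9)},
    {"cve": "CVE-2026-0002", "severity": "critical",
     "disclosed": date(2026, 1, 10), "patched": date(2026, 1, 12)},
    {"cve": "CVE-2026-0003", "severity": "high",
     "disclosed": date(2026, 1, 1), "patched": None},
]

def mttp_days(records, severity):
    """Mean time to patch, in days, over closed findings of a severity."""
    closed = [(r["patched"] - r["disclosed"]).days
              for r in records if r["severity"] == severity and r["patched"]]
    return sum(closed) / len(closed) if closed else None

def sla_compliance(records, severity, sla_days):
    """Fraction patched within the SLA; open findings count as misses."""
    relevant = [r for r in records if r["severity"] == severity]
    within = [r for r in relevant
              if r["patched"] and (r["patched"] - r["disclosed"]).days <= sla_days]
    return len(within) / len(relevant) if relevant else None

print(mttp_days(records, "critical"))          # (4 + 2) / 2 = 3.0
print(sla_compliance(records, "critical", 7))  # 1.0
```

Counting open findings as misses is deliberate: excluding them lets a backlog of never-patched vulnerabilities inflate the compliance number.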
Report monthly to the security team, quarterly to leadership with trend data.
Patch Testing and Rollback
Pre-production testing:
- Every patch should be tested in dev/staging before production (Tier 1+ patches at minimum)
- Automated regression tests should run after patch application in staging
- Canary deployment for high-risk patches: deploy to 5% of production, monitor for 24 hours, then full rollout
Rollback plan:
- Snapshot/backup before patching production systems
- Documented rollback procedure for each patch type
- Rollback decision criteria defined in advance (what error rate or incident triggers rollback?)
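Rollback criteria work best as numbers agreed before the change, not a debate during the incident. A hypothetical sketch comparing canary error rates against pre-agreed thresholds (the thresholds themselves are illustrative):

```python
# Pre-agreed rollback criteria for a canary patch rollout.
# Both thresholds are illustrative — set them per service.
MAX_ERROR_RATE = 0.02        # roll back if canary error rate exceeds 2%
MAX_RELATIVE_INCREASE = 2.0  # ...or exceeds 2x the unpatched baseline

def should_roll_back(canary_error_rate, baseline_error_rate):
    """Decide rollback from metrics agreed in advance of the change."""
    if canary_error_rate > MAX_ERROR_RATE:
        return True
    if (baseline_error_rate > 0
            and canary_error_rate / baseline_error_rate > MAX_RELATIVE_INCREASE):
        return True
    return False

print(should_roll_back(0.031, 0.004))  # True: both thresholds breached
print(should_roll_back(0.005, 0.004))  # False: within tolerance
```

Wiring a check like this into the canary's monitoring turns the 24-hour observation window into an automated gate rather than someone watching dashboards.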
Compensating Controls When Patching Isn’t Immediate
When you can’t patch immediately (legacy systems, change management requirements, business-critical windows), apply compensating controls:
- WAF rules targeting the vulnerability’s attack vector
- Network segmentation — isolate the vulnerable system from the internet or from internal systems
- Disable the vulnerable feature or service if not essential
- Enhanced monitoring — SIEM rules targeting exploitation attempts of the specific CVE
- Access restriction — require VPN + MFA to reach the vulnerable system
- Document the accepted risk — formal risk acceptance with CISO sign-off, time-limited
Compensating controls are temporary. They reduce exposure but don’t eliminate the vulnerability. Patch as soon as the window allows.
CyberneticsPlus designs and implements patch management programmes for enterprises across cloud, on-premises, and hybrid environments. We also conduct vulnerability assessments to identify your highest-priority patching gaps. Contact us to build your patch management strategy.