Kubernetes Cluster Upgrades

Best Practices, Common Issues & Safe Upgrade Procedures

Lesson 1: Maintenance & Self-Healing

Kubernetes is designed as a self-healing system with intelligent resource management capabilities. Understanding maintenance procedures is critical for safe cluster operations.

Kubernetes Self-Healing

Self-Healing Capability: Kubernetes is designed to be a self-healing system, and kubelet is smart enough to manage resources and, if necessary, evict pods when resources are scarce.
How Self-Healing Works:
  • Kubelet monitors node resources (CPU, memory, disk)
  • When resources are scarce, it evicts low-priority pods
  • Failed pods are automatically restarted
  • ReplicaSets ensure desired pod count is maintained
  • Health checks (liveness/readiness) detect unhealthy pods
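The probe-driven part of this loop can be sketched in a Deployment manifest; the names, image, and paths below are illustrative, not from the original lesson:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # illustrative name
spec:
  replicas: 3                # the ReplicaSet keeps 3 pods running
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25    # illustrative image
        livenessProbe:       # container is restarted if this fails
          httpGet:
            path: /healthz
            port: 80
        readinessProbe:      # pod is removed from Service endpoints if this fails
          httpGet:
            path: /
            port: 80
```

If a container crashes or its liveness probe fails, kubelet restarts it; if the whole pod is lost, the ReplicaSet schedules a replacement.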

Node Maintenance Procedure

For planned maintenance on a worker node, you should use kubectl drain to gracefully remove all pods, perform the necessary work (e.g., reboot, kernel update), and then use kubectl uncordon to bring the node back into service.

# Step 1: Check current node status
kubectl get nodes

# Step 2: Cordon the node (prevent new pod scheduling)
kubectl cordon worker-node-1
# node/worker-node-1 cordoned

# Step 3: Drain the node (gracefully evict all pods)
kubectl drain worker-node-1 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --force

# Output shows pods being evicted:
# evicting pod default/web-app-abc
# evicting pod default/api-server-xyz
# pod/web-app-abc evicted
# pod/api-server-xyz evicted
# node/worker-node-1 drained

# Step 4: Perform maintenance
ssh worker-node-1
sudo apt update && sudo apt upgrade -y
sudo reboot

# Step 5: After reboot, uncordon the node
kubectl uncordon worker-node-1
# node/worker-node-1 uncordoned

# Step 6: Verify node is back in service
kubectl get nodes
# NAME            STATUS   ROLES    AGE   VERSION
# worker-node-1   Ready    worker   30d   v1.28.0
Important: Always ensure applications have multiple replicas across different nodes before draining. Otherwise, draining could cause application downtime.
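One way to enforce this during a drain is a PodDisruptionBudget: eviction is refused while it would drop availability below the budget. A minimal sketch (names are illustrative):

```
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2        # drain is blocked while eviction would leave fewer than 2 pods
  selector:
    matchLabels:
      app: web
```

With this in place, `kubectl drain` waits (and retries) instead of evicting the last healthy replicas.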

Managed vs. Self-Managed Updates

Managed Kubernetes (GKE, EKS, AKS)

Update Approach: Delete and replace nodes

  • Cloud provider handles upgrades
  • Old nodes are deleted
  • New nodes with updated version are provisioned
  • Rolling fashion (one at a time)
  • Minimal manual intervention
  • Usually zero-downtime

Self-Managed Kubernetes

Update Approach: In-place upgrades

  • Full control and responsibility
  • Update components in sequence
  • Requires careful planning
  • Tools like Kubespray help automate
  • More manual steps involved
  • Greater risk if not done correctly

Control Plane Maintenance Scenarios

High Availability (3 Masters)

HA Masters: In an HA setup (three masters), you can update masters one-by-one, as the remaining masters will maintain cluster functionality. This provides zero-downtime upgrades.
# With 3 masters, you can safely upgrade one at a time
# Master 1 goes down → Masters 2 & 3 handle API requests
# Master 2 goes down → Masters 1 & 3 handle API requests
# Master 3 goes down → Masters 1 & 2 handle API requests

# etcd maintains quorum: 2 out of 3 nodes = quorum maintained

Single Master (Not Recommended for Production)

Single Master Risk: With only one master, running workloads continue, but the cluster cannot react to failures. If a pod dies, components such as the Ingress Controller and kube-proxy cannot reach the API server for updated endpoint information, so traffic may still be routed to the failed pod until the master is restored.
What Happens During Single Master Downtime:
  • Existing pods continue running normally
  • No new pods can be scheduled
  • Failed pods cannot be restarted
  • Ingress/Services may route to failed pods
  • kubectl commands fail (no API server)
  • Cluster state changes are not processed
Best Practice: Always run at least 3 master nodes in production to ensure high availability during maintenance and unexpected failures.

Lesson 2: Common Problems During Upgrade

Understanding common upgrade issues is critical for planning and troubleshooting. Most problems stem from non-backward-compatible changes.

The Importance of Reading the Changelog

Critical Rule: Always thoroughly study the changelog for breaking changes before upgrading. Many upgrade failures can be prevented by reading release notes.

Problem 1: Outdated Manifest API Versions

Deprecated API Versions

Kubernetes 1.16 removed support for several older API versions (e.g., apps/v1beta1 for Deployments). Existing applications continue to run, but applying new or updated manifests that use the removed formats will fail.

# Old manifest (stopped working in Kubernetes 1.16)
apiVersion: apps/v1beta1   # DEPRECATED!
kind: Deployment
metadata:
  name: old-app
spec:
  replicas: 3
  template:
    # ...

# Error when applying after upgrade:
# error: unable to recognize "deployment.yaml":
# no matches for kind "Deployment" in version "apps/v1beta1"

# Fixed manifest (correct version)
apiVersion: apps/v1        # CORRECT
kind: Deployment
metadata:
  name: old-app
spec:
  replicas: 3
  selector:                # Now required in apps/v1
    matchLabels:
      app: old-app
  template:
    metadata:
      labels:
        app: old-app
    # ...
Impact: Existing deployments continue running, but you cannot update or create new resources using old API versions. This breaks CI/CD pipelines that rely on old manifests.
Solution: Before upgrading, update all manifests to current API versions. Use tools like kubectl convert or pluto to detect deprecated APIs in your cluster.
# Install pluto to detect deprecated APIs
brew install FairwindsOps/tap/pluto

# Scan your manifests for deprecated APIs
pluto detect-files -d ./manifests/

# Output shows deprecated APIs:
# NAME      KIND         VERSION        DEPRECATED   DEPRECATED IN   REMOVED   REMOVED IN
# old-app   Deployment   apps/v1beta1   true         v1.9.0          true      v1.16.0

# Convert old API versions to current
kubectl convert -f old-deployment.yaml --output-version apps/v1

Problem 2: Deprecated/Changed Kubelet Flags

Command-Line Flag Changes

Command-line flags for kubelet and other components often change names over time (e.g., --experimental-bootstrap-kubeconfig changed to --bootstrap-kubeconfig), which can break automated deployment scripts.

# Old kubelet configuration (Kubernetes 1.8)
/usr/bin/kubelet \
  --experimental-bootstrap-kubeconfig=/etc/kubernetes/bootstrap.conf \
  --experimental-allowed-unsafe-sysctls=net.ipv4.tcp_syncookies \
  # ...

# After upgrade to 1.10+, these flags are removed:
# Flag --experimental-bootstrap-kubeconfig has been deprecated
# Use --bootstrap-kubeconfig instead

# Updated configuration
/usr/bin/kubelet \
  --bootstrap-kubeconfig=/etc/kubernetes/bootstrap.conf \
  --allowed-unsafe-sysctls=net.ipv4.tcp_syncookies \
  # ...
Impact: Kubelet fails to start with unknown flag errors, breaking the entire node. Automation scripts that provision new nodes also fail.

Problem 3: Stricter Validation

Bug Fixes That Break Existing Configs

Sometimes, a bug fix in a newer version makes manifest validation stricter (e.g., version 1.14.1), causing previously working Helm charts or configuration files to fail.

# Illustrative example: a manifest accepted in Kubernetes 1.13
# (due to lax validation) that fails after upgrading to 1.14.1
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
  - port: 80
    targetPort: "http"   # String value
    protocol: TCP
  # selector is MISSING (accepted before stricter validation)

# After upgrade to 1.14.1:
# Error: Service "my-service" is invalid:
# spec.selector: Required value

# Fixed manifest
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:              # Now properly enforced
    app: my-app
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
Why This Happens: Kubernetes sometimes accepted invalid configurations due to bugs. When these bugs are fixed, previously "working" manifests suddenly fail validation.

Problem 4: Docker/Container Runtime Issues

Forgotten Containers

Upgrading Docker can sometimes cause the daemon to "forget" about running containers (like kube-proxy). A new instance of the container will be launched, but the old process still occupies the ports, leading to conflicts and crash loops until the old process is manually killed or the node is rebooted.

# Scenario: Upgraded Docker from 19.03 to 20.10

# Old kube-proxy container still running (PID 1234)
ps aux | grep kube-proxy
# root  1234  kube-proxy --kubeconfig=/etc/kubernetes/kubeconfig

# Docker daemon doesn't see it
docker ps | grep kube-proxy
# (no output - Docker "forgot" about it)

# Kubelet tries to start new kube-proxy container
# Error: port 10256 already in use by PID 1234

# Check logs
kubectl logs -n kube-system kube-proxy-abc
# Error: listen tcp :10256: bind: address already in use

# Solution 1: Kill the old process
sudo kill 1234
# Kubelet will restart kube-proxy successfully

# Solution 2: Reboot the node (cleaner)
sudo reboot

API Version Incompatibility

A mismatch between the API version supported by the kubelet and the installed Docker client (docker cli) or daemon can cause connection failures.

# Kubelet expects Docker API 1.40
# Installed Docker supports only API 1.38

# Kubelet logs show:
# Error: failed to create containerd task:
# failed to create shim: API version mismatch

# Check Docker API version
docker version
# Client: API version: 1.38
# Server: API version: 1.38

# Solution: Upgrade Docker to a compatible version
# OR: Use containerd directly instead of Docker
Recommendation: Kubernetes 1.24 removed the dockershim, ending built-in Docker support. Migrate to containerd or CRI-O as your container runtime to avoid these issues.

Lesson 3: General Upgrade Procedure

Following a structured upgrade procedure is essential for safe, successful cluster upgrades. Never rush or skip steps.

The Complete Upgrade Workflow

Step 1: Read Documentation

Thoroughly study the changelog for breaking changes. Look for:

  • Deprecated API versions
  • Removed or renamed flags
  • Changes in default behavior
  • Known issues and workarounds
  • Feature gates that changed

Step 2: Practice on Test Cluster

Install and upgrade a test cluster to identify and resolve potential issues specific to your setup. The test cluster should mirror production as closely as possible.

Step 3: Plan and Backup

Schedule the upgrade during a maintenance window and ensure etcd backups are performed. Have a rollback plan ready.

Step 4: Sequential Execution

Update components one-by-one, always starting with the control plane, then worker nodes.

Component Update Order

1. etcd (First)

The cluster's datastore must be upgraded first. Back up etcd before starting.

# Backup etcd
ETCDCTL_API=3 etcdctl snapshot save backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify backup
ETCDCTL_API=3 etcdctl snapshot status backup.db --write-out=table

# Upgrade etcd (example with kubeadm)
# This is usually handled by kubeadm upgrade apply

2. Control Plane Components

Upgrade API Server, Scheduler, Controller Manager, and kubelet on master nodes. In HA setup, do one master at a time.

# On first master node
sudo kubeadm upgrade plan
# Review upgrade plan
sudo kubeadm upgrade apply v1.28.0

# Upgrade kubelet and kubectl on master
sudo apt-mark unhold kubelet kubectl
sudo apt-get update
sudo apt-get install -y kubelet=1.28.0-00 kubectl=1.28.0-00
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# Repeat for other master nodes
# On subsequent masters, use:
# sudo kubeadm upgrade node

3. Worker Nodes

Upgrade kubelet, CNI, kube-proxy, and CoreDNS on worker nodes. Do this in batches or one node at a time.

# For each worker node:

# Drain the node
kubectl drain worker-node-1 --ignore-daemonsets --delete-emptydir-data

# SSH to the worker node
ssh worker-node-1

# Upgrade kubeadm
sudo apt-mark unhold kubeadm
sudo apt-get update
sudo apt-get install -y kubeadm=1.28.0-00
sudo apt-mark hold kubeadm

# Upgrade node configuration
sudo kubeadm upgrade node

# Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.28.0-00 kubectl=1.28.0-00
sudo apt-mark hold kubelet kubectl

# Restart kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# Exit SSH and uncordon the node
kubectl uncordon worker-node-1

# Verify node is updated
kubectl get nodes

Version Skew Policy

Important Rules:
  • No Version Skipping: Don't skip minor versions. Upgrade sequentially (1.25 → 1.26 → 1.27)
  • kubelet Skew: kubelet may be up to 2 minor versions behind the API server (3 from v1.28 onward), but never newer
  • kubectl Skew: kubectl can be 1 minor version ahead or behind the API server
  • kube-proxy Skew: kube-proxy should match the kubelet version
# Valid version combinations:
# API Server: 1.28
# kubelet:    1.28, 1.27, or 1.26 (within 2 minor versions)
# kubectl:    1.29, 1.28, or 1.27 (±1 minor version)

# Invalid:
# API Server: 1.28
# kubelet:    1.25 (too old - 3 versions behind)
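The skew rule is easy to check mechanically before touching any node. A small bash sketch (the function names `minor_of` and `skew_ok` are my own, not from any Kubernetes tooling):

```shell
#!/usr/bin/env bash
# minor_of v1.28.3 -> 28 (strip leading "v", major, and patch)
minor_of() {
  local v="${1#v}"      # drop leading "v"
  v="${v#*.}"           # drop major component ("1.")
  echo "${v%%.*}"       # keep minor component only
}

# skew_ok <api_server_version> <kubelet_version> [max_skew]
# Succeeds when kubelet is not newer than the API server and at most
# max_skew (default 2) minor versions behind it.
skew_ok() {
  local max="${3:-2}" api_minor kubelet_minor
  api_minor=$(minor_of "$1")
  kubelet_minor=$(minor_of "$2")
  [ "$kubelet_minor" -le "$api_minor" ] &&
  [ $((api_minor - kubelet_minor)) -le "$max" ]
}

skew_ok v1.28.0 v1.26.5 && echo "OK: within skew"
skew_ok v1.28.0 v1.25.0 || echo "FAIL: kubelet too old"
```

Pass `3` as the third argument to model the relaxed n-3 policy introduced in v1.28.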

Pre-Upgrade Checklist

  • Read changelog and release notes: look for breaking changes
  • Test upgrade on staging cluster: mirror the production environment
  • Backup etcd: store the backup securely off-cluster
  • Update deprecated API versions in manifests: use pluto or kubectl-convert
  • Schedule maintenance window: communicate to stakeholders
  • Prepare rollback plan: document the rollback procedure
  • Verify sufficient node capacity: pods must be reschedulable during drain
  • Check PodDisruptionBudgets: ensure PDBs won't block drain
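The "update deprecated API versions" item can be pre-checked without cluster access. pluto is the thorough option; the sketch below illustrates the idea with plain grep (the sample pattern list and the `scan_manifests` helper name are mine, and the list is far from exhaustive):

```shell
#!/usr/bin/env bash
# API versions removed in Kubernetes 1.16 (sample, not exhaustive)
DEPRECATED='apps/v1beta1|apps/v1beta2|extensions/v1beta1'

# Print file:line for every manifest under the given directory that
# still uses one of the deprecated apiVersions.
scan_manifests() {
  grep -rEn "apiVersion:[[:space:]]*(${DEPRECATED})" "$1" || true
}

# Usage: scan_manifests ./manifests/
```

Anything this prints needs migrating before the upgrade; an empty result from a crude grep is of course not proof of cleanliness.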

Lesson 4: Kubespray Upgrade Automation

Kubespray is an Ansible-based tool that automates Kubernetes cluster deployment and upgrades, making the process significantly simpler and more reliable.

What is Kubespray?

Kubespray: An open-source project that uses Ansible to deploy and manage production-ready Kubernetes clusters. It supports various OS distributions, network plugins, and infrastructure providers.
Benefits of Kubespray:
  • Automated, repeatable deployments
  • Safe, sequential component upgrades
  • Configurable via inventory files
  • Supports on-premises and cloud environments
  • Active community and well-tested
  • Handles complex HA configurations
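The inventory file is the heart of a Kubespray setup. A minimal sketch of the hosts file (hostnames and IPs are illustrative; group names follow recent Kubespray releases, older ones used `kube-master`/`kube-node` with hyphens):

```
# inventory/mycluster/hosts.yml
all:
  hosts:
    master-1: {ansible_host: 10.0.0.11}
    master-2: {ansible_host: 10.0.0.12}
    master-3: {ansible_host: 10.0.0.13}
    worker-1: {ansible_host: 10.0.0.21}
  children:
    kube_control_plane:
      hosts:
        master-1:
        master-2:
        master-3:
    kube_node:
      hosts:
        worker-1:
    etcd:
      hosts:
        master-1:
        master-2:
        master-3:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
```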

Kubespray Upgrade Process

The demonstration showed how simple the upgrade can be when using an automation tool like Kubespray:

Step 1: Update Inventory Configuration

The user only needed to modify the kube_version variable in the inventory file from 1.17.5 to 1.18.4.

# File: inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml

# Before upgrade
kube_version: v1.17.5

# After change (ready to upgrade)
kube_version: v1.18.4

# Other important settings:
kubeadm_upgrade: true        # Enable upgrade mode
upgrade_cluster_setup: true
kube_proxy_mode: ipvs
dns_mode: coredns

Step 2: Run Upgrade Playbook

# Execute the Kubespray upgrade playbook
ansible-playbook -i inventory/mycluster/hosts.yml \
  upgrade-cluster.yml \
  --become \
  --become-user=root

# Optional: Use bastion host for private networks
ansible-playbook -i inventory/mycluster/hosts.yml \
  upgrade-cluster.yml \
  --become \
  --become-user=root \
  -e ansible_ssh_common_args='-o ProxyCommand="ssh -W %h:%p user@bastion-host"'

How Kubespray Manages the Upgrade

Sequential Upgrade Strategy:
  • Masters: Updated sequentially (serial: 1) to ensure HA is maintained
  • Workers: Updated in batches (default serial: 20%) to minimize disruption
  • Automatic Drain: Kubespray drains each node before upgrading
  • Verification: Checks node health before proceeding to next
# Kubespray upgrade playbook workflow:

# 1. Pre-flight checks
#    - Verify SSH connectivity to all nodes
#    - Check current Kubernetes version
#    - Validate inventory configuration

# 2. Upgrade etcd (if needed)
#    - Backup etcd on each master
#    - Upgrade etcd binaries
#    - Restart etcd service
#    - Verify cluster health

# 3. Upgrade first master (serial: 1)
#    - Drain node
#    - Upgrade control plane components
#    - Upgrade kubelet
#    - Uncordon node
#    - Verify API server is healthy

# 4. Upgrade second master (serial: 1)
#    - Same process as first master
#    - Cluster remains available via other masters

# 5. Upgrade third master (serial: 1)
#    - Same process

# 6. Upgrade worker nodes (serial: 20%)
#    - Batch 1 (20% of workers): drain, upgrade, uncordon
#    - Verify batch health
#    - Batch 2 (next 20%): drain, upgrade, uncordon
#    - Continue until all workers upgraded

# 7. Post-upgrade tasks
#    - Upgrade CoreDNS
#    - Update kube-proxy
#    - Verify all components
#    - Run cluster smoke tests

Upgrade Timing

Performance: In the demonstration, the entire cluster of 6 nodes was upgraded from v1.17.5 to v1.18.4 in approximately 13 minutes.
Phase                        Approximate Time   Details
Pre-flight checks            1-2 minutes        Validation and connectivity tests
etcd upgrade                 2-3 minutes        Backup and upgrade 3 etcd instances
Master upgrades (3 nodes)    5-6 minutes        Sequential, ~2 min per master
Worker upgrades (3 nodes)    4-5 minutes        Parallel batches, faster than masters
Post-upgrade verification    1 minute           Health checks and validation
Total                        ~13 minutes        For 6-node cluster

Bastion Host for Private Networks

Bastion Host (Jump Host): A bastion host is necessary when the cluster nodes are in a private network not directly accessible from the internet. It acts as an intermediary SSH server, allowing tools like Ansible (Kubespray) to securely access the internal cluster network.
# Network topology:
# Your laptop → Public Internet → Bastion Host → Private Network → Cluster Nodes

# Inventory configuration for bastion
# File: inventory/mycluster/hosts.yml
all:
  vars:
    ansible_user: ubuntu
    # Configure bastion/jump host
    bastion_host: bastion.example.com
    bastion_user: ubuntu
    bastion_ssh_key: ~/.ssh/bastion-key.pem

# SSH through bastion
ansible-playbook -i inventory/mycluster/hosts.yml \
  upgrade-cluster.yml \
  -e ansible_ssh_common_args='-o ProxyCommand="ssh -W %h:%p -i ~/.ssh/bastion-key.pem ubuntu@bastion.example.com"'
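Instead of passing ProxyCommand on every ansible-playbook run, the jump can also live in SSH client configuration, which Ansible then picks up automatically. A sketch (hostnames, subnet, and key path are illustrative):

```
# ~/.ssh/config
Host bastion
    HostName bastion.example.com
    User ubuntu
    IdentityFile ~/.ssh/bastion-key.pem

Host 10.0.0.*              # private cluster subnet
    User ubuntu
    ProxyJump bastion      # hop through the bastion transparently
```

With this in place, `ssh 10.0.0.11` (and therefore Ansible) reaches the private nodes directly.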

Docker live-restore Caveat

Docker live-restore: While live-restore allows containers to continue running if the Docker daemon restarts, it can sometimes cause issues during a major Docker upgrade where the new daemon version fails to properly pick up the old containers, leading to service disruption.
# Docker daemon.json with live-restore
{
  "live-restore": true
}

# Potential issue during upgrade:
# 1. Docker upgraded from 19.03 to 20.10
# 2. New daemon starts but doesn't recognize old containers
# 3. Containers still running but "orphaned"
# 4. Kubelet tries to start new containers → port conflicts
# 5. Manual intervention required

# Recommendation: Disable live-restore during cluster upgrades
# Or migrate to containerd/CRI-O (no Docker dependency)
Best Practice: For Kubernetes 1.24+, migrate away from Docker to containerd or CRI-O as your container runtime. This avoids Docker-specific issues and is officially recommended by Kubernetes.

Final Quiz

Test your knowledge of Kubernetes Cluster Upgrades!

Question 1: What is the correct procedure for worker node maintenance?

a) Directly reboot the node without any preparation
b) Use kubectl drain to evacuate pods, perform maintenance, then kubectl uncordon
c) Delete all pods manually before maintenance
d) Stop kubelet service and reboot immediately

Question 2: What happens during single master downtime?

a) All pods immediately stop running
b) Existing pods run normally but failed pods cannot restart and no new pods can be scheduled
c) The cluster automatically creates a new master
d) Worker nodes take over master responsibilities

Question 3: What is a common issue when upgrading to Kubernetes 1.16?

a) All pods are automatically deleted
b) Old API versions like apps/v1beta1 are removed and manifests using them fail
c) etcd becomes incompatible
d) CNI plugins stop working

Question 4: What is the correct component upgrade order?

a) Worker nodes first, then control plane
b) etcd first, then control plane components, then worker nodes
c) All components simultaneously
d) CNI first, then everything else

Question 5: What must you always do before upgrading a cluster?

a) Delete all workloads first
b) Read changelog, test on staging cluster, and backup etcd
c) Upgrade production first to find issues
d) Disable all monitoring systems

Question 6: How does Kubespray handle master node upgrades in HA setup?

a) Upgrades all masters simultaneously
b) Updates masters sequentially (serial: 1) to maintain availability
c) Requires manual intervention for each master
d) Only upgrades the primary master

Question 7: What is a bastion host used for?

a) Running the Kubernetes control plane
b) Acting as SSH jump host to access cluster nodes in private network
c) Storing etcd backups
d) Load balancing API server traffic

Question 8: Why can Docker upgrades cause "forgotten container" issues?

a) Docker always deletes all containers during upgrade
b) New Docker daemon may not recognize old containers, causing port conflicts when kubelet tries to start new instances
c) Containers are stored in wrong directory
d) Kubernetes is incompatible with Docker

Quiz Complete!

All correct answers are option 'b'. Review the lessons above to understand why these are the best answers.