Deployment is the moment most engineering teams fear. Not because deployments are inherently dangerous - they are not - but because the team has not invested in making them safe. Three strategies exist to take that fear out of the equation. Here is how they work and when to use each.
## Why Your Current Approach Probably Has Downtime
If you are using a bare `kubectl apply` or a simple rolling update with no configuration, you likely have subtle downtime:
- Pods accepting traffic while containers are still initializing
- Old pods terminating while requests are in-flight
- Health checks not configured, so bad deployments go undetected for minutes
All three strategies below assume you fix these first:
```yaml
spec:
  containers:
    - name: api
      readinessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3
      livenessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10
  terminationGracePeriodSeconds: 30
```
And in your container spec, add a preStop hook so in-flight requests have time to drain before the pod receives SIGTERM:
```yaml
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 5"]
```
Now, the strategies.
## Strategy 1: Rolling Update (Default)
Rolling updates replace pods one by one. Kubernetes waits for each new pod to pass its readiness probe before terminating an old one. The result: no downtime, no extra infrastructure.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # max extra pods during rollout
      maxUnavailable: 0  # never drop below desired replica count
  template:
    # ...
```
With maxUnavailable: 0, Kubernetes will not remove an old pod until its replacement is healthy. With maxSurge: 1, only one extra pod runs at any time (controls cost).
When to use: Most teams, most of the time. Works for stateless APIs, background workers, anything that does not have complex state or backwards-incompatible API changes.
When it fails you: If you deploy a bad build, traffic hits the broken pods before you notice. With 4 replicas and maxSurge: 1, you will have 1 bad pod in the mix for several minutes while it rolls.
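One way to shrink that exposure window, sketched here rather than fully specified: `minReadySeconds` forces each new pod to stay ready for a quiet period before the rollout continues, and `progressDeadlineSeconds` marks a stalled rollout as failed so automation can react. The values below are illustrative, not recommendations:

```yaml
spec:
  minReadySeconds: 30           # new pod must stay ready this long before it counts
  progressDeadlineSeconds: 300  # mark the rollout failed if it stalls
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
```

This does not stop a bad build from receiving some traffic, but it slows the roll enough for failing probes or alerts to halt it before every replica is replaced.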
## Strategy 2: Blue-Green Deployment
Blue-green runs two complete environments - blue (current) and green (new) - and switches traffic between them atomically using a Service selector swap.
Architecture:
- `deployment-blue` running the current version
- `deployment-green` running the new version
- One `Service` pointing to either blue or green via label selector
```yaml
# Blue deployment - current production
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-blue
  labels:
    app: api
    slot: blue
spec:
  replicas: 4
  selector:
    matchLabels:
      app: api
      slot: blue
  template:
    metadata:
      labels:
        app: api
        slot: blue
    spec:
      containers:
        - name: api
          image: api:v1.5.2
---
# Green deployment - new version
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-green
  labels:
    app: api
    slot: green
spec:
  replicas: 4
  selector:
    matchLabels:
      app: api
      slot: green
  template:
    metadata:
      labels:
        app: api
        slot: green
    spec:
      containers:
        - name: api
          image: api:v1.6.0
---
# Service - currently pointing to blue
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
    slot: blue  # change to "green" to switch traffic
  ports:
    - port: 80
      targetPort: 8080
```
To deploy: apply the green deployment, wait for all green pods to pass readiness, then patch the service:
```bash
# Deploy new version to green
kubectl apply -f api-green.yaml
kubectl rollout status deployment/api-green

# Run smoke tests against green (using a test service pointing at green)
./smoke-tests.sh green

# Switch traffic - atomic, sub-second
kubectl patch service api -p '{"spec":{"selector":{"slot":"green"}}}'

# Rollback is instant - patch back to blue
# kubectl patch service api -p '{"spec":{"selector":{"slot":"blue"}}}'
```
When to use: When you need instant rollback, when you want to test the new version before sending real traffic, or when you have API changes that are risky.
The trade-off: Doubles your compute cost during the deployment window. For a large fleet, this can be significant. It also requires careful handling of database schema changes - both versions need to handle the same schema simultaneously during the switch.
## Strategy 3: Canary Deployment
Canary routes a small percentage of production traffic to the new version, lets you observe it for a defined period, then promotes it to 100%.
The cleanest way to do this in Kubernetes is with a service mesh (Istio, Linkerd) or an ingress controller that supports traffic splitting (NGINX Ingress, Traefik, AWS ALB Ingress).
With NGINX Ingress and two Deployments:
```yaml
# Stable deployment (95% of traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-stable
spec:
  replicas: 19
  selector:
    matchLabels:
      app: api
      track: stable
  template:
    metadata:
      labels:
        app: api
        track: stable
    spec:
      containers:
        - name: api
          image: api:v1.5.2
---
# Canary deployment (5% of traffic via replica ratio)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
      track: canary
  template:
    metadata:
      labels:
        app: api
        track: canary
    spec:
      containers:
        - name: api
          image: api:v1.6.0
---
# Single Service selects both (no track label)
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api  # no track label - selects both deployments
  ports:
    - port: 80
      targetPort: 8080
```
With 19 stable + 1 canary replicas, ~5% of traffic hits the canary. Watch your metrics. If error rate stays flat, promote by updating the stable image and removing the canary:
```bash
kubectl set image deployment/api-stable api=api:v1.6.0
kubectl rollout status deployment/api-stable
kubectl delete deployment/api-canary
```
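The ~5% figure follows directly from the replica ratio. A quick back-of-the-envelope check, assuming the Service balances requests roughly evenly across ready pods:

```bash
# Approximate canary traffic share implied by the replica counts.
# Assumes even load balancing across ready pods - real traffic will vary.
stable_replicas=19
canary_replicas=1

share=$(awk -v s="$stable_replicas" -v c="$canary_replicas" \
  'BEGIN { printf "%.1f", 100 * c / (s + c) }')
echo "canary receives ~${share}% of traffic"
```

The same arithmetic tells you the granularity you can achieve: with 20 total replicas, 5% is the smallest step, which is one reason weight-based routing (below) scales better for fine-grained splits.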
For finer control, NGINX Ingress can split by weight, or route specific users by header, via a second Ingress annotated as a canary:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"  # 10% of traffic
    # or route specific users:
    # nginx.ingress.kubernetes.io/canary-by-header: "X-Canary"
    # nginx.ingress.kubernetes.io/canary-by-header-value: "true"
spec:
  rules:
    - host: api.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-canary-service
                port:
                  number: 80
```
When to use: When you want real production traffic on the new version before full rollout. Ideal for high-traffic services where even a 1% error rate means thousands of users affected. Good for teams with solid observability - you need to be able to measure error rate and latency on the canary separately.
The trade-off: More complex to set up and automate. You need to define promotion criteria and either automate the promotion decision or have someone watching dashboards.
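At its core, a promotion gate reduces to comparing the canary's error rate against a threshold. A minimal sketch of that decision, with illustrative placeholder counts where your metrics system (Prometheus, Datadog, etc.) would supply real numbers:

```bash
# Promotion gate sketch: promote only if the canary's error rate
# stays below a threshold. Counts below are illustrative placeholders
# you would pull from your metrics system over the observation window.
canary_errors=12
canary_requests=2400
threshold_pct=1   # promote only below a 1% error rate

rate=$(awk -v e="$canary_errors" -v r="$canary_requests" \
  'BEGIN { printf "%.2f", 100 * e / r }')

if awk -v rate="$rate" -v t="$threshold_pct" 'BEGIN { exit !(rate < t) }'; then
  decision="promote"
else
  decision="hold"
fi
echo "$decision (canary error rate ${rate}%)"
```

A production gate would also compare the canary against the stable baseline (not just an absolute threshold) and check latency percentiles, but the shape of the automation is the same.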
## Which One Should You Use?
| | Rolling | Blue-Green | Canary |
|---|---|---|---|
| Infrastructure overhead | None | 2× compute during deploy | Minimal |
| Rollback speed | Minutes | Seconds | Minutes |
| Detects bad deploys | After traffic hits all pods | Before traffic switch | On a small % of traffic |
| Complexity | Low | Medium | High |
| Good for | Most workloads | Risk-averse, critical APIs | High-traffic, well-monitored |
Default recommendation: Start with Rolling. Configure maxUnavailable: 0, add proper readiness probes, and add preStop sleep. This covers 80% of use cases.
Upgrade to Blue-Green when you have had a bad deploy affect users and need instant rollback, or when your API is under high-stakes load (payments, auth).
Add Canary when your traffic volume is large enough that 1% of bad traffic is still meaningful, and when you have enough observability to make automated promotion decisions.
Not sure which strategy fits your system? Book a free audit - we will review your deployment setup and tell you what risk profile you are currently accepting.