Free Resources · No Sign-up Required
DevOps Checklists for Engineering Teams
Practical checklists covering CI/CD, Kubernetes, cloud security, and observability. Use them in your next sprint, share them with your team, or run them before your next audit.
CI/CD Pipeline Readiness Checklist
Before you deploy to production, verify your pipeline covers these fundamentals.
- All code changes go through a pull request - no direct pushes to main
- At least one required reviewer approves before merge
- Automated tests run on every pull request and block merge on failure
- Build produces a versioned, immutable artifact (Docker image, binary)
- Secrets are injected at runtime - not hardcoded or baked into images
- Staging environment mirrors production (same infra, similar data volumes)
- Deployment to production requires manual approval or a gate
- Rollback procedure is documented and tested at least quarterly
- Each deploy is logged with who triggered it, what version, and when
- Failed deploys alert the team within 2 minutes
10 checkpoints
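The deploy-logging item above (who triggered it, what version, and when) can be sketched as a small structured record. This is a minimal illustration only: `DeployRecord` and `record_deploy` are hypothetical names, not part of any CI system, and a real pipeline would write the line to its own audit store.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DeployRecord:
    """Hypothetical audit record: who deployed what, where, and when."""
    triggered_by: str
    version: str
    environment: str
    timestamp: str  # ISO 8601 string in UTC


def record_deploy(triggered_by: str, version: str, environment: str) -> str:
    """Return one JSON line suitable for appending to a deploy log."""
    record = DeployRecord(
        triggered_by=triggered_by,
        version=version,
        environment=environment,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    # sort_keys keeps the field order stable, which makes log lines easy to diff
    return json.dumps(asdict(record), sort_keys=True)
```

Calling `record_deploy("alice", "1.4.2", "production")` returns a single JSON line with the four fields, which can be appended to a log file or shipped to a central log store.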
Get a free audit of your setup
Kubernetes Production Readiness Checklist
Running Kubernetes in production without these in place is how incidents happen at 2am.
- Resource requests and limits set on every container
- Liveness and readiness probes configured on every container and verified against real startup and failure behaviour
- Horizontal Pod Autoscaler (HPA) configured for stateless services
- Pod Disruption Budgets (PDB) set for critical services
- No containers running as root
- Network policies restrict pod-to-pod traffic by default
- Secrets stored in a secrets manager - not plain Kubernetes Secrets
- RBAC is configured - no wildcard permissions for service accounts
- Image vulnerability scanning in the CI pipeline (Trivy, Snyk)
- Node auto-scaling configured and tested
- Cluster version is no more than one minor version behind the latest stable release (N-1)
- etcd is backed up daily and restore has been tested
12 checkpoints
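Several of the items above (resource requests and limits, probes, non-root containers) can be checked mechanically before a manifest reaches the cluster. The sketch below assumes the pod spec has already been parsed into a plain Python dict, for example from YAML; `lint_pod_spec` is a hypothetical helper, not a Kubernetes API, and it covers only those three checks.

```python
from typing import Any


def lint_pod_spec(pod_spec: dict[str, Any]) -> list[str]:
    """Return findings for a parsed pod spec (an empty list means clean)."""
    findings: list[str] = []
    # A container-level securityContext overrides the pod-level one
    pod_non_root = pod_spec.get("securityContext", {}).get("runAsNonRoot")

    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")

        resources = container.get("resources", {})
        for kind in ("requests", "limits"):
            if not resources.get(kind):
                findings.append(f"{name}: missing resources.{kind}")

        for probe in ("livenessProbe", "readinessProbe"):
            if probe not in container:
                findings.append(f"{name}: missing {probe}")

        non_root = container.get("securityContext", {}).get(
            "runAsNonRoot", pod_non_root
        )
        if non_root is not True:
            findings.append(f"{name}: runAsNonRoot is not set to true")

    return findings
```

A check like this could run in CI against rendered manifests and fail the build when the returned list is non-empty. It does not validate that probe endpoints or resource values are sensible, only that they are present.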
Cloud Security Baseline Checklist
These are the controls that come up in every SOC 2, HIPAA, and ISO 27001 audit. Fix them before the auditor asks.
- Root / admin accounts have MFA enabled
- IAM users/roles follow least-privilege - no wildcard policies in production
- No long-lived API keys or access keys in code or CI environment variables
- All secrets rotated through a secrets manager (AWS Secrets Manager, Vault)
- CloudTrail / audit logging enabled across all accounts and regions
- S3 buckets are not public unless explicitly intended to be
- Encryption at rest enabled for all databases and storage volumes
- TLS enforced on all external endpoints - no HTTP in production
- Security group rules restrict inbound access - no 0.0.0.0/0 on SSH/RDP
- Dependency vulnerability scanning runs in CI (npm audit, pip-audit, etc.)
- Incident response plan exists and has been rehearsed
- Access is reviewed and revoked for offboarded employees within 24 hours
12 checkpoints
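The security group item above (no 0.0.0.0/0 on SSH/RDP) is one of the easier controls to check automatically. The sketch below assumes each security group is a dict shaped like the entries returned by the AWS EC2 DescribeSecurityGroups API (`GroupId`, `IpPermissions`, `IpRanges`, `Ipv6Ranges`); it makes no API calls itself, and `open_admin_ports` is a hypothetical helper name.

```python
ADMIN_PORTS = {22: "SSH", 3389: "RDP"}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_admin_ports(security_group: dict) -> list[str]:
    """Return findings for inbound rules exposing SSH or RDP to any address."""
    findings: list[str] = []
    group_id = security_group.get("GroupId", "<unknown>")

    for rule in security_group.get("IpPermissions", []):
        protocol = rule.get("IpProtocol")
        # "-1" means all protocols and ports; otherwise only TCP is relevant here
        if protocol not in ("tcp", "-1"):
            continue

        cidrs = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
        if not OPEN_CIDRS.intersection(cidrs):
            continue

        low, high = rule.get("FromPort"), rule.get("ToPort")
        for port, label in ADMIN_PORTS.items():
            in_range = low is not None and high is not None and low <= port <= high
            if protocol == "-1" or in_range:
                findings.append(
                    f"{group_id}: {label} (port {port}) open to the internet"
                )

    return findings
```

The input could come from any tool that exports security groups in that shape. Rules that reference other security groups or prefix lists are outside the scope of this sketch.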
Observability Readiness Checklist
If you cannot answer 'what broke and why' within 5 minutes of an incident, your observability is not production-ready.
- All services emit structured logs (JSON, not plaintext)
- Logs are centralised - not trapped on individual instances
- Application performance metrics exported (latency, error rate, throughput)
- Infrastructure metrics collected (CPU, memory, disk, network per service)
- Alerts fire on symptoms (high error rate, high latency), not just causes (high CPU)
- Every alert has a documented runbook
- On-call rotation is documented with a clear escalation path
- Dashboards show a 30-day baseline for comparison - not just live data
- Distributed tracing enabled for requests that span multiple services
- Synthetic uptime monitoring checks all external-facing endpoints
10 checkpoints
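The structured-logging item above (JSON, not plaintext) can be met with the Python standard library alone. The following is a minimal sketch, assuming a service that logs through the `logging` module; `JsonFormatter` and `build_logger` are illustrative names, and the field set is an assumption rather than a standard.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.propagate = False
    return logger
```

With this in place, `build_logger("checkout").info("order placed: %s", "A-1001")` emits one JSON line that a log shipper can parse without custom regexes, which also makes centralising the logs simpler.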
Found gaps in your checklist?
Book a free 30-minute audit and we will walk through exactly what needs fixing and in what order.