Fintech Startup: 45-Minute Deploys to 3 Minutes
A Series B payments startup with 18 engineers was deploying twice a week via manual SSH scripts. Failures were common, deploys took most of an afternoon, and engineers were afraid to ship.
The Challenge
The team had a monolithic Node.js API behind a load balancer on AWS EC2. Deployments required one engineer to SSH into each server, pull the latest code, run migrations, restart the application, and verify health - manually, on three servers in sequence. The process took 45 minutes on a good day and 3 hours on a bad one. About 30% of deploys required some form of intervention. Engineers had started batching features to avoid deploys, which made each deployment larger and riskier. The cycle was self-reinforcing: more fear → bigger batches → more risk → more failures → more fear.
The Approach
The root cause was not the deployment scripts themselves; it was the absence of automation, testing gates, and a rollback path. The plan was to containerize the application, build a CI/CD pipeline, migrate to ECS for zero-downtime deploys, and make rollback fast enough that shipping felt safe. The entire engagement was scoped at 3 weeks.
The Implementation
Application containerization
We containerized the Node.js API using a multi-stage Dockerfile, replacing an unbounded server snapshot with a 180 MB production image. We moved environment-specific configuration into environment variables and updated application startup to fail fast when a required variable is missing.
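A minimal sketch of the fail-fast startup check, not the team's actual code. The variable names in `REQUIRED_VARS` are hypothetical, since the case study does not list the real ones. The function takes an env-like object so it can be exercised without a real process environment.

```typescript
// Hypothetical variable names; the real list is not given in the case study.
const REQUIRED_VARS = ["DATABASE_URL", "PORT", "PAYMENTS_API_KEY"] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type AppConfig = Record<RequiredVar, string>;

// Reads required settings from an env-like object and throws a single error
// naming every missing variable, so a misconfigured container exits at startup
// instead of failing later on its first request.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
  if (missing.length > 0) {
    throw new Error(`Missing required config: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}
```

In a Node.js entry point this would typically be called once as `loadConfig(process.env)` before the HTTP server starts listening.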
CI/CD pipeline with GitHub Actions
We built a GitHub Actions pipeline that runs on every push to main: install dependencies, run the full test suite (47 tests, 2.3 seconds), build the Docker image and push it to ECR tagged with the commit SHA, then trigger a staging deployment.
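To illustrate the tagging step, here is a small sketch that builds the full ECR image reference from a commit SHA. The helper name `ecrImageUri` and the account, region, and repository values in the usage note are made up for illustration. The registry hostname follows the standard ECR pattern `<account>.dkr.ecr.<region>.amazonaws.com`.

```typescript
interface ImageRef {
  accountId: string;  // 12-digit AWS account ID
  region: string;     // e.g. "us-east-1"
  repository: string; // ECR repository name
  commitSha: string;  // full 40-character Git commit SHA
}

// Builds an immutable image reference: one tag per commit, so any deploy
// (or rollback) can name the exact build it wants.
function ecrImageUri(ref: ImageRef): string {
  if (!/^\d{12}$/.test(ref.accountId)) {
    throw new Error(`Invalid AWS account ID: ${ref.accountId}`);
  }
  if (!/^[0-9a-f]{40}$/.test(ref.commitSha)) {
    throw new Error(`Expected a full 40-character commit SHA, got: ${ref.commitSha}`);
  }
  return `${ref.accountId}.dkr.ecr.${ref.region}.amazonaws.com/${ref.repository}:${ref.commitSha}`;
}
```

Tagging by commit SHA rather than `latest` means rolling back is a matter of redeploying a tag that already exists in the registry, for example `ecrImageUri({ accountId: "123456789012", region: "us-east-1", repository: "payments-api", commitSha })`.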
ECS Fargate migration
We migrated from manually deployed EC2 instances to ECS Fargate with rolling deployments. The ECS service was configured with a minimum healthy percent of 100% and a maximum percent of 200%, so ECS can start a full set of new tasks and wait for them to become healthy before stopping any old ones. That is what makes the deploys zero-downtime. We provisioned the ECS infrastructure via Terraform.
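The arithmetic behind those two percentages can be sketched as follows. The desired count of 3 in the usage note is an assumption for illustration (the case study mentions three servers but does not state the ECS task count), and `rollingBounds` is a hypothetical helper, not an AWS API. ECS rounds the minimum up and the maximum down.

```typescript
interface RollingBounds {
  minRunning: number; // tasks that must stay running and healthy during a deploy
  maxRunning: number; // upper limit on running tasks during a deploy
  surge: number;      // extra tasks ECS may start before stopping old ones
}

// Converts the ECS deployment percentages into task counts for a service.
function rollingBounds(
  desiredCount: number,
  minimumHealthyPercent: number,
  maximumPercent: number
): RollingBounds {
  const minRunning = Math.ceil((desiredCount * minimumHealthyPercent) / 100);
  const maxRunning = Math.floor((desiredCount * maximumPercent) / 100);
  return { minRunning, maxRunning, surge: maxRunning - desiredCount };
}
```

With an assumed desired count of 3, `rollingBounds(3, 100, 200)` gives a floor of 3 running tasks and a ceiling of 6, so capacity never drops below the normal level while new tasks come up alongside the old ones.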
Database migration safety
Database migrations were the highest-risk part of each deploy. We implemented a migration-first deployment strategy: migrations run as a separate ECS task before the new application containers launch, and the pipeline waits for migration success before proceeding.
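The ordering described above can be sketched as a small orchestration function. This is an illustration under assumptions rather than the pipeline's real code: `DeploySteps`, `runMigrationTask`, and `updateService` are hypothetical names standing in for whatever starts the one-off ECS migration task and updates the service.

```typescript
// Hypothetical hooks; in a real pipeline these would wrap the calls that
// run the migration task to completion and roll out the new task definition.
interface DeploySteps {
  runMigrationTask: () => Promise<number>;           // resolves to the task's exit code
  updateService: (imageTag: string) => Promise<void>; // starts the rolling deploy
}

// Runs migrations first and only updates the service if they succeeded,
// so a failed migration never leaves new code running against an old schema.
async function deployMigrationFirst(steps: DeploySteps, imageTag: string): Promise<void> {
  const exitCode = await steps.runMigrationTask();
  if (exitCode !== 0) {
    throw new Error(`Migration task exited with code ${exitCode}; service not updated`);
  }
  await steps.updateService(imageTag);
}
```

Because migrations run while the old containers are still serving traffic, this ordering generally assumes each migration is backward compatible with the previous application version.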
Key Takeaways
- Containerization is a prerequisite to modern CI/CD - without it, automation is limited
- The migration from EC2 manual deploys to ECS was the most impactful single change
- Deploy frequency naturally increased 6× in the month after automation - teams ship when shipping is safe
- The zero-downtime deploy configuration in ECS eliminated the biggest source of user-facing incidents
Facing Similar Challenges?
Book a free 30-minute audit and I will tell you what I see.