Healthtech: Monolith to Microservices Without Stopping Delivery
A 250K-patient digital health platform was operating a 6-year-old Rails monolith. Feature velocity had dropped 70% in 18 months as the codebase grew too large for the team to modify safely. We extracted four critical services over 12 weeks while the product team continued shipping.
The Challenge
The monolith had 380K lines of Rails code, a 600-table PostgreSQL schema, and no service boundaries. Three 8-engineer teams constantly stepped on each other's migrations and blocked each other's deploys. A previous two-year full rewrite, backed by the CEO, had already failed. The constraint was non-negotiable: product development could not stop during the migration.
The Approach
We used the Strangler Fig pattern: extract services at the edges of the monolith, not the middle. We identified the four services with the most coupling pain and the clearest data ownership boundaries: notifications, appointment scheduling, billing, and document storage. Each extraction was a separate 3-week engagement.
The Implementation
Event-driven extraction with Kafka
We introduced Kafka as the communication layer between the monolith and extracted services. The monolith publishes events (AppointmentCreated, BillingInvoiceGenerated) and the new services consume them. This allowed the monolith and services to evolve independently without synchronous coupling.
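A minimal sketch of the producer side. The event envelope fields and the in-memory producer are illustrative, not the platform's real schema; in production the producer would be an actual Kafka client such as rdkafka or ruby-kafka.

```ruby
require "json"
require "time"
require "securerandom"

# Hypothetical envelope the monolith publishes when an appointment is created.
def appointment_created_event(appointment)
  {
    event_type:  "AppointmentCreated",
    event_id:    SecureRandom.uuid,
    occurred_at: Time.now.utc.iso8601,
    payload: {
      appointment_id: appointment[:id],
      patient_id:     appointment[:patient_id],
      starts_at:      appointment[:starts_at]
    }
  }
end

# Stand-in for a real Kafka producer: records deliveries so the sketch runs anywhere.
class InMemoryProducer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(topic:, key:, value:)
    @deliveries << { topic: topic, key: key, value: value }
  end
end

def publish_appointment_created(producer, appointment)
  event = appointment_created_event(appointment)
  producer.deliver(
    topic: "appointments",
    key:   event[:payload][:appointment_id].to_s, # keyed for per-appointment ordering
    value: JSON.generate(event)
  )
  event
end
```

Keying messages by appointment ID keeps all events for one appointment on the same partition, so consumers see them in order.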
Database per service with sync period
Each new service got its own PostgreSQL database. During a 2-week sync period, writes went to both the monolith and the new service database. After validation, the monolith stopped writing to those tables and the new service became the system of record.
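The dual-write phase can be sketched like this, with plain hashes standing in for the monolith's tables and the new service's database; the class and key names are illustrative.

```ruby
# During the sync period every write lands on both sides; before cutover a
# validation pass confirms the new store agrees with the monolith.
class DualWriter
  attr_reader :mismatches

  def initialize(legacy_store, service_store)
    @legacy     = legacy_store
    @service    = service_store
    @mismatches = []
  end

  # Dual write: monolith tables and the new service database get the same record.
  def write(key, record)
    @legacy[key]  = record
    @service[key] = record
  end

  # Returns true only when every legacy record matches the service copy.
  def validate
    @mismatches = @legacy.keys.reject { |k| @legacy[k] == @service[k] }
    @mismatches.empty?
  end
end
```

Only after `validate` passes does the monolith stop writing and the new service become the system of record.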
API gateway for gradual traffic shift
We deployed Kong as an API gateway in front of both the monolith and new services. Traffic routing rules allowed us to shift 5% → 25% → 100% of requests to new services per endpoint, with instant rollback capability if error rates spiked.
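The routing logic is similar in spirit to this sketch; the endpoint names and percentages are illustrative, and in practice the rules lived in Kong's configuration rather than application code.

```ruby
require "zlib"

# Per-endpoint rollout percentages: what share of traffic hits the new service.
ROLLOUT = {
  "/api/notifications" => 100, # fully cut over
  "/api/appointments"  => 25,  # mid-rollout
  "/api/billing"       => 5    # canary
}.freeze

# Deterministic hashing keeps a given caller pinned to the same backend
# across requests; unknown endpoints stay on the monolith.
def route_to_new_service?(endpoint, caller_id)
  pct    = ROLLOUT.fetch(endpoint, 0)
  bucket = Zlib.crc32("#{endpoint}:#{caller_id}") % 100
  bucket < pct
end
```

Rolling back an endpoint is just dropping its percentage to 0, which is what made spikes in error rates cheap to react to.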
PHI boundary enforcement
Extracting services in a HIPAA environment required ensuring PHI never crossed service boundaries unencrypted and that each service maintained its own audit log. We added field-level encryption at the Kafka producer and decryption at the consumer, with a dedicated CloudWatch log group per service for its audit trail.
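A minimal sketch of the field-level encryption step using AES-256-GCM from Ruby's OpenSSL standard library. Key management (KMS, rotation) is omitted, and the PHI field names are illustrative.

```ruby
require "openssl"
require "base64"

# Fields treated as PHI in this sketch; real classification was per-schema.
PHI_FIELDS = %i[patient_name ssn].freeze

# Encrypt one field; the 12-byte IV and 16-byte auth tag travel with the ciphertext.
def encrypt_field(key, plaintext)
  cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
  cipher.key = key
  iv = cipher.random_iv
  ciphertext = cipher.update(plaintext) + cipher.final
  Base64.strict_encode64(iv + ciphertext + cipher.auth_tag)
end

def decrypt_field(key, encoded)
  raw = Base64.strict_decode64(encoded)
  iv, ciphertext, tag = raw[0, 12], raw[12..-17], raw[-16..]
  decipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
  decipher.key = key
  decipher.iv = iv
  decipher.auth_tag = tag
  decipher.update(ciphertext) + decipher.final
end

# Encrypt only the PHI fields of an event payload before it is published.
def encrypt_phi(key, payload)
  payload.to_h { |k, v| [k, PHI_FIELDS.include?(k) ? encrypt_field(key, v) : v] }
end
```

Non-PHI fields stay in the clear so consumers can route and filter events without decrypting anything.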
Key Takeaways
- Strangler Fig works - extract at the edges with clear data ownership, never start in the middle of the monolith
- Kafka as the communication layer allows the monolith and services to evolve at different speeds without synchronous coupling
- A 2-week dual-write sync period before cutting over is the safety net that makes database extraction non-scary
- Avoid the full rewrite - incremental extraction with a working monolith in production is almost always safer and faster
Facing Similar Challenges?
Book a free 30-minute audit and I will tell you what I see.