The Monolith to Services Journey

You start with a monolith. At some point, you extract a service. Then another. Then maybe a few more. This is the normal trajectory of a growing software company.

The key word is "journey." This is not a one-time migration event. It is a gradual evolution driven by real problems, not blog posts about microservices.

The trigger for extracting a service is never "microservices are best practice." The trigger is a specific, measurable problem in your monolith that a service boundary would solve. If you cannot articulate the problem in a sentence, you are not ready to extract a service.

The Natural Evolution

Companies that successfully decompose a monolith tend to follow roughly the same path:

Phase 1 — Pure monolith (1-10 engineers):
  One codebase, one database, one deploy.
  Everyone works in the same repo.
  Coordination is easy because the team is small.
  Duration: 1-3 years

Phase 2 — Monolith with extracted services (10-30 engineers):
  Core product is still the monolith.
  1-3 services extracted for specific reasons.
  Most development still happens in the monolith.
  Duration: 1-3 years

Phase 3 — Service-oriented (30-100 engineers):
  Multiple teams own multiple services.
  Monolith still exists but is shrinking.
  Shared platform for deployment, monitoring, logging.
  Duration: ongoing

Phase 4 — Mature services (100+ engineers):
  Clear service boundaries along team lines.
  Internal platform team supports other teams.
  Monolith may still exist as one service among many.
  Duration: ongoing

Most startups will spend their entire life in Phase 1. Many successful companies stay in Phase 2 for years. Phases 3 and 4 are for companies with serious scale — both in traffic and in team size.

The Real Trigger: Team Coordination Cost

The technical argument for microservices is about scaling. The actual reason companies extract services is about people.

When two teams frequently need to change the same code, they step on each other. Merge conflicts. Broken tests from someone else's changes. Waiting for code review from a team that is busy with their own priorities. Deploys blocked because another team's change is not ready.

This coordination cost is low when you have 5 engineers. It becomes painful around 15-20. Past 30, it becomes a real drag on productivity.

Coordination cost signals:
- Teams blocking each other's deploys multiple times per week
- Merge conflicts in shared code happening daily
- Engineers waiting 1-2 days for code review from another team
- A change in one area breaks tests in an unrelated area
- Deploy frequency dropping because of cross-team coordination

Not coordination cost signals:
- "The codebase is getting big" (that is normal)
- "I want to use a different language" (not a reason for services)
- "Microservices are industry standard" (not a reason for anything)

When two teams can deploy independently without coordinating, both teams move faster. That is the actual benefit of service extraction. Not performance. Not scalability. Team independence.
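
The signals listed above can be turned into a rough check against your own numbers. The following is a minimal sketch, not a standard metric: the `WeeklyStats` fields and the thresholds (2 blocked deploys a week, 5 shared-code conflicts a week, 1 day of review wait) are assumptions chosen to mirror the list, and you would need to collect the inputs from your own CI and code review tools.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class WeeklyStats:
    """One week of hypothetical delivery metrics for the monolith."""
    blocked_deploys: int         # deploys delayed by another team's change
    shared_merge_conflicts: int  # conflicts in code touched by several teams
    review_wait_days: float      # typical wait for a cross-team review
    deploys: int                 # total deploys that week


def coordination_signals(weeks: list[WeeklyStats]) -> list[str]:
    """Return the names of the coordination-cost signals that fire."""
    if len(weeks) < 2:
        raise ValueError("need at least two weeks of data to see a trend")
    signals = []
    if mean(w.blocked_deploys for w in weeks) >= 2:
        signals.append("blocked deploys")
    if mean(w.shared_merge_conflicts for w in weeks) >= 5:
        signals.append("daily merge conflicts")
    if mean(w.review_wait_days for w in weeks) >= 1.0:
        signals.append("slow cross-team review")
    # Compare the later half of the period with the earlier half.
    half = len(weeks) // 2
    if mean(w.deploys for w in weeks[half:]) < mean(w.deploys for w in weeks[:half]):
        signals.append("falling deploy frequency")
    return signals
```

If none of the signals fire, the problem you are feeling is probably not coordination cost, and a service boundary is unlikely to fix it.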

What to Extract First

The first service you extract should be:

  1. Clearly bounded — it has a well-defined interface with the rest of the system
  2. Independently deployable — changes to it rarely require changes in the monolith
  3. Owned by a specific team — one team is responsible for it, not shared
  4. High-change — it changes frequently enough that independent deployment matters

Good first service extractions:
- Email/notification service (clear interface, independent)
- Image/file processing (CPU-intensive, different scaling needs)
- Auth service (stable interface, security benefits from isolation)
- Search service (different technology needs, clear interface)
- Payment processing (compliance benefits from isolation)

Bad first service extractions:
- "The user model" (too intertwined with everything)
- "The API layer" (not a service, it's a gateway)
- "The admin panel" (use Retool instead)
- "All the business logic" (that's just another monolith)

Example: Extracting an Email Service

Your monolith handles transactional emails inline. When a user signs up, the signup controller sends a welcome email synchronously. When an order ships, the orders controller sends a shipping notification. Email sending code is scattered across 15 different files.

Before extraction:
  Monolith
  ├── controllers/signup.py     -> sends welcome email
  ├── controllers/orders.py     -> sends shipping email
  ├── controllers/billing.py    -> sends receipt email
  ├── services/email.py         -> email sending logic
  └── templates/emails/         -> 20 email templates

After extraction:
  Monolith
  ├── controllers/signup.py     -> publishes "user.created" event
  ├── controllers/orders.py     -> publishes "order.shipped" event
  └── controllers/billing.py    -> publishes "payment.completed" event

  Email Service (separate repo, separate deploy)
  ├── handlers/user_created.py  -> sends welcome email
  ├── handlers/order_shipped.py -> sends shipping email
  ├── handlers/payment.py       -> sends receipt email
  ├── templates/                -> all email templates
  └── delivery/                 -> email provider integration

The monolith no longer knows or cares how emails are sent. The email service can be deployed, scaled, and modified independently. The email team (or the one engineer who owns emails) can work without touching the monolith.
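
To make the "after" layout concrete, here is a minimal sketch of the signup path. It runs in one process for illustration: `InMemoryBus` is a hypothetical stand-in for whatever broker you use, and the handler records the email in a list instead of calling a provider. The event name `user.created` comes from the layout above; the payload fields are assumptions.

```python
import json
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryBus:
    """Stand-in for a real message broker (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Round-trip through JSON so only serializable data crosses the
        # boundary, as it would over a network.
        message = json.loads(json.dumps(payload))
        for handler in self._handlers[event_type]:
            handler(message)


# --- monolith side: controllers/signup.py ---
def signup(bus: InMemoryBus, user_id: int, email: str) -> None:
    # ... create the user record here ...
    bus.publish("user.created", {"user_id": user_id, "email": email})


# --- email service side: handlers/user_created.py ---
sent: list[tuple[str, str]] = []  # (recipient, template), instead of real delivery


def handle_user_created(event: dict) -> None:
    sent.append((event["email"], "welcome"))


bus = InMemoryBus()
bus.subscribe("user.created", handle_user_created)
signup(bus, 42, "ada@example.com")
```

Note that `signup` has no import from the email code. The only thing the two sides share is the event name and the shape of its payload.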

The Extraction Process

Extracting a service from a monolith is surgery. Do it carefully.

Step 1: Draw the Boundary

Before writing any code, identify exactly what moves out of the monolith and what stays. Map every place the monolith interacts with the code you want to extract.

Boundary analysis for email extraction:
- 15 places in the monolith that send emails
- 3 database tables related to email (templates, logs, preferences)
- 2 background jobs that send batch emails
- 1 admin interface for managing templates
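
Counting call sites by hand is error-prone, so it can help to script the first item in that list. A minimal sketch using Python's standard `ast` module follows. It assumes the monolith sends email through a function named `send_email`, which is a hypothetical name; substitute whatever your codebase actually calls.

```python
import ast


def find_call_sites(source: str, filename: str,
                    func_name: str = "send_email") -> list[tuple[str, int]]:
    """Return (filename, line) for every call to func_name in the source."""
    sites = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        target = node.func
        if isinstance(target, ast.Name):         # send_email(...)
            name = target.id
        elif isinstance(target, ast.Attribute):  # email.send_email(...)
            name = target.attr
        else:
            continue
        if name == func_name:
            sites.append((filename, node.lineno))
    return sorted(sites)
```

Run it over every `.py` file in the repository (for example with `pathlib.Path.rglob`) and you get the list of callers to migrate. It will miss dynamic calls and aliased imports, so treat the result as a starting point, not a complete inventory.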

Step 2: Create the Interface

Define how the monolith will communicate with the new service. Keep it simple. HTTP REST is fine. A message queue is fine. Do not over-engineer the communication layer.

Interface options:
- HTTP API: Simple, synchronous, easy to debug
- Message queue: Asynchronous, resilient, harder to debug
- Direct database access: No. Never share databases between services.

For email, a message queue makes sense because:
- Email sending is inherently asynchronous
- Failures should be retried, not returned to the user
- Volume can spike and the queue acts as a buffer
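
Whichever transport you pick, the interface is really the message itself. Here is one possible shape for it, as a sketch: the `Event` class and all of its field names are assumptions, not a standard. The two fields worth copying are an ID, so the consumer can ignore duplicates when a message is retried, and a version, so the two sides can change the payload without deploying in lockstep.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Event:
    """Hypothetical envelope for events the monolith publishes."""
    event_id: str            # unique per event; lets consumers skip duplicates
    event_type: str          # e.g. "user.created", "order.shipped"
    data: dict               # event-specific fields
    schema_version: int = 1  # bump when the shape of `data` changes

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "Event":
        fields = json.loads(raw)
        version = fields.get("schema_version")
        if version != 1:
            # Fail loudly on versions this consumer does not understand.
            raise ValueError(f"unsupported schema_version: {version!r}")
        return cls(**fields)
```

The monolith calls `to_json()` before putting the message on the queue, and the email service calls `Event.from_json()` when it takes one off.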

Step 3: Build the Service

Build the new service alongside the monolith. Do not remove anything from the monolith yet. Run both in parallel.

Step 4: Migrate Gradually

Move callers from the monolith one at a time. First, the signup flow sends emails through the new service. Then billing. Then orders. Each migration is a small, reversible change.

Migration checklist per caller:
- [ ] Update caller to use new service
- [ ] Verify emails are sent correctly
- [ ] Monitor error rates for 1 week
- [ ] Remove old email code from the caller
- [ ] Repeat for next caller
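
One way to keep each step small and reversible is to route per caller behind a flag. The sketch below assumes both paths can be wrapped in a function that takes a recipient and a template name; `EmailRouter` and the caller names are hypothetical.

```python
from typing import Callable

Sender = Callable[[str, str], None]  # (recipient, template)


class EmailRouter:
    """Sends each caller's email through the old or the new path."""

    def __init__(self, legacy: Sender, service: Sender, migrated: set[str]) -> None:
        self._legacy = legacy            # existing in-monolith email code
        self._service = service          # client for the new email service
        self._migrated = set(migrated)   # callers already moved over

    def send(self, caller: str, recipient: str, template: str) -> None:
        target = self._service if caller in self._migrated else self._legacy
        target(recipient, template)

    def migrate(self, caller: str) -> None:
        self._migrated.add(caller)

    def roll_back(self, caller: str) -> None:
        self._migrated.discard(caller)
```

Start with `migrated={"signup"}`, watch the error rate, then call `migrate("billing")`. If something goes wrong, `roll_back` returns a caller to the old path without a code change in the caller itself. In practice the set of migrated callers would live in your feature flag system, not in memory.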

Step 5: Remove the Old Code

Once all callers use the new service, remove the email code from the monolith. This is the satisfying part. Delete the old code. Then drop the email tables from the monolith's database, once the email service holds its own copy of that data.

Communication Between Services

The hardest part of a service architecture is how services talk to each other. There are two primary patterns:

Synchronous (HTTP/gRPC)

Service A calls Service B and waits for a response.

Pros:
- Simple to understand and debug
- Request/response model is familiar
- Easy to trace failures

Cons:
- Service A fails if Service B is down
- Latency adds up with multiple calls
- Creates tight coupling between services
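
The first con is the one that bites in production, and a timeout plus a fallback softens it. Below is a minimal sketch using only the standard library. The function name `fetch_json` and the 2-second default are assumptions; the `opener` parameter exists only so the function can be exercised without a network.

```python
import json
import urllib.request


def fetch_json(url: str, timeout_s: float = 2.0, opener=urllib.request.urlopen):
    """GET a JSON document; return None if the other service is down or slow."""
    try:
        with opener(url, timeout=timeout_s) as response:
            return json.load(response)
    except OSError:
        # URLError, connection errors, and socket timeouts all derive
        # from OSError, so one clause covers "Service B is unavailable".
        return None
```

The caller decides what `None` means: show a page without the optional panel, use a cached value, or fail the request. Without the timeout, a slow Service B makes Service A slow too.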

Asynchronous (Message Queue)

Service A publishes an event. Service B consumes it whenever it is ready.

Pros:
- Service A does not depend on Service B's availability
- Handles traffic spikes (queue acts as buffer)
- Loose coupling between services

Cons:
- Harder to debug (events are fire-and-forget)
- Eventual consistency (not immediate)
- Adds infrastructure (the queue itself)
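
The debugging problem is more manageable if failed messages are retried a bounded number of times and then set aside, not dropped. The following sketch shows the idea with the standard library's `queue.Queue` standing in for a real broker. The `drain` function and the limit of 3 attempts are assumptions; real brokers usually provide retries and dead-letter queues themselves.

```python
import queue


def drain(q: queue.Queue, handler, max_attempts: int = 3) -> list:
    """Process everything in the queue; return messages that never succeeded.

    Queue items are (message, attempts_so_far) tuples.
    """
    dead_letters = []
    while True:
        try:
            message, attempts = q.get_nowait()
        except queue.Empty:
            return dead_letters
        try:
            handler(message)
        except Exception:
            if attempts + 1 < max_attempts:
                q.put((message, attempts + 1))  # back of the queue, retry later
            else:
                dead_letters.append(message)    # give up, keep for inspection
```

Publishers enqueue `(message, 0)`. Anything that ends up in the returned list is a message someone needs to look at, which is exactly the visibility that fire-and-forget lacks.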

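For most startups extracting their first service, synchronous HTTP is the right default. It is simpler, and the coupling concern is largely theoretical at small scale. The exception is work that is asynchronous by nature, such as the email example above, where a queue fits from the start. Otherwise, switch to async when you have evidence that sync is causing problems.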

Data Ownership

The most important rule of service extraction: each service owns its data. No shared databases.

Wrong:
  Service A -> reads/writes -> Shared Database <- reads/writes <- Service B
  (Changes to the schema break both services)

Right:
  Service A -> reads/writes -> Database A
  Service B -> reads/writes -> Database B
  Service A -> API call -> Service B (when it needs Service B's data)

Shared databases are the single biggest source of coupling in service architectures. When two services share a database, they are not really separate services — they are a distributed monolith, which is worse than a regular monolith.
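
The "Right" layout can be sketched in a few lines. In this illustration each service gets its own in-memory SQLite database, and a plain method call stands in for the API call between them. The class names, tables, and methods are all hypothetical.

```python
import sqlite3
from typing import Optional


class UserService:
    """Owns the users table. Nothing else connects to its database."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
        )

    def create_user(self, email: str) -> int:
        cur = self._db.execute("INSERT INTO users (email) VALUES (?)", (email,))
        self._db.commit()
        return cur.lastrowid

    def get_email(self, user_id: int) -> Optional[str]:
        row = self._db.execute(
            "SELECT email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


class EmailService:
    """Owns the email log. Asks UserService for addresses via its API."""

    def __init__(self, users: UserService) -> None:
        self._users = users  # stands in for an HTTP client
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE email_log (user_id INTEGER, template TEXT)")

    def send(self, user_id: int, template: str) -> bool:
        if self._users.get_email(user_id) is None:
            return False  # unknown user, nothing to send
        self._db.execute("INSERT INTO email_log VALUES (?, ?)", (user_id, template))
        self._db.commit()
        return True

    def log_count(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM email_log").fetchone()[0]
```

Because `EmailService` never sees the `users` table, the user team can rename a column or split the table without breaking email, as long as `get_email` keeps its contract.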

When Not to Extract

Sometimes the right answer is to keep things in the monolith and improve the monolith instead.

Instead of extracting a service, consider:
- Better module boundaries within the monolith
- Code ownership rules (team A owns these directories)
- Separate deploy pipelines for different parts of the monolith
- Feature flags for independent feature releases
- Modular monolith patterns (explicit module interfaces)

Shopify pioneered the "modular monolith" approach: clear module boundaries within a single codebase, enforced by tooling. This gives many of the organizational benefits of services without the operational complexity.
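
"Enforced by tooling" can start very small. As an illustration only, and not a description of Shopify's tooling, here is a sketch of an import check you could run in CI for a Python monolith. The `ALLOWED` rule set and the module names are assumptions.

```python
import ast

# Hypothetical rules: which internal packages each module may import from.
ALLOWED = {
    "billing": {"billing", "shared"},
    "orders": {"orders", "billing", "shared"},
}


def boundary_violations(module: str, source: str, internal: set[str]) -> list[str]:
    """Return imports of internal packages that `module` may not depend on."""
    allowed = ALLOWED.get(module, set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue  # not an import, or a relative import inside the module
        for name in names:
            top = name.split(".")[0]
            # Third-party and standard library imports are not checked.
            if top in internal and top not in allowed:
                violations.append(name)
    return sorted(violations)
```

Fail the build when the list is non-empty and the module boundaries stop being a convention and become a rule, without adding a network hop.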

Common Pitfalls

Extracting too early. Extracting a service before you have a team to own it means one engineer now maintains a service and contributes to the monolith. That is worse, not better.

Distributed monolith. Extracting services that are tightly coupled and share a database. You now have all the complexity of a distributed system with none of the independence benefits.

Too many services too fast. Extracting five services in six months when you have 10 engineers. Each service needs monitoring, deployment, documentation, and on-call rotation. You cannot support that.

Not investing in platform. Services need shared infrastructure: logging, monitoring, deployment, service discovery. Without a platform team or shared tooling, each service reinvents the wheel.

Nano-services. A service that does one tiny thing and could be a function call. If your service has 200 lines of code, it should probably be a library, not a service.

Ignoring the data problem. Extracting the code but leaving the data in the shared database. Moving the data is the hardest part of an extraction and the part teams most often skip.

Key Takeaways

  • The monolith-to-services journey is gradual, driven by real problems, primarily team coordination cost.
  • The trigger for extraction is measurable: teams blocking each other's deploys, daily merge conflicts, declining deploy frequency.
  • Extract services that have clear boundaries, independent deployment needs, and a team to own them.
  • Each service owns its data. No shared databases. This is non-negotiable.
  • Synchronous HTTP is fine for your first extracted service. Add async messaging when you have evidence you need it.
  • Consider modular monolith patterns before extracting services. You can get organizational benefits without operational complexity.
  • Most startups will spend their entire life in the monolith phase. That is normal and correct.