
The Framework

System design interviews are not about producing a perfect architecture. They are about demonstrating how you think through ambiguous, large-scale problems. The interviewer has seen hundreds of candidates. What separates the strong ones is not knowing the "right" answer — it is driving a structured conversation, making explicit tradeoffs, and adapting when new constraints surface. You have 45 minutes. Here is how to spend them.

The 45-Minute Structure

Phase 1: Requirements Clarification     5 minutes
Phase 2: Back-of-Envelope Estimation    5 minutes
Phase 3: High-Level Design             15 minutes
Phase 4: Deep Dive                     15 minutes
Phase 5: Wrap-Up & Discussion           5 minutes

This is not a rigid script. Some interviews skip estimation. Some spend more time on deep dives. But having this structure in your head keeps you from the most common failure mode: spending 30 minutes on one aspect and running out of time before touching others.

Phase 1: Requirements Clarification (5 Minutes)

Do not start designing. Start asking questions. The prompt is intentionally vague — "Design Twitter" means nothing until you define scope.

Functional Requirements

What does the system actually do? Pin down the core features.

Good questions for "Design a chat system":
  - One-on-one messages only, or group chat as well?
  - Do we need message history? How far back?
  - Read receipts? Typing indicators?
  - File/image sharing or text only?
  - Do messages need to be encrypted end-to-end?

Pick 3-4 core features. Explicitly state what you are not designing. "I will focus on one-on-one messaging, message persistence, and online presence. I will defer group chat, media sharing, and end-to-end encryption for now." This shows prioritization, which is exactly what the interviewer wants to see.

Non-Functional Requirements

These are the constraints that shape the architecture.

Key non-functional requirements to ask about:
  - Scale: How many users? DAU? Peak concurrent?
  - Latency: What is acceptable? Sub-100ms for reads?
  - Availability: Can we tolerate downtime? (Usually no.)
  - Consistency: Strong or eventual? Can users see stale data briefly?
  - Durability: Can we lose messages? (Usually no.)

If the interviewer says "you decide," that is not a dodge — they want you to state assumptions and justify them. "I will assume 50 million DAU with 10% concurrent at peak. Messages must not be lost. Eventual consistency is acceptable for presence indicators, but messages should be strongly consistent within a conversation."

Phase 2: Back-of-Envelope Estimation (5 Minutes)

Some interviewers care about this, some do not. Either way, quick math grounds your design in reality and prevents you from over- or under-engineering.

What to Estimate

- QPS (queries per second): DAU * actions per user per day / 86,400 seconds per day
- Storage: message size * messages per day * retention period in days
- Bandwidth: QPS * average payload size
- Memory: size of the working set of hot data you plan to cache

Example: Chat System Estimation

50M DAU, each sends 40 messages/day
  Write QPS: 50M * 40 / 86400 ~ 23,000 writes/sec
  Peak (2x average): ~46,000 writes/sec

Message size: ~200 bytes text + 100 bytes metadata = 300 bytes
  Daily storage: 50M * 40 * 300 bytes ~ 600 GB/day
  5-year retention: ~1 PB

Read-heavy (users read more than they write):
  Assume 10:1 read/write ratio: 23,000 * 10 ~ 230,000 reads/sec on average

Do not spend more than 5 minutes on this. Round aggressively. The point is to know whether you need a single database or a distributed system, whether caching is necessary, and whether you need to shard.
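The chat-system arithmetic above can be reproduced in a few lines, which is a handy way to check your rounding while practicing. This is only a sketch: the function name `estimate` and its parameter names are made up for illustration, and the 2x peak factor and 10:1 read/write ratio are the same assumptions used in the example, not measured values.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, msgs_per_user_per_day, bytes_per_msg,
             retention_years, peak_factor=2, read_write_ratio=10):
    """Return rough capacity numbers for a chat workload."""
    msgs_per_day = dau * msgs_per_user_per_day
    write_qps = msgs_per_day / SECONDS_PER_DAY
    daily_bytes = msgs_per_day * bytes_per_msg
    return {
        "write_qps": write_qps,
        "peak_write_qps": write_qps * peak_factor,
        "read_qps": write_qps * read_write_ratio,
        "daily_storage_gb": daily_bytes / 1e9,
        "retention_storage_pb": daily_bytes * 365 * retention_years / 1e15,
    }


# Inputs taken from the worked example: 50M DAU, 40 messages each, 300 bytes.
numbers = estimate(dau=50_000_000, msgs_per_user_per_day=40,
                   bytes_per_msg=300, retention_years=5)
for name, value in numbers.items():
    print(f"{name}: {value:,.1f}")
```

The exact outputs (about 23,148 writes/sec, 600 GB/day, roughly 1.1 PB over five years) matter less than their order of magnitude, which is what tells you whether sharding and caching belong in the design.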

Phase 3: High-Level Design (15 Minutes)

This is the core of the interview. Sketch the major components and how they interact.

Start With the API

Define the key API endpoints before drawing boxes. This forces you to think about the interface before the implementation.

Chat system APIs:
  POST /messages          - send a message
  GET  /messages?conv=X   - fetch message history
  WS   /ws/connect        - websocket for real-time delivery
  GET  /users/:id/status  - get online/offline status
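It can help to sketch the payload shapes behind these endpoints as well. The following is one possible shape, not a prescribed contract: every class and field name here (`SendMessageRequest`, `next_cursor`, `last_seen_ms`, and so on) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SendMessageRequest:      # body of POST /messages
    conversation_id: str
    sender_id: str
    text: str


@dataclass
class Message:                 # one item returned by GET /messages?conv=X
    message_id: str
    conversation_id: str
    sender_id: str
    text: str
    sent_at_ms: int


@dataclass
class MessagePage:             # response of GET /messages?conv=X
    messages: List[Message] = field(default_factory=list)
    next_cursor: Optional[str] = None   # None means there are no older messages


@dataclass
class UserStatus:              # response of GET /users/:id/status
    user_id: str
    online: bool
    last_seen_ms: Optional[int] = None


request = SendMessageRequest("conv-1", "alice", "hi bob")
stored = Message("m-1", request.conversation_id, request.sender_id,
                 request.text, sent_at_ms=1_700_000_000_000)
page = MessagePage(messages=[stored])
print(page)
```

Writing the cursor into the history response early is a small design choice that invites the pagination question before the interviewer has to ask it.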

Draw the Components

Typical components in a system design:
  - Client (web, mobile)
  - Load balancer
  - API servers (stateless)
  - WebSocket servers (stateful for real-time)
  - Message queue (async processing)
  - Database (persistent storage)
  - Cache (hot data, reduce DB load)
  - CDN (static content)
  - Notification service (push notifications)

Talk Through the Flow

Walk through a concrete user action. "User A sends a message to User B. The message hits the load balancer, which routes it to an API server. The server writes the message to the database and pushes it onto the message queue. The queue delivers it to the WebSocket server holding User B's connection. If User B is offline, the notification service sends a push notification instead."

This narrative approach is far more effective than silently drawing boxes. The interviewer follows your reasoning, catches misunderstandings early, and can steer you.
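The send-message walkthrough can also be traced as a toy, single-process simulation. This sketch assumes in-memory stand-ins for every component (a list for the database, a deque for the queue, a dict for open WebSocket connections); the function names `send_message` and `deliver_pending` are illustrative, not part of any real framework.

```python
from collections import deque

database = []            # stand-in for the persistent message store
queue = deque()          # stand-in for the message queue
connections = {}         # user_id -> list acting as an open WebSocket
push_notifications = []  # (user_id, text) pairs sent to offline users


def send_message(sender, recipient, text):
    """API server: persist first, then enqueue for delivery."""
    message = {"from": sender, "to": recipient, "text": text}
    database.append(message)
    queue.append(message)


def deliver_pending():
    """Delivery worker: drain the queue to sockets or push notifications."""
    while queue:
        message = queue.popleft()
        socket = connections.get(message["to"])
        if socket is not None:
            socket.append(message)            # recipient is online
        else:
            push_notifications.append((message["to"], message["text"]))


connections["bob"] = []                 # bob is online, carol is not
send_message("alice", "bob", "hi bob")
send_message("alice", "carol", "hi carol")
deliver_pending()
print(connections["bob"])
print(push_notifications)
```

Persisting before enqueueing is the ordering the walkthrough describes; it means a message that never reaches a socket is still in the database for the next history fetch.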

Make Tradeoffs Explicit

Every design decision has a tradeoff. State it.

"I am using a message queue between the API server and WebSocket
server. This adds latency — maybe 50-100ms — but decouples the
two services. If the WebSocket tier goes down, messages are not
lost. The tradeoff is complexity and slight delay."

The interviewer is not looking for the "right" tradeoff. They are looking for evidence that you see the tradeoff and can reason about it.
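To make the decoupling claim concrete, here is a minimal sketch of a queue that removes a message only after the consumer handles it successfully. It is a toy model under stated assumptions: `AckQueue` and `push_to_websocket` are hypothetical names, and a real broker would add persistence, redelivery limits, and concurrency that are omitted here.

```python
from collections import deque


class AckQueue:
    """Toy queue: a message is removed only after the consumer succeeds."""

    def __init__(self):
        self._pending = deque()

    def publish(self, message):
        self._pending.append(message)

    def consume(self, handler):
        """Deliver messages in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            try:
                handler(self._pending[0])
            except ConnectionError:
                break                    # keep the message for a later retry
            self._pending.popleft()      # acknowledge only after success
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)


websocket_tier_up = False
received = []


def push_to_websocket(message):
    """Stand-in for the WebSocket tier; fails while the tier is down."""
    if not websocket_tier_up:
        raise ConnectionError("WebSocket tier unavailable")
    received.append(message)


q = AckQueue()
q.publish("hello")
q.publish("are you there?")
first_attempt = q.consume(push_to_websocket)    # tier is down, nothing delivered
websocket_tier_up = True
second_attempt = q.consume(push_to_websocket)   # tier is back, queue drains
print(first_attempt, second_attempt, received)
```

The cost side of the tradeoff shows up here too: there is an extra component to operate, and delivery waits until the consumer is healthy again.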

Phase 4: Deep Dive (15 Minutes)

The interviewer will pick one or two areas to explore in depth. Sometimes they tell you which. Sometimes they ask "what would you like to go deeper on?" Have opinions about what is interesting.

Common Deep Dive Areas

Data model:
  What tables/collections? How are they indexed?
  Partitioning strategy? Partition key choice?

Scaling:
  How do you handle 10x traffic? Which component is the bottleneck?
  Horizontal vs vertical scaling? Sharding strategy?

Reliability:
  What happens when a server dies? Data replication?
  How do you handle network partitions?

Consistency:
  How do you ensure message ordering in a distributed system?
  What happens during a split-brain scenario?

Caching:
  What do you cache? Cache invalidation strategy?
  Cache-aside, write-through, or write-behind?
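If caching comes up, it helps to be able to sketch one strategy precisely. Below is a minimal cache-aside example with a TTL and explicit invalidation. It is an in-memory illustration only: the class `CacheAside`, the loader `load_conversation`, and the dict standing in for the database are all assumptions made for this sketch, not a real cache client.

```python
import time


class CacheAside:
    """Cache-aside: check the cache, load from the database on a miss,
    then store the result with a time-to-live."""

    def __init__(self, load_from_db, ttl_seconds, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}          # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                         # cache hit
        value = self._load(key)                     # miss: go to the database
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call after a write so the next read reloads fresh data."""
        self._entries.pop(key, None)


db = {"conv-1": ["hi"]}     # stand-in for the message store
db_reads = []               # records every trip to the "database"


def load_conversation(conv_id):
    db_reads.append(conv_id)
    return list(db[conv_id])


cache = CacheAside(load_conversation, ttl_seconds=24 * 3600)
cache.get("conv-1")              # miss, reads the database
cache.get("conv-1")              # hit, served from memory
db["conv-1"].append("hello")     # a write lands in the database
cache.invalidate("conv-1")       # drop the stale entry
latest = cache.get("conv-1")     # miss again, reloads fresh data
print(len(db_reads), latest)
```

Invalidate-on-write keeps the sketch simple; write-through or write-behind would instead update the cache as part of the write path, with different consistency and durability tradeoffs.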

How to Handle the Deep Dive

Go concrete. The interviewer asks about your database choice. Do not say "I would use a NoSQL database." Say:

"For the message store, I would use Cassandra. The access pattern is
write-heavy and time-series-like: we write messages sequentially and
read them by conversation in reverse chronological order. Cassandra's
partition key would be the conversation_id, and the clustering key
would be the message timestamp. This gives us efficient writes and
range queries within a conversation. The tradeoff is we give up
cross-partition transactions, but our access patterns do not need them."

Specific, justified, with tradeoffs stated. This is what a senior engineer sounds like.
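The access pattern in that answer can be modeled in memory to show why the key choice fits. This is not Cassandra code: `MessageStore` is a hypothetical class in which a dict key plays the role of the partition key (conversation_id) and a sorted list plays the role of the clustering order (timestamp).

```python
import bisect
from collections import defaultdict


class MessageStore:
    """In-memory model: partition by conversation_id, sort by timestamp."""

    def __init__(self):
        # conversation_id -> list of (timestamp_ms, text), kept sorted
        self._partitions = defaultdict(list)

    def write(self, conversation_id, timestamp_ms, text):
        bisect.insort(self._partitions[conversation_id], (timestamp_ms, text))

    def latest(self, conversation_id, limit):
        """Newest-first read that touches a single partition (limit >= 1)."""
        rows = self._partitions.get(conversation_id, [])
        return rows[-limit:][::-1]

    def before(self, conversation_id, timestamp_ms, limit):
        """Page backwards: newest rows strictly older than timestamp_ms."""
        rows = self._partitions.get(conversation_id, [])
        end = bisect.bisect_left(rows, (timestamp_ms,))
        return rows[max(0, end - limit):end][::-1]


store = MessageStore()
store.write("conv-1", 1000, "first")
store.write("conv-1", 3000, "third")
store.write("conv-1", 2000, "second")   # out-of-order arrival
store.write("conv-2", 1500, "other conversation")
print(store.latest("conv-1", 2))
print(store.before("conv-1", 3000, 10))
```

Every read stays inside one partition, which is the property the quoted answer relies on. A natural follow-up is what happens when two messages share a timestamp, which is where a sequence number or time-based unique ID would come in.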

Phase 5: Wrap-Up (5 Minutes)

Summarize what you designed. Call out limitations. Suggest future improvements.

"To summarize: we have a horizontally scalable API tier behind a
load balancer, WebSocket servers for real-time delivery, Cassandra
for message storage, Redis for caching recent conversations and
presence data, and a message queue for reliable delivery.

Limitations I would address next: group chat support (fan-out to
multiple recipients), end-to-end encryption (key exchange protocol),
and message search (would need an Elasticsearch index alongside
Cassandra)."

This shows you know the design is not complete — and that is fine. No system is designed in 45 minutes. The interviewer wants to see self-awareness about gaps.

Driving the Conversation

The biggest differentiator between senior and junior candidates is who drives the conversation.

Junior Pattern

The interviewer asks "Design X." The candidate draws some boxes. The interviewer asks "What database would you use?" The candidate says "MySQL." The interviewer asks "Why?" The candidate says "It is popular." The interviewer drags information out of the candidate for 45 minutes.

Senior Pattern

The interviewer asks "Design X." The candidate asks clarifying questions, states assumptions, estimates scale, proposes a design, explains tradeoffs, identifies the most interesting component, and dives deep — all while narrating their thought process. The interviewer barely needs to prompt.

You want to follow the second pattern. Prepare a mental checklist and work through it without being told. If you get stuck, say so: "I am not sure about the best approach for message ordering here. Let me think through two options." That is better than silence.

What the Interviewer Actually Evaluates

Strong signals:
  - Asks good clarifying questions
  - Makes and states assumptions
  - Structures the approach before diving in
  - Explains tradeoffs, not just choices
  - Goes deep on at least one area
  - Adapts when the interviewer changes constraints
  - Identifies bottlenecks and failure modes

Negative signals:
  - Jumps straight into drawing boxes
  - Cannot estimate scale
  - Makes choices without justification
  - Designs a single-server system for a billion-user product
  - Cannot go deeper than "use a load balancer"
  - Gets defensive when the interviewer pushes back

Common Pitfalls

  • Starting to design before understanding the problem. Every minute spent on requirements saves five minutes of wasted design. Candidates who skip requirements end up designing the wrong system.
  • Over-engineering from the start. You do not need Kubernetes, service mesh, and event sourcing for every problem. Start simple and add complexity only when the requirements demand it.
  • Ignoring non-functional requirements. A design that works for 1,000 users and a design that works for 100 million users look completely different. If you do not ask about scale, you will design the wrong thing.
  • Being too general. "I would use a cache" is not useful. "I would cache the 1,000 most recent messages per conversation in Redis with a TTL of 24 hours" is useful. Specificity demonstrates expertise.
  • Not managing time. Spending 25 minutes on the high-level design and 5 minutes on the deep dive is a common failure. The deep dive is where you demonstrate senior-level thinking. Protect that time.
  • Treating interviewer pushback as criticism. When the interviewer says "What if this service goes down?" they are not saying your design is bad. They are giving you an opportunity to show how you handle failure modes. Embrace it.

Key Takeaways

  • The 45-minute structure (requirements, estimation, high-level, deep dive, wrap-up) is your safety net. It ensures coverage even when you are nervous.
  • Drive the conversation. Narrate your thinking. The interviewer cannot give you credit for thoughts that stay in your head.
  • Every decision is a tradeoff. "I chose X because of A, at the cost of B" is the sentence pattern that earns strong hire ratings.
  • Go deep on something. Breadth without depth signals a surface-level understanding. Pick the most technically interesting component and show mastery.
  • The interviewer wants to see your process, not a production-ready architecture. A thoughtful design with acknowledged gaps beats a "perfect" design delivered in silence.