System Design

These questions are designed to see how you would perform on real-world programming challenges. If your manager asked you to design some system, what would you do?


Design process

  1. Scope the Problem: Ask questions. It is important to understand the scope of the problem and the use cases. Map out the major features that are required.

  2. Make Reasonable Assumptions: Some insight into how the product works helps you come up with reasonable assumptions about user behaviour, quantities, etc.

  3. Draw the Major Components: Draw a diagram of the major components. You might have something like a frontend server (or set of servers) that pulls data from the backend’s data store. You might have another set of servers that crawl the internet for some data, and another set that processes analytics. Draw a picture of what this system might look like.

  4. Identify the Key Issues: For example, if you were designing TinyURL, one situation you might consider is that while some URLs will be infrequently accessed, others can suddenly peak. This might happen if a URL is posted on Reddit or another popular forum. You don’t necessarily want to constantly hit the database.

  5. Redesign for the Key Issues


This is a critical step when designing a big system, so it deserves careful consideration.

  1. Ask Questions
  2. Make Believe: Pretend that the data can all fit on one machine and there are no memory limitations. How would you solve the problem? The answer to this question will provide the general outline for your solution.

  3. Get Real: How much data can you fit on one machine, and what problems will occur when you split up the data? Common problems include figuring out how to logically divide the data up, and how one machine would identify where to look up a different piece of data.

  4. Solve Problems

Key Concepts

Horizontal vs. Vertical Scaling

Load Balancer

A load balancer allows a system to distribute the load evenly, so that one server doesn’t crash and take down the whole system.
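
As a rough illustration of the idea, here is a minimal round-robin sketch. Round-robin is only one possible distribution policy, and the class name and server names are hypothetical, not taken from any particular product.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def pick(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._rotation)


# Hypothetical server names, used only for the demonstration.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.pick() for _ in range(6)]
```

Six requests end up spread as two per server. A real load balancer would also need health checks so that a crashed server is taken out of the rotation.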


Database Denormalization and NoSQL

Joins in an SQL database can become very slow as the data grows.

Denormalization means adding redundant information into a database to speed up reads. For example, imagine a database describing projects and tasks. You might need to get the project name and the task information. Rather than doing a join operation across these tables, you can store the project name within the task table (in addition to the project table).
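
The project and task example can be sketched with Python's built-in sqlite3 module. The table layout and sample rows below are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Denormalized: project_name duplicates projects.name to avoid a join.
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        project_id INTEGER NOT NULL REFERENCES projects(id),
        project_name TEXT NOT NULL,
        title TEXT NOT NULL
    );
    INSERT INTO projects VALUES (1, 'Website redesign');
    INSERT INTO tasks VALUES (10, 1, 'Website redesign', 'Draft wireframes');
""")

# Normalized read: needs a join across both tables.
joined = conn.execute(
    "SELECT p.name, t.title FROM tasks t JOIN projects p ON p.id = t.project_id"
).fetchall()

# Denormalized read: a single-table lookup returns the same answer.
flat = conn.execute("SELECT project_name, title FROM tasks").fetchall()
```

Both queries return the same rows. The price of the faster read is paid on writes: renaming a project now means updating every task row that carries the old name.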

A NoSQL database does not support joins and might structure data in a different way. It is designed to scale better.

todo: what are typical technologies: Hadoop?

Database Partitioning (Sharding)

Sharding means splitting the data across multiple machines while ensuring you have a way of figuring out which data is on which machine. Methods:
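
One common method is hash-based partitioning, where the shard is derived from a hash of the key. The sketch below is a minimal version under stated assumptions: the shard names are hypothetical, and the number of shards is fixed.

```python
import hashlib

SHARDS = ["db-0", "db-1", "db-2", "db-3"]  # hypothetical machine names


def shard_for(key: str) -> str:
    """Map a key to a shard using a stable hash of the key."""
    # hashlib gives the same digest in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


machine = shard_for("alice")  # always the same machine for the same key
```

Any machine can compute where a key lives without asking a central directory. The drawback of this simple modulo scheme is that changing the number of shards remaps most keys.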


Caching

When an application requests a piece of information, it first tries the cache. If the cache does not contain the key, it will then look up the data in the data store.

Almost any solution needs a caching layer.
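
The lookup order described above (cache first, then data store) can be sketched as follows. Plain dictionaries stand in for both the cache and the data store, and the class name is hypothetical. Eviction and invalidation are left out.

```python
class CachedStore:
    """Cache-first reads in front of a slower data store."""

    def __init__(self, data_store):
        self.data_store = data_store   # stand-in for a database
        self.cache = {}                # stand-in for an in-memory cache
        self.store_reads = 0           # counts trips to the data store

    def get(self, key):
        # 1. Try the cache first.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, fall back to the data store and remember the value.
        self.store_reads += 1
        value = self.data_store.get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = CachedStore({"user:1": "Ada", "user:2": "Grace"})
first = store.get("user:1")    # miss: read from the data store
second = store.get("user:1")   # hit: served from the cache
```

The second read never touches the data store, which is the whole point when a key suddenly becomes popular.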

Asynchronous Processing & Queues

Slow jobs can be put on a queue and processed asynchronously. If we were running a forum, one of these jobs might be to re-render a page that lists the most popular posts and the number of comments. That list might end up being slightly out of date, but that’s perhaps okay.
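
A minimal sketch of the forum example, using Python's standard queue and threading modules. The post titles, comment counts, and job name are made up for illustration.

```python
import queue
import threading

posts = {"Intro thread": 12, "Release notes": 40, "Off-topic": 3}  # title -> comments
rendered_page = []      # the cached "most popular posts" page
jobs = queue.Queue()


def worker():
    """Process re-render jobs in the background, one at a time."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: stop the worker
            jobs.task_done()
            break
        # Rebuild the popular-posts list from the current comment counts.
        ranked = sorted(posts.items(), key=lambda item: item[1], reverse=True)
        rendered_page[:] = ranked
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

jobs.put("rerender")    # a request handler would only enqueue and return at once
jobs.put(None)
jobs.join()             # only for the demo: wait until the queue is drained
thread.join()
```

Between two re-render jobs the page is served as it was last rendered, which is where the "slightly out of date" trade-off comes from.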

Networking Metrics (Logging)

Technology examples: Kafka for real-time logging.


Service-Oriented Architecture (SOA)

Provisioning is …

Containers are used… Docker




Twitter

Following this demo. Key features include: (1) tweeting, (2.a) user timeline, (2.b) home timeline, (3) following. When a tweet is posted, it goes through a load balancer and then to a Redis database, which updates the home timelines of all users who follow the author. This has two problems. First, when someone has a lot of followers, there are a lot of timelines to update, which is slow. A mixed approach helps: tweets from very popular accounts are stored in an SQL database, and only those few accounts are merged into a timeline at runtime. Second, inactive users use up resources. To reduce their strain on the servers, only use Redis to update timelines for users active in the last 2 weeks.

For followers, keep a table for each user that tells Redis which timelines to update. This general approach is called Push on Change: every affected user's data is updated when a change is submitted.
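
Push on Change can be sketched with in-memory dictionaries standing in for Redis and the follower table. All names are hypothetical, and the timeline cap of 800 entries is an assumption made for the example.

```python
from collections import defaultdict, deque

followers = defaultdict(set)     # author -> set of followers
active_users = set()             # users seen in the last two weeks
# Precomputed home timelines; the cap of 800 entries is an assumption.
home_timelines = defaultdict(lambda: deque(maxlen=800))


def follow(follower, author):
    followers[author].add(follower)


def post_tweet(author, text):
    """Fan out on write: push the tweet into each active follower's timeline."""
    for user in followers[author]:
        if user in active_users:     # skip inactive users to save resources
            home_timelines[user].appendleft((author, text))


follow("bob", "alice")
follow("carol", "alice")
active_users.add("bob")              # carol has been inactive
post_tweet("alice", "hello")
```

Reading a home timeline is then a single lookup, while the cost of posting grows with the number of active followers. That cost is why very popular accounts need separate handling.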

Redis is an in-memory key-value database that can be distributed across many machines; it is one kind of NoSQL store. To find the machines holding the relevant user data, we can look them up in a hash table where the key is the username.

Limitations of our approach. Space per message is not an issue, since Twitter limits the message length to 140 characters (later raised to 280). But we are replicating a lot of data, since we precompute home timelines for most users. In exchange, this allows quick access to the data.

Messaging service

Following this demo. Key features: (1) one-to-one messaging, (2) sent / delivered / read notifications, (3) push notifications. For the messaging, we can imagine the communication as follows:

Message => L.B. => Server with a queue => client messaging app

The message is deleted from the server once the client app has received it. Receipt could be confirmed by, for example, a TCP-level acknowledgement. The ‘Sent’ indicator is easy: it is the server's confirmation that it received the message. The other two indicators (delivered and read) can be implemented as a special type of message sent in the same way as above. This would allow us to reuse the same architecture.
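
A minimal sketch of this flow, with one in-memory queue per recipient. The class and field names are hypothetical, and for brevity the hand-over to the client and its acknowledgement are collapsed into a single step.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    recipient: str
    kind: str          # "text", "delivered" or "read"
    body: str = ""


class MessageServer:
    def __init__(self):
        self.queues = defaultdict(deque)   # recipient -> pending messages

    def send(self, message):
        """Queue the message; the return value plays the role of 'sent'."""
        self.queues[message.recipient].append(message)
        return "sent"

    def deliver(self, recipient):
        """Hand over pending messages and delete them from the server."""
        pending = list(self.queues[recipient])
        self.queues[recipient].clear()
        # Receipts travel back as ordinary messages of a special kind.
        for message in pending:
            if message.kind == "text":
                self.send(Message(recipient, message.sender, "delivered"))
        return pending


server = MessageServer()
status = server.send(Message("alice", "bob", "text", "hi"))
inbox = server.deliver("bob")          # bob's app comes online
receipts = server.deliver("alice")     # alice's app picks up the receipt
```

A ‘read’ receipt would be one more message with kind "read", sent by the recipient's app through the same send path.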

For push notifications, the timing does not have to be very precise. These notifications could be sent by a separate server that is notified by the main server. This new server can be integrated with Google's push notification APIs for simplicity.


  1. “Cracking the Coding Interview”, 6th edition (book).