8 Aspects to know for System Design Interviews

8 min read · Jan 4, 2021

System design interviews are open-ended, and no answer is 100% right. All we can do is weigh the options and put the best one forward.

Here are some of the aspects to keep in mind while designing your next large scale system.

1. Microservices

What is a micro-service architecture and its’ advantages — Gaurav Sen — YouTube

Monoliths used to be the way of life. A single system capable of doing everything.

Breaking those responsibilities into bite-sized microservices makes a system easier to scale. Each service can be updated and scaled independently, and can use a different tech stack from the others.

However, it brings challenges of its own, such as distributed consensus, fault-tolerant design, health checks, service discovery and monitoring.

The following checklist should tell you just how much the landscape has changed, compared to what is being taught in colleges.

Production Readiness Checklist

Are you ready to go to prod on AWS? Use this checklist to find out.

gruntwork.io

2. Horizontal Scaling

Horizontal vs Vertical Scaling — Gaurav Sen — YouTube

Vertical scaling means adding more resources (CPU, RAM, storage) to an existing machine. However, that is unsustainable at scale, because a single machine can only grow so far.

Horizontal scaling, i.e. adding more machines, is the way forward if Google-like scale is desired. However, dividing the system into smaller pieces has challenges of its own.
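One of those challenges is deciding which machine owns which piece of data. Here is a minimal sketch, assuming a naive hash-modulo placement (the key names and node counts are made up for illustration), of why adding a node is not free:

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Map a key to a node index using a stable hash and naive modulo placement."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose owning node changes when the cluster is resized."""
    moved = sum(1 for k in keys if node_for(k, old_count) != node_for(k, new_count))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]  # hypothetical keys
fraction = moved_fraction(keys, old_count=4, new_count=5)
print(f"{fraction:.0%} of keys change nodes when growing from 4 to 5 nodes")
```

With modulo placement, about four out of five keys are expected to move in this resize. Schemes such as consistent hashing exist to reduce that churn.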

Scaling up is great when starting out, since engineers are apparently costlier than SSDs and RAM. That is why Plenty of Fish’s architecture uses the scale-up approach, and the same goes for Stack Overflow’s architecture!

3. Load Balancing

What is Load Balancing? — Gaurav Sen — YouTube

Load balancing is the efficient distribution of network or application traffic across multiple servers in a server farm. Each load balancer sits between client devices and back-end servers, receiving and then distributing incoming requests to any available server capable of fulfilling them.
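To make the idea concrete, here is a minimal sketch of round-robin balancing that skips unhealthy servers. It is only an illustration: the class name and the `app-1`-style backend names are hypothetical, and real balancers such as Nginx or Envoy support many more strategies.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in rotation, skipping any marked as down."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._ring = cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call before giving up.
        for _ in range(len(self._backends)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backend available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [lb.pick() for _ in range(3)]
lb.mark_down("app-2")  # e.g. a failed health check
after_failure = [lb.pick() for _ in range(4)]
print(first_round, after_failure)
```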

Some popular options are Nginx and Envoy Proxy.

Envoy Proxy - Home

Envoy is an open source edge and service proxy, designed for cloud-native applications As on the ground microservice…

www.envoyproxy.io

4. API

A Web API is an application programming interface for either a web server or a web browser.

There are several protocols in the stack making it possible for clients & servers to communicate with each other.

a. HTTP

HTTP is the underlying communication protocol of the World Wide Web. HTTP functions as a request-response protocol in the client-server computing model.

It’s a one-way protocol in the sense that the client always initiates: we send a request and get a response. The server can’t send anything of its own volition, because it can only reply to a request.

It’s best for passive applications that don’t need to be updated live.

A few architectural patterns based on HTTP are REST and GraphQL. They are topics for another post.

A query language for your API

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL…

graphql.org

b. WebSocket

WebSocket is a bidirectional protocol that supports full-duplex communication: client and server can talk to each other independently at the same time.

It uses a single TCP connection, so client and server communicate over that same connection throughout the life cycle of the WebSocket connection. That means less overhead, since handshakes don’t happen again and again.

This is best for real-time applications, or for the real-time part of an application.

Socket.IO

Push data to clients that gets represented as real-time counters, charts or logs. Starting in 1.0, it's possible to…

socket.io

c. Encoding

The format in which messages are passed between sender and receiver is another decision that needs to be made.

Binary formats like MessagePack and Protobuf are naturally compact, but a browser front end can’t read them without an extra decoding library. They are great for server-to-server communication.

A text format like JSON is not as small, but it’s native to JavaScript, hence great for front-end applications, and it can be compressed to approach the size of Protobuf or MessagePack.
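A rough way to get a feel for the size difference is to encode the same records three ways. This sketch uses only the standard library: `struct` stands in for a binary format (it is not Protobuf or MessagePack), `zlib` stands in for transport compression, and the sensor readings are made-up sample data.

```python
import json
import struct
import zlib

# Hypothetical sample data: 200 small records.
readings = [{"sensor_id": i, "temperature": 20.0 + i * 0.5} for i in range(200)]

# Text encoding, with and without compression.
json_bytes = json.dumps(readings).encode("utf-8")
compressed_json = zlib.compress(json_bytes)

# Fixed binary layout: little-endian unsigned 32-bit id plus a 64-bit float.
record = struct.Struct("<Id")
binary_bytes = b"".join(
    record.pack(r["sensor_id"], r["temperature"]) for r in readings
)

# Decoding the binary form needs the layout; the bytes are not self-describing.
decoded = [
    {"sensor_id": sensor_id, "temperature": temperature}
    for sensor_id, temperature in record.iter_unpack(binary_bytes)
]

print("JSON:", len(json_bytes), "bytes")
print("JSON + zlib:", len(compressed_json), "bytes")
print("binary:", len(binary_bytes), "bytes")
```

The exact numbers depend heavily on the shape of the data, so it is worth measuring with a realistic payload before deciding.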

The trade-offs need to be analysed, as is done in the article below.

The need for speed - Experimenting with message serialization

Speeding up requests and reducing payload size with MessagePack and Protobuf.

medium.com

5. Storage

Depending on the type of data being stored and requirements, several types of storage exist.

a. Databases

What is Database Sharding? — Gaurav Sen — YouTube

A database is a collection of information that is organised so that it can be easily accessed, managed and updated.

Several DB types exist, each with its own query language: relational, NoSQL, graph, key-value, time series, etc.

They all have something to offer, with their own advantages and disadvantages, which is a topic for another day.

Native GraphQL Database: The Best Graph DB | Dgraph

Dgraph is the world's most advanced, native GraphQL database with a graph backend. Now with Slash GraphQL, get a…

dgraph.io

b. Blob Storage

Object stores are best suited for blobs like images and videos, since large binary files are a poor fit for most databases.

Object storage is a computer data storage architecture that manages data as objects, as opposed to other storage architectures like file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks.

The main advantage of object storage is metadata tags, which allow for much better identification and classification. Search capabilities and virtually unlimited scaling make object storage ideal for unstructured data.
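The object model can be sketched as a flat map from key to bytes plus metadata. The in-memory class below is a toy for illustration only; its names and methods are hypothetical and do not mirror the API of MinIO or any other real object store.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredObject:
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryObjectStore:
    """Toy flat-namespace store: each key maps to bytes plus metadata tags."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> bytes:
        return self._objects[key].data

    def find(self, **tags: str) -> List[str]:
        """Return the keys of all objects whose metadata matches every given tag."""
        return sorted(
            key
            for key, obj in self._objects.items()
            if all(obj.metadata.get(tag) == value for tag, value in tags.items())
        )


store = InMemoryObjectStore()
store.put("avatars/42.png", b"fake-png-bytes", content_type="image/png", owner="42")
store.put("avatars/7.png", b"fake-png-bytes", content_type="image/png", owner="7")
store.put("videos/intro.mp4", b"fake-mp4-bytes", content_type="video/mp4", owner="42")

png_keys = store.find(content_type="image/png")
owned_by_42 = store.find(owner="42")
print(png_keys, owned_by_42)
```

Note that the slashes in the keys are just characters: there is no directory hierarchy, only keys and tags.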

MinIO | High Performance, Kubernetes Native Object Storage

min.io

c. Distributed Caching

What is Distributed Caching? — Gaurav Sen — YouTube

It’s an essential component of large scale systems.

Caching is a technique that stores a copy of a given resource and serves it back when requested. When a web cache has a requested resource in its store, it intercepts the request and returns its copy instead of re-downloading from the originating server.

It’s the same principle behind memoization in Dynamic Programming: instead of computing something again and again, or retrieving it from slower media, cache the data in faster storage and serve it from there.
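The principle fits in a few lines. This is a minimal in-process sketch using `functools.lru_cache`, not a distributed cache like Redis; `fetch_from_origin` is a hypothetical stand-in for a slow database or network call.

```python
from functools import lru_cache

backend_calls = 0  # counts how often the slow path is actually hit


def fetch_from_origin(key: str) -> str:
    """Hypothetical slow lookup, e.g. a database query or remote call."""
    global backend_calls
    backend_calls += 1
    return f"value-for-{key}"


@lru_cache(maxsize=128)
def cached_fetch(key: str) -> str:
    # Only reached on a cache miss; hits are served from memory.
    return fetch_from_origin(key)


results = [cached_fetch("profile:1") for _ in range(5)]
info = cached_fetch.cache_info()
print(backend_calls, info.hits, info.misses)
```

Five requests reach the origin only once. A distributed cache applies the same idea across many servers, which adds questions such as expiry and invalidation.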

Redis is one such tool. It’s an in-memory data store that can be used as a distributed key-value database, cache and message broker.

A cache can be sped up further with compression. See DoorDash’s LZ4 study.

Golang Key-Value Store - Badger DB | Dgraph

FAST Insert data at the speed of 160 MB/s CRASH RESILIENT The Write Ahead Log ensures your data is safe and sound…

dgraph.io

d. Content Delivery Networks

A CDN (Content Delivery Network) is a highly-distributed platform of servers that helps minimise delays in loading web page content by reducing the physical distance between the server and the user. This helps users around the world view the same high-quality content without slow loading times.

They do so by serving the content from the data centre closest to the user.
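As a rough illustration of "closest", the sketch below picks an edge location by great-circle distance. This is a simplification: the edge list and coordinates are made up, and real CDNs typically route by network conditions (DNS, anycast, measured latency) rather than pure geography.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical edge locations as (latitude, longitude) in degrees.
EDGE_LOCATIONS = {
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
    "virginia": (38.95, -77.45),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_edge(user_location) -> str:
    """Name of the edge location with the smallest distance to the user."""
    return min(
        EDGE_LOCATIONS,
        key=lambda name: haversine_km(user_location, EDGE_LOCATIONS[name]),
    )


print(nearest_edge((48.86, 2.35)))    # a user in Paris
print(nearest_edge((-6.20, 106.80)))  # a user in Jakarta
```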

KeyCDN - Content delivery made easy

Each edge server is meticulously crafted with an advanced TCP stack, 100% SSD coverage, and much more. Every account…

www.keycdn.com

6. Compression

a. Passive Compression

Some compression is best done passively, i.e. once and ahead of time, like Brotli for static assets or video codecs such as H.264 inside an MP4 container.

Netflix has to support lots of devices of varying calibre. Each device has a video format that looks best on that particular device.

Netflix also creates files optimised for different network speeds. If you’re watching on a fast network, you’ll see higher quality video than you would if you’re watching over a slow network.

Stranger Things season 2 has 9,570 different video, audio, and text files!

Excerpt from: Netflix — What Happens When You Press Play?

All of that compression is done passively once, instead of on the fly.

These kinds of decisions need to be taken in the requirement-gathering phase.

google/brotli

Brotli compression format. Contribute to google/brotli development by creating an account on GitHub.

github.com

b. Active Compression

Then there’s active compression for text-like data, in transit or for storage in memory.

The algorithm needs to be fast at both compression and decompression, and use little CPU while doing so.
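The speed versus size trade-off is easy to measure. This sketch uses `zlib` from the standard library as a stand-in (it is not LZ4, which needs a third-party package), and the order payload is made-up sample data.

```python
import json
import time
import zlib

# Hypothetical text-like payload, similar to what might sit in a cache.
payload = json.dumps(
    [{"order_id": i, "status": "delivered", "items": ["burger", "fries"]} for i in range(2000)]
).encode("utf-8")


def measure(level: int):
    """Compress at the given zlib level; return (compressed size, seconds taken)."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    # Round-trip check: decompression must give back the original bytes.
    assert zlib.decompress(compressed) == payload
    return len(compressed), elapsed


fast_size, fast_time = measure(1)    # fastest setting
small_size, small_time = measure(9)  # strongest setting

print(f"original: {len(payload)} bytes")
print(f"level 1:  {fast_size} bytes in {fast_time * 1000:.2f} ms")
print(f"level 9:  {small_size} bytes in {small_time * 1000:.2f} ms")
```

For data compressed on every request, the faster setting is often the better deal even if the output is somewhat larger.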

Some interesting reads:

Speeding Up Redis with Compression | DoorDash Engineering Blog

One of challenges we face almost everyday is to keep our API latency low. While the problem sounds simple on the…

doordash.engineering

How Uber Engineering Evaluated JSON Encoding and Compression Algorithms to Put the Squeeze on Trip…

Imagine you have to store data whose massive influx increases by the hour. Your first priority, after making sure you…

eng.uber.com

7. Distributed Consensus

Distributed Consensus & Data Replication — Gaurav Sen — YouTube

The Byzantine Generals Problem is a term used in computing for a situation in which some components of a system may fail or send conflicting information, and the system as a whole fails unless the remaining participants agree on a ‘concerted strategy’.

The Byzantine Generals’ Problem is the analogy most often used to illustrate the requirement for consensus for distributed systems. i.e. How do you make sure that multiple entities, which are separated by distance, are in absolute full agreement before an action is taken?
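The simplest building block for agreement is a majority quorum: a value only counts once more than half of all replicas report it. The sketch below shows just that rule. It is not a Byzantine fault tolerant protocol and not Raft; the function name and the sample votes are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_value(votes: Sequence[Optional[str]]) -> Optional[str]:
    """Return the value reported by a strict majority of all replicas, else None.

    A vote of None means that replica did not respond.
    """
    if not votes:
        return None
    quorum = len(votes) // 2 + 1
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    return value if count >= quorum else None


agreed = majority_value(["attack", "attack", "retreat"])
split = majority_value(["attack", "retreat", None])
print(agreed, split)
```

Requiring a majority means two conflicting values can never both be accepted, because any two majorities overlap in at least one replica.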

I ran into this problem when I designed a Distributed Socket Server. Multiple instances of the same server were disagreeing with each other under load, i.e. they had conflicting data in them.

Eventually I found out about the Redlock algorithm, a distributed lock built on Redis, and used it to achieve agreement between the instances.

The Raft consensus algorithm is the best I have found so far. It’s used by Dgraph for their distributed graph database!

Raft Consensus Algorithm

Raft is a consensus algorithm that is designed to be easy to understand. It's equivalent to Paxos in fault-tolerance…

raft.github.io

8. Message Queues

What is a Message Queue & Where is it used? — Gaurav Sen — YouTube

Put everything into a queue. Votes, comments, thumbnail creation, precomputed queries, spam processing and corrections.

Excerpt from Reddit — Lessons Learned From Mistakes Made Scaling To 1 Billion Pageviews A Month.

Queues allow you to know when there’s a problem by monitoring queue lengths. Side benefit is queues hide problems from users because things like vote requests are in the queue and if they aren’t applied immediately nobody notices.

It’s a clever little trick called Optimistic UI, often used in single-page applications. The pattern simulates the result of a mutation and updates the UI even before a response arrives from the server!

So when a user clicks up-vote, instead of showing a loader or reloading the whole page, we immediately show the effect of the button press. Meanwhile, the button press is sent to a message queue, to be processed asynchronously by the server.
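The flow can be sketched with the standard library. Here `queue.Queue` and a worker thread stand in for a real broker such as RabbitMQ, Kafka or NATS, and the `upvote` function and post ids are hypothetical.

```python
import queue
import threading

votes = queue.Queue()  # stand-in for a real message broker
tally = {}             # state owned by the consumer


def upvote(post_id: str) -> str:
    """Enqueue the vote and return at once, so the UI can update optimistically."""
    votes.put(post_id)
    return "accepted"


def worker() -> None:
    """Consume votes one by one until the None sentinel arrives."""
    while True:
        post_id = votes.get()
        if post_id is None:
            votes.task_done()
            break
        tally[post_id] = tally.get(post_id, 0) + 1
        votes.task_done()


for post in ["post-1", "post-2", "post-1"]:
    upvote(post)

# Queue length is the health signal: a growing backlog means consumers lag behind.
backlog = votes.qsize()

consumer = threading.Thread(target=worker, daemon=True)
consumer.start()
votes.put(None)  # sentinel: tell the worker to stop after draining the queue
votes.join()     # wait until every queued item has been processed
consumer.join()
print(backlog, tally)
```

The caller never waits for the tally to change, which is exactly what lets the UI show the up-vote immediately.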

There are several message queues, such as RabbitMQ, Kafka and NATS.

NATS - Open Source Messaging System | Secure, Native Cloud Application Development

NATS.io is a simple, secure and high performance open source messaging system for cloud native applications, IoT…

nats.io

Conclusion

If this post doesn’t help you crack an interview, perhaps it can help you understand how you should approach your next big project.

Learn from the mistakes of others. You can’t live long enough to make them all yourself. — Eleanor Roosevelt

A site called High Scalability has some of the best articles on System Design.

Like the following postmortem!

Netflix: What Happens When You Press Play? - High Scalability -

Monday, December 11, 2017 at 8:56AM This article is a chapter from my new book Explain the Cloud Like I'm 10. The first…

highscalability.com

What do you think? Let me know your thoughts down in the comments below.