Aug 05, 2024
Photo by Google DeepMind on Unsplash
In this article we are going to go over the basic constructs of a news feed and how we can design a scalable feed that can support millions of users. This article is more focused towards thinking about the design. We will be going repeatedly over the system and make it better and better. It’s more of an iterative process. We’ll first present the basic trivial design and then taking that as a base will keep on building upon it sort of a Lego building process. Let’s first understand what a news feed is.
If one wants to see posts from their friends, they have to follow them. Following someone is unidirectional unlike Friends which is bidirectional. One can follow to receive the posts a user shares or unfollow to stop receiving them.
Having said that, let’s go over some basic functionalities that we would want from an app that supports the news feed.
Let’s say we are just starting up our small application, and we are not thinking about scaling up since we are limited on the budget and we hardly get around 100 or 200 users on our platform, and those are our friends whom we forced to use it 😅. Devising fully-fledged, beefy systems won’t help us grow but will drown us in its expenditure.
Also, designing the ideal solution might not be the perfect solution in that case. Since our user base is small, let’s devise something fast that we can test out, and if the need arises, we can think about modifying the solution. Again, design is all about trade-offs and it is situational. Designing for an MNC and designing for a startup will differ hugely.
Since we are a startup and we are expecting that we might grow to say 1 million users in the near future, we’ll start with something easy to roll out, so that we can showcase our product to our investors to keep them interested in our product so that they can keep on investing/pouring money. 😉.
Alright, we’ll start with a monolithic service that will have the following functionalities:
Monolithic Architecture
Above, we have shown a simple flow where a user communicates with our monolithic server via some API Gateway. The API gateway is required as it can help protect us from DoS attacks and provide some basic functionalities such as authentication/authorization and rate limiting . Since we are a startup, each call to our service will incur some cost, and we do not want bots or robots flooding our server with dummy requests and depriving real users.
Once the request hits our server, it can perform the above-mentioned four tasks. We’ll deep dive into each one of them to understand it better.
Let’s start from the right, i.e., our database. Which database should we choose? A relational or a non-relational database? Since the scale is small, the advantages or disadvantages will hardly be noticeable. But as we scale, the choice will prove to be either a burden or a blessing.
Choosing the right database depends on various factors such as the volume of data, the nature of the data, read/write patterns, scalability requirements, and the need for complex queries. For our use case, we’ll choose a non-relational database such as Amazon’s DynamoDB. Why?
Furthermore, relational databases are harder to scale horizontally (scale-out), and if we want to introduce new fields, it would be a nightmare for our SRE teams. On the other hand, non-relational databases are designed to scale horizontally, and they can handle large volumes of data and high traffic.
However, this doesn’t mean we are not going to use a relational database at all. We can use it to maintain our user database, where we need transactions for username selection. If two people want to obtain the same username, one would fail, which is what we want. This can be easily implemented in relational databases.
Having discussed it, let's move on to tables. The DB will contain two tables:
Here is a demo content of what we’ll store on our Posts table
Post ID | User ID | Content | Timestamp |
---|---|---|---|
1 | kishan | Hello World | 04-Aug-2024, 5:00 PM |
2 | kishan | Setting up my news feed | 04-Aug-2024, 6:00 PM |
3 | bruce | I am batman | 05-Aug-2024, 02:00 AM |
4 | john | Who took my Dog? | 04-Aug-2024, 03:00 AM |
5 | mathew | This maneuver will cost us 7 years | 04-Aug-2024, 01:00 PM |
Similarly we can populate the Relation table
Follower (User ID) | Followee (User ID) |
---|---|
bruce | kishan |
kishan | john |
The above entry means Bruce follows Kishan, whereas Kishan follows John, but John doesn’t follow anyone.
Having defined the content of the table, let’s move to our monolithic server. This server will interact with our different tables based on the requirement. If a user is creating a post, it will create an entry in the Posts table, and if a user wants to follow someone, it will create an entry in the Relation table.
Let’s say we want to display all the posts for a specific user. We will ask our monolithic server to fetch all the posts belonging to, say, user Kishan in reverse chronological order (most recent first). Our service will talk to the database and run a query such as “give me all the posts for user Kishan in decreasing order by timestamp.” The DB will do its job and return two posts:
1[
2 {
3 "postId": 2,
4 "userId": "kishan",
5 "content": "Setting up my news feed",
6 "timestamp": "04-Aug-2024, 06:00 PM"
7 },
8 {
9 "postId": 1,
10 "userId": "kishan",
11 "content": "Hello World",
12 "timestamp": "04-Aug-2024, 05:00 PM"
13 }
14]
We can then parse them in a beautiful way and render them on the client side.
What if we want to build a news feed for, say, Bruce, who follows Kishan?
In that case, our monolithic server will first fetch all the followings of Bruce. Our server will ask the Relation Table something like “Give me all the users that Bruce follows,” and our DB will return something like this:
1[
2 "kishan"
3]
Now, using this information, our server ask the Post table, find me all the Recent post by user: kishan. Which will return the following and it will send to the client.
1[
2 {
3 "postId": 2,
4 "userId": "kishan",
5 "content": "Setting up my news feed",
6 "timestamp": "04-Aug-2024, 06:00 PM"
7 },
8 {
9 "postId": 1,
10 "userId": "kishan",
11 "content": "Hello World",
12 "timestamp": "04-Aug-2024, 05:00 PM"
13 }
14]
If say our Bruce followed more folks say Mathew as well, our relation table would have returned:
1[
2 "kishan",
3 "mathew"
4]
And then our server would have asked for the post by both user: Kishan and Mathew. Which would have returned:
1[
2 {
3 "postId": 5,
4 "userId": "mathew",
5 "content": "This maneuver will cost us 7 years",
6 "timestamp": "04-Aug-2024, 01:00 PM"
7 },
8 {
9 "postId": 2,
10 "userId": "kishan",
11 "content": "Setting up my news feed",
12 "timestamp": "04-Aug-2024, 06:00 PM"
13 },
14 {
15 "postId": 1,
16 "userId": "kishan",
17 "content": "Hello World",
18 "timestamp": "04-Aug-2024, 05:00 PM"
19 }
20]
Notice, that the posts returned are sorted in reverse chronological order.
Till this time, everything works, we are all good with our 100-200 happy users.
Since the data handling is done by the third party service such as Amazon, we can be pretty sure that they will be able to handle our increased RPS or data redundancy as and when required.
But what if our single monolithic server goes down for some reason? Maybe due to some hardware failure or someone mistakenly suspend it?
In that case, our user won’t be able to connect to our server and they will get something like the following error if viewed from browser. Here the browser is working, the Cloudflare (API gateway) is working but our monolithic server is not.
error
We can run multiple instances of the application, so that the traffic is served even if one of our instances goes down. Earlier we were running only one instance of our application. In distributed settings, it is advised that you should run at least three instances of your application even though the traffic is less or your CPU/memory utilization is minimal.
This also helps us in horizontally scaling our application. Let’s say one instance of our application was able to handle 100-200 users. What will happen if the total user base increases to 1000-2000? 10X? Will that single instance be able to handle the load?
Of course not, in that case we can spin up multiple instances of our application to equally divide the load among them. This is called load balancing and it is one of the key properties of an API gateway.
As our application gains popularity, we will soon notice that most of the calls are coming to get the news feed. People are more passive scrollers. They often post, so our application is more read-heavy. We can assume a ratio of 1:100. If someone is posting one post, there are folks who are reading 100 posts. There is an imbalance, but unlike Thanos, we can’t bring balance here. Notice, we are scaling the system to accommodate the read traffic, but doing that also scales our post service because it is all part of a big monolithic architecture.
One solution is to dissolve the monolithic architecture and adopt a microservice architecture. In that case, we’ll run three services:
microservices way
This way, depending on the demand, we can scale that particular service instead of scaling all of them. In this case, our Feed Service will get more traffic than the Post Service, so we can choose to scale it more.
Say our total user base now has grown to 1,000, and getting the news feed for each user requires us to first query the Relation table to get all the followings and then query the Post table to get the recent posts from those users. You can see the downside if we keep growing. This will put a big load on both the DB servers and our Feed Service.
What will happen if a user follows a lot of people, say 10,000 or even 100,000? You can tell we won’t be able to handle it very well, and our CPU utilization might even reach 100%.
Can we think of a way to mitigate this problem?
Instead of computing the feed every time a user asks for it, because it is a bit too computationally expensive, what if we pre-compute it in the background?
Or deploy a service that does this sort of job and store the result in a different table, say a Feed table. This way, when the Feed Service needs to get the feed for a user, all it has to do is look up the table for that user ID and return the feed.
I know, this might not sound very intuitive, and it might feel like we just delegated the computationally expensive task to yet another service. But let’s go through the process once.
Whenever somebody creates a post on our platform, this request is routed to our Post Service. Previously, the Post Service simply created the entry in the Posts table and was done with it. Now it will have some additional tasks to do.
Following is a high-level overview of what happens when someone creates a post.
Using Messaging Queue
Say Kishan is being followed by Bruce, Mathew, and John. When he creates a post, the call lands on the API Gateway, which is then routed to the Post Service. The Post Service first saves the post to the Posts table, creates a message consisting of the post ID, user ID, and timestamp, and pushes it to Kafka. It is later picked up by a worker that first fetches the user IDs of all the followers of Kishan—in this case, Bruce, Mathew, and John. The worker then finds entries corresponding to these users in the Feed table and appends the post ID to them.
This makes the Feed table look something like this:
1{
2 "mathew": [
3 "1"
4 ],
5 "john": [
6 "1"
7 ],
8 "bruce": [
9 "1"
10 ]
11}
Now, when Mathew opens up his feed, the call goes to the Feed Service, which then queries the Feed table to check if we already have a precomputed feed. In our case, we do, so it fetches the list of post IDs that it needs to return to the user. It then queries the Posts table to retrieve those IDs, builds up the payload, and sends it back to Mathew.
You might ask why we are only storing post IDs in the Feed table and not the whole content. This would make the Feed table grow tremendously, so it's more efficient to store only post IDs. However, if we restrict the number of precomputed feeds or implement an eviction policy of some sort, it might work. Again, this depends on the design; you can definitely implement it, but you’ll have to reason through it.
I think in our case we are good with the current approach. If the need arises, we can experiment with it.
What about people who keep on refreshing their timelines even though there isn’t any new feed on them?In that case, we can put a cache in front of the Feed table so that it doesn’t get bombarded that much. The cache, of course, needs to be evicted after some time; otherwise, people might say that their favorite user just posted, but they didn’t see that post.
We solved the problem of Fanout. Fanout is a pattern where a single request fans out to create many more requests, thus overloading the system. In our case, the person who follows thousands of people—building their timeline is considered fanout. We solved it by using workers and a queue and introducing a Feed table.
But we introduced yet another problem. What if the user who is posting has a million followers?
Our worker will need to append the same post to a million such followers. One high-profile user posts, and our system will be overwhelmed. Our Feed table will be bombarded with append calls.
Can we do something about it?
We can create a rule: if the user who is posting has more than 100K followers, do not use this method. Instead, simply save the post to the Posts table and let the user pull that post when they check their feed next time.
This is our previous approach. In this scenario, a user’s followers will be divided into two categories:
We will fetch the posts of high-profile users being followed by millions by simply querying the Posts table, "Fetch me the latest post from user with ID: hrithik." It is highly unlikely that all the 100K followers will come and fetch the same post. If that happens, well, then we’ll be sorry for ourselves. Here is a revised design:
Hybrid Approach
Let’s walk through the flow.
Say User Kishan wants to access his news feed. He opens the app, and the call hits the API Gateway and is forwarded to the Feed Service. The Feed Service fetches the list of users that Kishan follows. The service gets 5 users, out of which 1 user is famous, meaning that user has a lot of followers. Let’s say that user is Hrithik. The remaining 4 are normal users. For the 4 normal users, the Feed Service will get their posts from the Feed table. For Hrithik, it will query the Posts table to get all the recent posts made by user Hrithik. Once it collates the posts, it returns them to the client.
We can, of course, provide redundancy at many places in our architecture to make our system more robust.
Looking at the design, we can further optimize it. For example, instead of using a NoSQL database for the follow table, we can employ a Graph DB. This will help us achieve faster retrievals and support generating a For You page. We can show posts that users' friends are interacting with. On Instagram, when you are scrolling endlessly, you might come across posts from users you don’t follow but are shown because some of your followings have commented on or liked them. This can be achieved using a GraphDB such as Neo4J .
Optimization
What if we started supporting videos or photos on our platform?Storing them in the Posts table might not be a good idea. We can instead store the videos or photos in a blob storage such as Amazon S3 and store the URL of that resource in our Posts table. During retrieval, we can process it and send it back to the client. We can then transfer those static contents to CDNs for a faster experience.
What if some malicious actor started exploiting our resources by creating multiple posts and storing tons of videos on our platform in a very short duration? To address this, we can employ authentication as well as rate limiting on each of our services. We can impose limits such as allowing a user to create only five posts a day or view 100 posts a day. This approach was actually used by Twitter when it saw a sudden spike in usage. We can relax these limits for users who are enrolled in our subscription plan.
Designing a scalable news feed involves a series of thoughtful considerations and architectural decisions. We started with a monolithic architecture, which is simpler and easier to manage at the outset, but as our user base grows, it quickly becomes evident that scaling this approach can be problematic. To address the challenges of a read-heavy application, we transitioned to a microservices architecture, separating concerns into distinct services for posts, feeds, and relationships.
We also tackled the complexities of efficiently managing followers and precomputing feeds to handle the demands of high-traffic scenarios. By leveraging a messaging queue and background workers, we were able to distribute the computational load and optimize our system’s performance. Additionally, we addressed the unique challenges posed by users with massive followings, implementing strategies to balance the load and ensure smooth performance for all users.
Scalability isn't just about handling more users; it's about ensuring that the system remains efficient, responsive, and maintainable as it grows. Through careful design, iterative improvements, and strategic use of microservices, we can build a robust and scalable news feed that meets the demands of millions of users.
Remember, every application is unique, and the solutions that work for one scenario might need tweaking for another. Always be prepared to iterate, test, and refine your design to meet your specific needs and challenges.
References:
Subscribe to the newsletter to learn more about the decentralized web, AI and technology.
Please be respectful!