**How the Twitter Algorithm Selects Tweets for Your Timeline**
Twitter’s Recommendation Algorithm
Twitter aims to deliver the most relevant content to its users by using a recommendation algorithm that filters the roughly 500 million daily tweets down to a few top tweets that appear on the user’s timeline. This article will provide an in-depth look into how the algorithm selects tweets for your timeline.
The Core Models and Features
The foundation of Twitter’s recommendation system lies in its core models and features, which extract latent information from tweet, user, and engagement data. These models help answer important questions about the Twitter network, such as the probability of future interactions between users and the identification of trending tweets within communities. Accurate answers to these questions enable Twitter to deliver more relevant recommendations.
The Recommendation Pipeline
The recommendation pipeline is divided into three main stages that utilize these features:
1. Candidate Sourcing: This stage fetches the best tweets from different recommendation sources. Tweets are also recommended in other areas of the app, such as Search, Explore, and Ads, but this article focuses on the For You feed in the home timeline.
2. Ranking: In this stage, each tweet is ranked using a machine learning model that takes into account various features.
3. Heuristics and Filters: This stage applies filters that remove unwanted candidates, such as tweets from blocked users, NSFW content, and tweets the user has already seen.
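The three stages above can be sketched as a small pipeline. This is only an illustration of the control flow, not Twitter’s implementation: the `Tweet` class, the function names, and the idea of passing a scoring function as a parameter are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical minimal record; real candidates carry far more data.
    tweet_id: int
    author: str
    score: float = 0.0


def source_candidates(sources):
    """Stage 1: gather candidates from every source, dropping duplicate IDs."""
    seen, candidates = set(), []
    for source in sources:
        for tweet in source:
            if tweet.tweet_id not in seen:
                seen.add(tweet.tweet_id)
                candidates.append(tweet)
    return candidates


def rank(candidates, score_fn):
    """Stage 2: score each candidate, then sort best-first."""
    for tweet in candidates:
        tweet.score = score_fn(tweet)
    return sorted(candidates, key=lambda t: t.score, reverse=True)


def apply_filters(ranked, blocked_authors, seen_ids):
    """Stage 3: drop tweets from blocked authors and tweets already seen."""
    return [
        t for t in ranked
        if t.author not in blocked_authors and t.tweet_id not in seen_ids
    ]


def build_timeline(sources, score_fn, blocked_authors, seen_ids):
    """Run the three stages in order."""
    candidates = source_candidates(sources)
    ranked = rank(candidates, score_fn)
    return apply_filters(ranked, blocked_authors, seen_ids)
```

Keeping each stage as a separate function mirrors the description: new candidate sources or filters can be added without touching the ranking step.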
The Home Mixer
The Home Mixer is the service responsible for constructing and serving the For You timeline. Built on the Product Mixer, a custom Scala framework, the Home Mixer connects different candidate sources, scoring functions, heuristics, and filters. This diagram illustrates the major components involved in constructing a timeline:
Twitter draws on several candidate sources to retrieve recent and relevant tweets for each user. In-Network sources supply tweets from accounts the user follows, while Out-of-Network sources look for relevant tweets from accounts outside the user’s network.
The In-Network source is the largest candidate source. It ranks tweets from the accounts a user follows by relevance, using a logistic regression model. Its most important component is Real Graph, a model that predicts the likelihood of engagement between two users: the higher the Real Graph score between a user and an author, the more of that author’s tweets are included.
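To make the idea of an engagement-likelihood score concrete, here is a minimal logistic regression sketch. The feature names, weights, and bias are invented for illustration; the article does not say which features Real Graph uses, and a real model would learn its weights from engagement data.

```python
import math

# Hypothetical feature weights and bias, chosen only for the example.
WEIGHTS = {"past_likes": 0.8, "past_replies": 1.2, "profile_visits": 0.3}
BIAS = -2.0


def engagement_probability(features):
    """Logistic regression: sigmoid of a weighted sum of interaction features."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def top_authors(features_by_author, k):
    """Return the k followed authors with the highest predicted engagement."""
    scored = {
        author: engagement_probability(features)
        for author, features in features_by_author.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]
```

In this sketch, authors with higher predicted engagement come first, which matches the description that tweets from authors with high scores are more likely to be included.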
Finding relevant tweets from users outside the network is more challenging. Twitter takes two approaches to address this issue. The Social Graph approach estimates relevance by analyzing engagements of people the user follows or those with similar interests. GraphJet, a graph processing engine, executes graph traversals to generate candidate tweets for Out-of-Network recommendations.
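The social graph approach can be illustrated with a one-hop traversal: collect the tweets that followed accounts engaged with, then keep those written by authors outside the user’s network. This is a toy sketch over in-memory dictionaries, not GraphJet’s API; the function name and data layout are assumptions.

```python
from collections import Counter


def out_of_network_candidates(user, follows, engagements, authored_by):
    """Rank out-of-network tweets by how many followed accounts engaged with them.

    follows:      {account: set of accounts it follows}
    engagements:  {account: iterable of tweet IDs it engaged with}
    authored_by:  {tweet ID: author account}
    """
    followed = follows.get(user, set())
    counts = Counter()
    for account in followed:
        # set() so repeated engagement by one account counts once.
        for tweet_id in set(engagements.get(account, ())):
            author = authored_by[tweet_id]
            if author != user and author not in followed:
                counts[tweet_id] += 1
    # Most-engaged tweets first; ties broken by tweet ID for a stable order.
    return sorted(counts, key=lambda t: (-counts[t], t))
```

The engagement count acts as a simple relevance signal here: a tweet that several followed accounts interacted with ranks above one that only a single account touched.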
Embedding space approaches focus on content similarity. Twitter uses SimClusters, which applies a matrix factorization algorithm to discover communities anchored by clusters of influential users. Users and tweets are both represented in terms of these communities, so a tweet can be recommended to a user when the two are strongly associated with the same communities.
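Once users and tweets are represented over the same set of communities, matching them reduces to comparing vectors. The sketch below uses cosine similarity over sparse dictionaries; the community names are made up, and cosine similarity is an assumed choice of comparison for illustration rather than something the article specifies. The matrix factorization that produces the embeddings is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {community: weight}."""
    dot = sum(weight * b.get(community, 0.0) for community, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_community_match(user_embedding, tweet_embeddings):
    """Order tweet IDs by how closely their communities match the user's."""
    return sorted(
        tweet_embeddings,
        key=lambda tid: cosine_similarity(user_embedding, tweet_embeddings[tid]),
        reverse=True,
    )
```

A user weighted mostly toward one community will see tweets from that community ranked first, while tweets from communities the user has no weight in score zero.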
After candidate selection, the tweets go through a ranking stage where a neural network with around 48 million parameters predicts the relevance of each tweet. This neural network considers thousands of features and assigns a score to each tweet to determine the ranking.
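At a vastly smaller scale, scoring a tweet with a neural network looks like the forward pass below. The single hidden layer, the ReLU and sigmoid activations, and the two-feature inputs are assumptions chosen to keep the example readable; the article does not describe the production model’s architecture beyond its approximate size.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def score_tweet(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a one-hidden-layer network: features -> ReLU layer -> sigmoid."""
    hidden = [
        relu(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def rank_tweets(feature_rows, params):
    """Score every tweet and return tweet IDs sorted best-first."""
    scores = {tid: score_tweet(feats, *params) for tid, feats in feature_rows.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Here `params` is a tuple of `(hidden_weights, hidden_biases, output_weights, output_bias)`. In a trained model these values come from optimization on interaction data rather than being set by hand.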
Heuristics, Filters, and Product Features
Once the ranking stage is complete, heuristics and filters are applied to ensure a balanced and diverse feed. These include visibility filtering, author diversity, content balance, feedback-based fatigue, social proof, conversations, and edited tweets.
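As one example of such a heuristic, author diversity can be approximated by down-weighting each additional tweet from the same author. The decay factor and the re-sorting strategy below are assumptions for illustration; the article names the heuristic but does not say how it is implemented.

```python
def apply_author_diversity(ranked, decay=0.5):
    """Down-weight each additional tweet from the same author, then re-sort.

    ranked: list of (tweet_id, author, score) tuples sorted best-first.
    decay:  hypothetical multiplier applied once per earlier tweet by the author.
    """
    seen_count = {}
    rescored = []
    for tweet_id, author, score in ranked:
        n = seen_count.get(author, 0)
        rescored.append((tweet_id, author, score * decay ** n))
        seen_count[author] = n + 1
    return sorted(rescored, key=lambda item: item[2], reverse=True)
```

With this rule, an author’s best tweet keeps its full score, but their second and third tweets must beat other authors’ tweets by a widening margin to stay near the top.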
Mixing and Serving
Finally, Home Mixer blends the selected tweets with other content like ads, follow recommendations, and onboarding prompts, resulting in a final set of tweets ready to be displayed on the user’s device.
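The blending step can be pictured as interleaving non-tweet items into the ranked list. The fixed interval below is a made-up rule used only to show the mechanics; the article does not describe how Home Mixer decides where ads or prompts appear.

```python
def mix_timeline(tweets, ads, ad_interval=4):
    """Insert one ad after every `ad_interval` tweets until the ads run out."""
    mixed = []
    ad_iter = iter(ads)
    for position, tweet in enumerate(tweets, start=1):
        mixed.append(("tweet", tweet))
        if position % ad_interval == 0:
            ad = next(ad_iter, None)
            if ad is not None:
                mixed.append(("ad", ad))
    return mixed
```

Each item is tagged with its kind so that a client could render tweets and ads differently; follow recommendations or onboarding prompts could be mixed in the same way.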
In summary, Twitter’s recommendation algorithm combines core models, candidate sourcing, ranking, and heuristic filters to deliver relevant tweets to each user’s timeline, with the aim of providing a personalized and engaging experience on its platform.