This is the technical companion to the content creator playbook. Everything here is derived from direct inspection of the open-source repository. Formulas, architecture, filter ordering, and action types are documented below with source file references.
System Architecture
The For You feed is a two-stage recommendation pipeline: retrieval (millions to thousands) followed by ranking (thousands to ~30-50 displayed). Four modules implement it.
01. Thunder (Rust)
In-network post store. Listens to Kafka event streams via tweet_events_listener.rs. Serves posts from followed accounts. Feeds into Home Mixer as ThunderSource.
02. Phoenix Retrieval (Python)
Two-tower model in recsys_retrieval_model.py. User tower encodes engagement history. Candidate tower: 2-layer MLP projection with SiLU. Dot-product similarity search with L2-normalized embeddings. Returns top-K candidates.
03. Phoenix Ranking (Grok)
Transformer model in recsys_model.py using Grok architecture from grok.py. Outputs log-probabilities for 19 action types per candidate. Probabilities feed into weighted scoring.
04. Home Mixer (Rust)
Orchestration in server.rs. Runs the full pipeline: sources → hydration → pre-scoring filters → scoring (Phoenix → Weighted → Diversity → OON) → top-K selection → post-scoring filters.
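The stage ordering above can be sketched in a few lines. This is a minimal illustration, not the server.rs implementation: `run_pipeline`, the dict-based candidate, and the callable filter/scorer signatures are all hypothetical, and the source and hydration stages are omitted.

```python
from typing import Callable, Dict, List

Candidate = Dict[str, float]  # hypothetical stand-in for the PostCandidate struct


def run_pipeline(
    candidates: List[Candidate],
    pre_filters: List[Callable[[Candidate], bool]],
    scorers: List[Callable[[Candidate], Candidate]],
    top_k: int,
    post_filters: List[Callable[[Candidate], bool]],
) -> List[Candidate]:
    # Pre-scoring filters: a candidate dropped here never receives a score.
    for keep in pre_filters:
        candidates = [c for c in candidates if keep(c)]
    # Scorers run in sequence; each returns the candidate with an updated "score".
    for scorer in scorers:
        candidates = [scorer(c) for c in candidates]
    # Top-K selection by final score, highest first.
    candidates = sorted(candidates, key=lambda c: c["score"], reverse=True)[:top_k]
    # Post-scoring filters: a scored candidate can still be removed here.
    for keep in post_filters:
        candidates = [c for c in candidates if keep(c)]
    return candidates
```

Note that in this sketch a post removed after top-K selection is not backfilled; whether the real pipeline backfills is not something this sketch asserts.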
Scoring Pipeline
Four scorers run in sequence. Each adds a field to the candidate struct. The final score field determines feed position.
Phoenix scorer. Calls the Grok transformer prediction service. Receives predictions for 19 action types: log-probabilities for 18 discrete actions plus one continuous value (dwell_time). Stores results as PhoenixScores struct with optional f64 values per action.
Actions mapped from proto ActionName enum:
ServerTweetFav → favorite_score
ServerTweetReply → reply_score
ServerTweetRetweet → retweet_score
ClientTweetPhotoExpand → photo_expand_score
ClientTweetClick → click_score
ClientTweetClickProfile → profile_click_score
ClientTweetVideoQualityView → vqv_score
ClientTweetShare → share_score
ClientTweetClickSendViaDirectMessage → share_via_dm_score
ClientTweetShareViaCopyLink → share_via_copy_link_score
ClientTweetRecapDwelled → dwell_score
ServerTweetQuote → quote_score
ClientQuotedTweetClick → quoted_click_score
ClientTweetFollowAuthor → follow_author_score
ClientTweetNotInterestedIn → not_interested_score
ClientTweetBlockAuthor → block_author_score
ClientTweetMuteAuthor → mute_author_score
ClientTweetReport → report_score
DwellTime (continuous) → dwell_time
Weighted scorer. Applies the weighted sum formula. Each of the 19 action predictions is multiplied by its weight and summed. The VQV weight applies only if video_duration_ms > MIN_VIDEO_DURATION_MS.
weighted_score = Σ_i (weight_i × P(action_i)) + offset
Negative score normalization (when combined_score < 0):
score = (combined_score + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM × NEGATIVE_SCORES_OFFSET
When combined_score ≥ 0: score = combined_score + NEGATIVE_SCORES_OFFSET
Fallback: if WEIGHTS_SUM == 0, output is max(combined_score, 0.0).
Weight values are defined in params.rs which is NOT included in the open-source release. The formula and action types are visible; the actual numeric weights are proprietary.
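A minimal sketch of the weighted sum and the normalization branches described above. Every numeric constant here is a made-up placeholder, since the real values live in the unpublished params.rs. The function names and the dict-based inputs are also hypothetical, and treating dwell_time like any other term in the sum is an assumption.

```python
from typing import Dict, Optional

# Hypothetical placeholder values; the real ones are in params.rs (not published).
MIN_VIDEO_DURATION_MS = 2_000
WEIGHTS_SUM = 4.0
NEGATIVE_WEIGHTS_SUM = 2.0
NEGATIVE_SCORES_OFFSET = 0.5


def combined_score(
    predictions: Dict[str, Optional[float]],
    weights: Dict[str, float],
    video_duration_ms: Optional[int],
) -> float:
    """Sum weight × prediction over all actions that have a prediction."""
    total = 0.0
    for action, value in predictions.items():
        if value is None:
            continue  # PhoenixScores fields are optional
        if action == "vqv":
            # The VQV weight applies only to sufficiently long videos.
            long_enough = (
                video_duration_ms is not None
                and video_duration_ms > MIN_VIDEO_DURATION_MS
            )
            if not long_enough:
                continue
        total += weights.get(action, 0.0) * value
    return total


def normalize_score(
    combined: float,
    weights_sum: float = WEIGHTS_SUM,
    negative_weights_sum: float = NEGATIVE_WEIGHTS_SUM,
    offset: float = NEGATIVE_SCORES_OFFSET,
) -> float:
    """Apply the three documented branches: fallback, negative, non-negative."""
    if weights_sum == 0:
        return max(combined, 0.0)
    if combined < 0:
        return (combined + negative_weights_sum) / weights_sum * offset
    return combined + offset
```

With these placeholder constants, a combined score of -1.0 maps to (-1.0 + 2.0) / 4.0 × 0.5 = 0.125, while a combined score of 1.0 maps to 1.5.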
Author diversity scorer. Prevents a single author from flooding the timeline. Candidates are sorted by weighted_score descending. For each author, subsequent appearances receive a decaying multiplier.
multiplier(position) = (1 − floor) × decay^position + floor
Position is zero-indexed per author within the session. First appearance: multiplier = 1.0. The floor prevents the multiplier from reaching zero. AUTHOR_DIVERSITY_DECAY and AUTHOR_DIVERSITY_FLOOR are in params.rs (not published).
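The decay formula can be sketched as follows. The decay and floor values used in any call are placeholders (the real AUTHOR_DIVERSITY_DECAY and AUTHOR_DIVERSITY_FLOOR are not published), and the tuple-based candidate and function names are hypothetical.

```python
from typing import Dict, List, Tuple


def diversity_multiplier(position: int, decay: float, floor: float) -> float:
    """multiplier(position) = (1 - floor) * decay**position + floor."""
    return (1.0 - floor) * decay ** position + floor


def apply_author_diversity(
    candidates: List[Tuple[str, float]], decay: float, floor: float
) -> List[Tuple[str, float]]:
    """candidates are (author_id, weighted_score) pairs."""
    appearances: Dict[str, int] = {}
    adjusted = []
    # Walk candidates from highest to lowest weighted score.
    for author_id, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        position = appearances.get(author_id, 0)  # zero-indexed per author
        appearances[author_id] = position + 1
        adjusted.append((author_id, score * diversity_multiplier(position, decay, floor)))
    return adjusted
```

With a placeholder decay of 0.5 and floor of 0.25, an author's first post keeps its full score, the second is scaled by 0.625, and the third by 0.4375. The multiplier approaches the floor but never drops below it.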
OON scorer. Applies the out-of-network penalty. If in_network == false, the candidate's score is multiplied by OON_WEIGHT_FACTOR (a value < 1.0, defined in params.rs). In-network candidates pass through unmodified.
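A short sketch of the penalty, with a placeholder factor since OON_WEIGHT_FACTOR is not published. The function name is hypothetical, and letting a missing in_network value (None) pass through unpenalized is an assumption, not something the source confirms.

```python
from typing import Optional


def apply_oon_penalty(
    score: float, in_network: Optional[bool], oon_weight_factor: float
) -> float:
    # Only an explicit False is penalized; None passes through (assumption).
    if in_network is False:
        return score * oon_weight_factor
    return score
```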
All 19 Scored Action Types
These are the actions the model predicts. Each gets a probability from the Grok transformer and a weight from the weighted scorer.
| Action | Proto Name | Type | Description |
|---|---|---|---|
| favorite | ServerTweetFav | + | User liked the post |
| reply | ServerTweetReply | + | User replied to the post |
| retweet | ServerTweetRetweet | + | User reposted |
| photo_expand | ClientTweetPhotoExpand | + | User expanded an image |
| click | ClientTweetClick | + | User clicked into thread/media |
| profile_click | ClientTweetClickProfile | + | User visited author's profile |
| vqv | ClientTweetVideoQualityView | +* | Video quality view (conditional on duration) |
| share | ClientTweetShare | + | User shared the post |
| share_via_dm | ClientTweetClickSendViaDirectMessage | + | Sent via direct message |
| share_via_copy_link | ClientTweetShareViaCopyLink | + | Copied link to clipboard |
| dwell | ClientTweetRecapDwelled | + | User dwelled on post (boolean threshold) |
| quote | ServerTweetQuote | + | User quoted the post |
| quoted_click | ClientQuotedTweetClick | + | User clicked into a quoted post |
| follow_author | ClientTweetFollowAuthor | + | User followed the author from the post |
| dwell_time | DwellTime | + (continuous) | Duration in seconds (not a probability) |
| not_interested | ClientTweetNotInterestedIn | − | User marked "not interested" |
| block_author | ClientTweetBlockAuthor | − | User blocked the author |
| mute_author | ClientTweetMuteAuthor | − | User muted the author |
| report | ClientTweetReport | − | User reported the post |
* VQV weight is conditional on video_duration_ms > MIN_VIDEO_DURATION_MS
Grok Transformer Architecture
The ranking model is a custom transformer using the Grok architecture. Source: phoenix/grok.py, phoenix/recsys_model.py.
Candidate Isolation Mask
Lower triangular causal mask for user + history (positions 0 to candidate_start_offset-1). Candidates at positions ≥ candidate_start_offset can attend to user history and themselves, but NOT to other candidates. This ensures each candidate's score is independent of batch composition.
Scores are absolute, not relative to other candidates in the batch.
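The mask rule can be sketched as a boolean matrix where `mask[q][k]` says whether query position q may attend to key position k. This is an illustration built from the description above, not code from recsys_model.py; the function name is hypothetical.

```python
from typing import List


def candidate_isolation_mask(seq_len: int, candidate_start_offset: int) -> List[List[bool]]:
    mask = [[False] * seq_len for _ in range(seq_len)]
    for q in range(seq_len):
        for k in range(seq_len):
            if q < candidate_start_offset:
                # User + history region: ordinary lower-triangular causal mask.
                mask[q][k] = k <= q
            else:
                # Candidate region: attend to all user + history positions and to
                # itself, but never to another candidate.
                mask[q][k] = k < candidate_start_offset or k == q
    return mask
```

For a sequence of 5 with candidate_start_offset = 3, the two candidate rows differ only in their own diagonal entry, which is why adding or removing another candidate cannot change a given candidate's attention inputs.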
Attention & Normalization
Multi-head grouped query attention with configurable num_q_heads and num_kv_heads. RMSNorm (not LayerNorm). Attention clipping at max_attn_val = 30.0 via tanh. Masking value: -1e30.
Same architecture family as Grok LLM (RMSNorm + RoPE + grouped query attention).
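A sketch of tanh-based logit clipping followed by masking and softmax. The constants 30.0 and -1e30 come from the description above; the specific form `max_attn_val * tanh(x / max_attn_val)` is an assumption about how the clipping is applied, and the function names are hypothetical.

```python
import math
from typing import List

MAX_ATTN_VAL = 30.0
MASK_VALUE = -1e30


def clip_logit(x: float) -> float:
    # Smoothly squashes any logit into the open interval (-30, 30).
    return MAX_ATTN_VAL * math.tanh(x / MAX_ATTN_VAL)


def masked_softmax(logits: List[float], allowed: List[bool]) -> List[float]:
    # Disallowed positions get the large negative masking value.
    clipped = [clip_logit(x) if ok else MASK_VALUE for x, ok in zip(logits, allowed)]
    peak = max(clipped)
    exps = [math.exp(c - peak) for c in clipped]  # subtract peak for stability
    total = sum(exps)
    return [e / total for e in exps]
```

Small logits pass through almost unchanged, while extreme ones saturate near ±30, which bounds how sharply any single attention weight can dominate.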
Rotary Positional Embeddings
RoPE with base exponent 10,000. Encodes sequence position so the model can weight recent engagement differently from older history.
Position-aware. RoPE lets the model distinguish recent actions from older ones; how much recency matters is learned, not fixed.
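A minimal sketch of rotary embeddings with base 10,000. The pairing of dimension i with dimension i + d/2 is one common convention and is an assumption here, as are the function names; grok.py may pair dimensions differently.

```python
import math
from typing import List

ROPE_BASE = 10000.0


def rope_angles(position: int, head_dim: int, base: float = ROPE_BASE) -> List[float]:
    # One rotation angle per dimension pair; higher pairs rotate more slowly.
    return [position * base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def apply_rope(vec: List[float], position: int, base: float = ROPE_BASE) -> List[float]:
    half = len(vec) // 2
    out = list(vec)
    for i, theta in enumerate(rope_angles(position, len(vec), base)):
        x, y = vec[i], vec[i + half]
        # Standard 2D rotation of the (x, y) pair by angle theta.
        out[i] = x * math.cos(theta) - y * math.sin(theta)
        out[i + half] = x * math.sin(theta) + y * math.cos(theta)
    return out
```

The useful property is that the dot product between a rotated query and a rotated key depends only on the distance between their positions, so the model sees how far back in the history an action sits rather than its absolute index.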
Embedding Architecture
Hash-based embeddings: 2 hashes per user, item, and author by default. Action embeddings via multi-hot to signed vector (2*action - 1) with learned projection. Product surface: categorical vocab size 16.
Hash embeddings soften cold-start. New entities map to existing hash buckets, so they receive an embedding immediately.
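A sketch of the two ideas above: mapping an entity ID to 2 hash buckets, and turning a multi-hot action vector into a signed vector via 2*action - 1. The hash function (BLAKE2b with a per-hash seed prefix), the table size, and the function names are assumptions for illustration; the source only specifies the number of hashes and the signed mapping.

```python
import hashlib
from typing import List

NUM_HASHES = 2  # default per user, item, and author


def hash_buckets(entity_id: str, table_size: int, num_hashes: int = NUM_HASHES) -> List[int]:
    buckets = []
    for seed in range(num_hashes):
        # Prefixing the seed yields independent hash functions (hypothetical scheme).
        digest = hashlib.blake2b(f"{seed}:{entity_id}".encode(), digest_size=8).digest()
        buckets.append(int.from_bytes(digest, "big") % table_size)
    return buckets


def signed_action_vector(multi_hot: List[int]) -> List[int]:
    # Maps 1 -> +1 (action taken) and 0 -> -1 (action not taken).
    return [2 * a - 1 for a in multi_hot]
```

Because the bucket indices are computed from the ID alone, an entity never seen in training still resolves to rows of the embedding table. The signed vector would then go through the learned projection, which is not shown here.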
Sequence Lengths
History: 128 positions (user's recent engagement). Candidates: 32 positions per batch. FFN widening factor: 4.0 (adjusted to multiple of 8).
Your last 128 engagements define your behavioral fingerprint for the model.
Two-Tower Retrieval
User tower: Grok transformer → average pool → L2-normalize. Candidate tower: 2-layer MLP with SiLU → L2-normalize. Dot-product similarity for top-K retrieval. EPS: 1e-12.
Retrieval is separate from ranking. A post must be retrieved before it can be scored.
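The retrieval step can be sketched as L2 normalization followed by a dot-product top-K. The vectors stand in for tower outputs (the towers themselves are not modeled), the brute-force scan stands in for whatever similarity search the service uses, and clamping the norm at EPS is an assumption about where the 1e-12 constant is applied.

```python
import math
from typing import Dict, List, Tuple

EPS = 1e-12


def l2_normalize(vec: List[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vec))
    # Clamp the norm so an all-zero vector does not divide by zero.
    return [x / max(norm, EPS) for x in vec]


def top_k_candidates(
    user_vec: List[float], candidate_vecs: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    u = l2_normalize(user_vec)
    scored = [
        (cid, sum(a * b for a, b in zip(u, l2_normalize(vec))))
        for cid, vec in candidate_vecs.items()
    ]
    # With unit-length vectors the dot product equals cosine similarity.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Since both sides are unit length, vector magnitude carries no weight: only the direction of a post's embedding relative to the user's determines whether it is retrieved.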
Filter Stack (Execution Order)
Filters run before and after scoring. A filtered post never receives a score. A scored post can still be removed after scoring. Source: home-mixer/filters/.
PRE-SCORING FILTERS:
1. DropDuplicatesFilter — Removes duplicate tweet IDs
2. CoreDataHydrationFilter — Removes posts that failed metadata hydration from TES
3. AgeFilter — Removes posts older than MAX_POST_AGE (Snowflake ID timestamp math)
4. SelfTweetFilter — Removes user's own posts
5. RetweetDeduplicationFilter — Dedupes multiple reposts of same underlying content
6. IneligibleSubscriptionFilter — Removes paywalled content user cannot access
7. PreviouslySeenPostsFilter — Removes posts user already engaged with
8. PreviouslyServedPostsFilter — Removes posts already shown in current session
9. MutedKeywordFilter — Removes posts matching muted keywords (tokenized matching)
10. AuthorSocialgraphFilter — Removes posts from blocked or muted authors
─── SCORING RUNS HERE ───
─── TOP-K SELECTION RUNS HERE ───
POST-SCORING FILTERS:
11. VFFilter — Visibility Filtering (safety). Drops posts where SafetyResult.action == Action::Drop or any other FilteredReason value. Runs AFTER scoring.
12. DedupConversationFilter — Deduplicates branches of the same conversation thread
VFFilter runs AFTER scoring — safety does not affect ranking math
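The Snowflake timestamp math behind AgeFilter can be sketched as follows. The bit layout (timestamp in the bits above the low 22) and the epoch constant are the standard Snowflake ID scheme; the function names are hypothetical, and any max_post_age_ms passed in is a placeholder because MAX_POST_AGE is not published.

```python
TWITTER_EPOCH_MS = 1288834974657  # Snowflake epoch offset in milliseconds


def snowflake_timestamp_ms(tweet_id: int) -> int:
    # The low 22 bits hold worker and sequence data; the rest is the timestamp.
    return (tweet_id >> 22) + TWITTER_EPOCH_MS


def is_too_old(tweet_id: int, now_ms: int, max_post_age_ms: int) -> bool:
    return now_ms - snowflake_timestamp_ms(tweet_id) > max_post_age_ms
```

Deriving age from the ID means the filter needs no hydrated metadata, so it can run on bare tweet IDs.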
Candidate & Query Features
Source: home-mixer/candidate_pipeline/candidate_features.rs, query_features.rs.
Post Candidate (PostCandidate struct)
tweet_id: i64
tweet_text: String
author_id: u64
in_reply_to_tweet_id: Option<u64>
ancestors: Vec<u64> // thread lineage
retweeted_tweet_id: Option<u64>
video_duration_ms: Option<i32>
subscription_author_id: Option<u64>
author_followers_count: Option<i32>
author_screen_name: Option<String>
in_network: Option<bool>
served_type: ServedType // ForYouInNetwork | ForYouPhoenixRetrieval
phoenix_scores: PhoenixScores // 19 action probabilities
weighted_score: Option<f64>
score: Option<f64> // final score after all scorers
User Query (ScoredPostsQuery struct)
user_id: i64
user_action_sequence: UserActionSequence
user_features: UserFeatures {
muted_keywords: Vec<String>
blocked_user_ids: Vec<i64>
muted_user_ids: Vec<i64>
followed_user_ids: Vec<i64>
subscribed_user_ids: Vec<i64>
}
in_network_only: bool // disables Phoenix retrieval if true
What's Published vs. What's Not
In the open-source code
- Full pipeline architecture and execution order
- All 19 scored action types with proto names
- Weighted scoring formula and normalization
- Author diversity decay formula
- OON penalty structure
- Complete filter stack with ordering
- Grok transformer architecture (attention, RoPE, RMSNorm)
- Embedding configuration (hash counts, sequence lengths)
- Candidate and query feature structs
NOT in the open-source code
- Numeric weight values for all 19 actions (params.rs)
- OON_WEIGHT_FACTOR value
- AUTHOR_DIVERSITY_DECAY and FLOOR values
- MIN_VIDEO_DURATION_MS threshold
- MAX_POST_AGE threshold
- PHOENIX_MAX_RESULTS and THUNDER_MAX_RESULTS
- Top-K selection count
- Trained model weights
- WEIGHTS_SUM, NEGATIVE_WEIGHTS_SUM, NEGATIVE_SCORES_OFFSET
Source
Repository: github.com/xai-org/x-algorithm
Commit: aaa167b3de8a674587c53545a43c90eaad360010
Released: January 20, 2026
Promised update cadence: Every 4 weeks
Last update: January 20, 2026 (11 weeks overdue as of this analysis)
Language: Rust (62.9%), Python (37.1%)
License: Apache 2.0