X Algorithm — Technical Deep Dive

Source code analysis of xai-org/x-algorithm

Generated April 8, 2026 · commit aaa167b
This is the technical companion to the content creator playbook. Everything here is derived from direct inspection of the open-source repository. Formulas, architecture, filter ordering, and action types are documented below with source file references.

System Architecture

The For You feed is a two-stage recommendation pipeline: retrieval (millions to thousands) followed by ranking (thousands to ~30-50 displayed). Four modules implement it.

01

Thunder (Rust)

In-network post store. Listens to Kafka event streams via tweet_events_listener.rs. Serves posts from followed accounts. Feeds into Home Mixer as ThunderSource.

02

Phoenix Retrieval (Python)

Two-tower model in recsys_retrieval_model.py. User tower encodes engagement history. Candidate tower: 2-layer MLP projection with SiLU. Dot-product similarity search with L2-normalized embeddings. Returns top-K candidates.

03

Phoenix Ranking (Grok)

Transformer model in recsys_model.py using Grok architecture from grok.py. Outputs log-probabilities for 19 action types per candidate. Probabilities feed into weighted scoring.

04

Home Mixer (Rust)

Orchestration in server.rs. Runs the full pipeline: sources → hydration → pre-scoring filters → scoring (Phoenix → Weighted → Diversity → OON) → top-K selection → post-scoring filters.

Scoring Pipeline

Four scorers run in sequence. Each adds a field to the candidate struct. The final score field determines feed position.

1

PhoenixScorer

phoenix_scorer.rs

Calls the Grok transformer prediction service. Receives log-probabilities for 18 discrete action types plus one continuous value (dwell_time), 19 outputs in total. Stores results in a PhoenixScores struct with an optional f64 value per action.

Actions mapped from proto ActionName enum:
ServerTweetFav → favorite_score
ServerTweetReply → reply_score
ServerTweetRetweet → retweet_score
ClientTweetPhotoExpand → photo_expand_score
ClientTweetClick → click_score
ClientTweetClickProfile → profile_click_score
ClientTweetVideoQualityView → vqv_score
ClientTweetShare → share_score
ClientTweetClickSendViaDirectMessage → share_via_dm_score
ClientTweetShareViaCopyLink → share_via_copy_link_score
ClientTweetRecapDwelled → dwell_score
ServerTweetQuote → quote_score
ClientQuotedTweetClick → quoted_click_score
ClientTweetFollowAuthor → follow_author_score
ClientTweetNotInterestedIn → not_interested_score
ClientTweetBlockAuthor → block_author_score
ClientTweetMuteAuthor → mute_author_score
ClientTweetReport → report_score
DwellTime (continuous) → dwell_time
2

WeightedScorer

weighted_scorer.rs

Applies the weighted sum formula. Each of the 19 predicted values is multiplied by its weight and summed. The VQV weight applies only if video_duration_ms > MIN_VIDEO_DURATION_MS.

weighted_score = Σ_i (weight_i × P(action_i)) + offset

Negative score normalization (when combined_score < 0):

score = (combined_score + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM × NEGATIVE_SCORES_OFFSET

When combined_score ≥ 0: score = combined_score + NEGATIVE_SCORES_OFFSET

Fallback: if WEIGHTS_SUM == 0, output is max(combined_score, 0.0).

Weight values are defined in params.rs which is NOT included in the open-source release. The formula and action types are visible; the actual numeric weights are proprietary.
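
The formula, the VQV condition, and the normalization branches can be sketched in Python as follows. This is a minimal illustration, not the repository's code: every weight and constant is a placeholder (the real values are in the unpublished params.rs), and the function names combined_score and offset_score are hypothetical.

```python
# Placeholder weights and constants. The real values live in params.rs,
# which is not published, so every number here is an assumption.
WEIGHTS = {"favorite": 1.0, "reply": 2.0, "vqv": 1.0, "report": -4.0}
WEIGHTS_SUM = 8.0             # assumed: sum of absolute weights
NEGATIVE_WEIGHTS_SUM = 4.0    # assumed: sum of absolute negative weights
NEGATIVE_SCORES_OFFSET = 0.5  # assumed
MIN_VIDEO_DURATION_MS = 2000  # assumed


def combined_score(probs, video_duration_ms=None):
    """Weighted sum over predicted actions; missing predictions count as 0."""
    vqv_ok = (
        video_duration_ms is not None
        and video_duration_ms > MIN_VIDEO_DURATION_MS
    )
    total = 0.0
    for action, weight in WEIGHTS.items():
        if action == "vqv" and not vqv_ok:
            continue  # the VQV weight only applies to long-enough videos
        total += weight * (probs.get(action) or 0.0)
    return total


def offset_score(combined):
    """Normalization branches as described in the text."""
    if WEIGHTS_SUM == 0:
        return max(combined, 0.0)
    if combined < 0:
        return (combined + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM * NEGATIVE_SCORES_OFFSET
    return combined + NEGATIVE_SCORES_OFFSET
```

With these placeholder values, negative combined scores are compressed into the range [0, 0.25), while non-negative ones start at 0.5, so a post dominated by negative predictions always ranks below a neutral one.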
3

AuthorDiversityScorer

author_diversity_scorer.rs

Prevents timeline flooding. Candidates are sorted by weighted_score descending. For each author, subsequent appearances receive a decaying multiplier.

multiplier(position) = (1 − floor) × decay^position + floor

Position is zero-indexed per author within the session. First appearance: multiplier = 1.0. The floor prevents the multiplier from reaching zero. AUTHOR_DIVERSITY_DECAY and AUTHOR_DIVERSITY_FLOOR are in params.rs (not published).
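
A short Python sketch of the decay, assuming placeholder values for the two unpublished constants. The tuple layout and the function name apply_author_diversity are illustrative, not taken from the repository.

```python
from collections import defaultdict

DECAY = 0.5   # placeholder for AUTHOR_DIVERSITY_DECAY (not published)
FLOOR = 0.25  # placeholder for AUTHOR_DIVERSITY_FLOOR (not published)


def multiplier(position):
    """Decaying multiplier for an author's position-th appearance (zero-indexed)."""
    return (1.0 - FLOOR) * DECAY ** position + FLOOR


def apply_author_diversity(candidates):
    """candidates: list of (post_id, author_id, weighted_score) tuples."""
    seen = defaultdict(int)  # appearances counted so far, per author
    adjusted = []
    # Walk candidates from highest to lowest weighted score.
    for post_id, author_id, score in sorted(
        candidates, key=lambda c: c[2], reverse=True
    ):
        position = seen[author_id]
        seen[author_id] += 1
        adjusted.append((post_id, author_id, score * multiplier(position)))
    return adjusted
```

With these placeholders, an author's second post keeps 62.5% of its score and the third keeps 43.75%; the multiplier approaches 0.25 but never drops below it.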

4

OONScorer

oon_scorer.rs

Out-of-network penalty. If in_network == false, the candidate's score is multiplied by OON_WEIGHT_FACTOR (a value < 1.0, defined in params.rs). In-network candidates pass through unmodified.
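
As a minimal sketch, the penalty amounts to one conditional multiplication. The factor below is a placeholder, and treating a missing in_network flag as in-network is an assumption made for this example.

```python
OON_WEIGHT_FACTOR = 0.75  # placeholder; the real value is in params.rs


def apply_oon(score, in_network):
    """Scale out-of-network scores and pass everything else through.

    A missing flag (None) is treated as in-network here, which is an
    assumption of this sketch.
    """
    if in_network is False:
        return score * OON_WEIGHT_FACTOR
    return score
```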

All 19 Scored Action Types

These are the actions the model predicts. Each discrete action gets a probability from the Grok transformer, dwell_time gets a continuous prediction, and each is paired with a weight in the weighted scorer.

Action | Proto Name | Type | Description
favorite | ServerTweetFav | + | User liked the post
reply | ServerTweetReply | + | User replied to the post
retweet | ServerTweetRetweet | + | User reposted
photo_expand | ClientTweetPhotoExpand | + | User expanded an image
click | ClientTweetClick | + | User clicked into thread/media
profile_click | ClientTweetClickProfile | + | User visited author's profile
vqv | ClientTweetVideoQualityView | +* | Video quality view (conditional on duration)
share | ClientTweetShare | + | User shared the post
share_via_dm | ClientTweetClickSendViaDirectMessage | + | Sent via direct message
share_via_copy_link | ClientTweetShareViaCopyLink | + | Copied link to clipboard
dwell | ClientTweetRecapDwelled | + | User dwelled on post (boolean threshold)
quote | ServerTweetQuote | + | User quoted the post
quoted_click | ClientQuotedTweetClick | + | User clicked into a quoted post
follow_author | ClientTweetFollowAuthor | + | User followed the author from the post
dwell_time | DwellTime | + (continuous) | Duration in seconds (not a probability)
not_interested | ClientTweetNotInterestedIn | − | User marked "not interested"
block_author | ClientTweetBlockAuthor | − | User blocked the author
mute_author | ClientTweetMuteAuthor | − | User muted the author
report | ClientTweetReport | − | User reported the post

* VQV weight is conditional on video_duration_ms > MIN_VIDEO_DURATION_MS

Grok Transformer Architecture

The ranking model is a custom transformer using the Grok architecture. Source: phoenix/grok.py, phoenix/recsys_model.py.

Candidate Isolation Mask

Lower triangular causal mask for user + history (positions 0 to candidate_start_offset-1). Candidates at positions ≥ candidate_start_offset can attend to user history and themselves, but NOT to other candidates. This ensures each candidate's score is independent of batch composition.

Scores are absolute, not relative to other candidates in the batch.
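
The mask described above can be built in a few lines. This is a pure-Python sketch for illustration; the function name is hypothetical and the repository builds its mask with array operations rather than nested lists.

```python
def make_candidate_isolation_mask(seq_len, candidate_start_offset):
    """Return mask[q][k] = 1 if query position q may attend to key position k."""
    # Start from a lower-triangular causal mask over the whole sequence.
    mask = [[1 if k <= q else 0 for k in range(seq_len)] for q in range(seq_len)]
    # In the candidate-to-candidate block, keep only the diagonal so that
    # each candidate sees user + history and itself, never another candidate.
    for q in range(candidate_start_offset, seq_len):
        for k in range(candidate_start_offset, seq_len):
            mask[q][k] = 1 if q == k else 0
    return mask
```

For seq_len = 5 and candidate_start_offset = 3, the two candidate rows are [1, 1, 1, 1, 0] and [1, 1, 1, 0, 1]: both see the three user and history positions, and neither sees the other.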

Attention & Normalization

Multi-head grouped query attention with configurable num_q_heads and num_kv_heads. RMSNorm (not LayerNorm). Attention clipping at max_attn_val = 30.0 via tanh. Masking value: -1e30.

Same architecture family as Grok LLM (RMSNorm + RoPE + grouped query attention).
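
A sketch of tanh clipping followed by masking, for a single query row. The constants come from the text above; the function names and the clip-then-mask ordering are assumptions of this example.

```python
import math

MAX_ATTN_VAL = 30.0  # clipping bound from the text
MASK_VALUE = -1e30   # masking value from the text


def clip_logit(x):
    """Soft-clip a logit into (-MAX_ATTN_VAL, MAX_ATTN_VAL) using tanh."""
    return MAX_ATTN_VAL * math.tanh(x / MAX_ATTN_VAL)


def masked_softmax(logits, mask):
    """Softmax over one query row; mask[k] == 0 hides key position k."""
    clipped = [clip_logit(x) if m else MASK_VALUE for x, m in zip(logits, mask)]
    peak = max(clipped)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in clipped]
    total = sum(exps)
    return [e / total for e in exps]
```

Small logits pass through almost unchanged because tanh(x) is close to x near zero, while very large ones saturate at the bound instead of growing without limit.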

Rotary Positional Embeddings

RoPE with base exponent 10,000. Encodes sequence position so the model can weight recent engagement differently from older history.

Position-aware. The model can learn to treat recent actions differently from older ones.
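
For reference, a minimal sketch of rotary embeddings with base 10,000. It uses the interleaved-pair formulation; the repository may pair dimensions differently, and the function names are hypothetical.

```python
import math

ROPE_BASE = 10_000.0


def rope(vec, position, base=ROPE_BASE):
    """Rotate consecutive pairs of vec by position-dependent angles.

    len(vec) must be even. The pair starting at index i uses the
    frequency base ** (-i / d).
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        angle = position * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))
```

The useful property is that the dot product of a rotated query and a rotated key depends only on the distance between their positions: dot(rope(q, 5), rope(k, 3)) matches dot(rope(q, 12), rope(k, 10)) up to floating-point error.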

Embedding Architecture

Hash-based embeddings: 2 hashes per user, item, and author by default. Action embeddings via multi-hot to signed vector (2*action - 1) with learned projection. Product surface: categorical vocab size 16.

Hash embeddings ease cold-start. New entities map to existing embedding rows immediately, with no vocabulary update.
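
Two pieces of this can be sketched directly: mapping an ID to several hash buckets, and turning a multi-hot action vector into a signed one. The table size, the use of SHA-256, and the function names are stand-ins chosen for this example; the repository's hash function and the way the lookups are combined are not shown here.

```python
import hashlib

NUM_HASHES = 2      # default from the text: 2 hashes per entity
NUM_BUCKETS = 1000  # placeholder table size (assumption)


def hash_buckets(entity_id, num_hashes=NUM_HASHES, num_buckets=NUM_BUCKETS):
    """Map an entity ID to num_hashes embedding-table rows.

    SHA-256 with a per-hash seed is a stand-in for whatever hash the
    repository uses.
    """
    buckets = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{entity_id}".encode()).digest()
        buckets.append(int.from_bytes(digest[:8], "big") % num_buckets)
    return buckets


def signed_actions(multi_hot):
    """Map a multi-hot {0, 1} action vector to {-1, +1} via 2*action - 1."""
    return [2 * a - 1 for a in multi_hot]
```

Any ID, including one never seen in training, lands on valid rows right away. The signed vector lets the learned projection distinguish "action absent" (-1) from "action present" (+1) rather than treating absence as zero input.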

Sequence Lengths

History: 128 positions (user's recent engagement). Candidates: 32 positions per batch. FFN widening factor: 4.0 (adjusted to multiple of 8).

Your last 128 engagements define your behavioral fingerprint for the model.

Two-Tower Retrieval

User tower: Grok transformer → average pool → L2-normalize. Candidate tower: 2-layer MLP with SiLU → L2-normalize. Dot-product similarity for top-K retrieval. EPS: 1e-12.

Retrieval is separate from ranking. A post must be retrieved before it can be scored.
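
A minimal sketch of the retrieval step: L2-normalize both sides and rank by dot product. The function names and the brute-force scan are illustrative (a production system would use an approximate nearest-neighbor index), and the exact way EPS guards the norm is an assumption.

```python
import math

EPS = 1e-12  # from the text; its exact use in the norm is assumed


def l2_normalize(vec, eps=EPS):
    """Scale vec to unit length, guarding against division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def top_k(user_vec, candidates, k):
    """candidates: dict of post_id -> embedding. Returns the k best post_ids."""
    u = l2_normalize(user_vec)
    scored = []
    for post_id, emb in candidates.items():
        c = l2_normalize(emb)
        scored.append((sum(a * b for a, b in zip(u, c)), post_id))
    scored.sort(reverse=True)  # highest similarity first
    return [post_id for _, post_id in scored[:k]]
```

Because both vectors are unit length, the dot product is the cosine similarity, so embedding magnitude does not affect which posts are retrieved.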
GROUPED QUERY ATTENTION · RMSNORM · ROPE BASE 10,000 · FFN WIDENING 4.0 · ATTN CLIP 30.0 · HISTORY: 128 · CANDIDATES: 32 · 2 HASHES/ENTITY

Filter Stack (Execution Order)

Filters run before and after scoring. A filtered post never receives a score. A scored post can still be removed after scoring. Source: home-mixer/filters/.

PRE-SCORING FILTERS:
1. DropDuplicatesFilter: Removes duplicate tweet IDs
2. CoreDataHydrationFilter: Removes posts that failed metadata hydration from TES
3. AgeFilter: Removes posts older than MAX_POST_AGE (Snowflake ID timestamp math)
4. SelfTweetFilter: Removes user's own posts
5. RetweetDeduplicationFilter: Dedupes multiple reposts of same underlying content
6. IneligibleSubscriptionFilter: Removes paywalled content user cannot access
7. PreviouslySeenPostsFilter: Removes posts user already engaged with
8. PreviouslyServedPostsFilter: Removes posts already shown in current session
9. MutedKeywordFilter: Removes posts matching muted keywords (tokenized matching)
10. AuthorSocialgraphFilter: Removes posts from blocked or muted authors

─── SCORING RUNS HERE ───
─── TOP-K SELECTION RUNS HERE ───

POST-SCORING FILTERS:
11. VFFilter: Visibility Filtering (safety). Drops posts where SafetyResult.action == Action::Drop or any other FilteredReason value. Runs AFTER scoring.
12. DedupConversationFilter: Deduplicates branches of the same conversation thread

VFFilter runs AFTER scoring: safety does not affect ranking math · MutedKeywordFilter uses tokenized matching · AgeFilter uses Snowflake ID timestamps
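
The Snowflake timestamp math behind AgeFilter can be sketched as follows. The bit layout and epoch are the publicly documented Snowflake format; the function names, the max_age_ms parameter, and the inclusive comparison are assumptions of this example, since MAX_POST_AGE itself is not published.

```python
TWITTER_EPOCH_MS = 1288834974657  # Snowflake epoch (2010-11-04 UTC)


def snowflake_to_ms(tweet_id):
    """Extract the creation time in Unix milliseconds from a Snowflake ID.

    The upper bits of the ID, above the 22 worker and sequence bits,
    hold milliseconds since the Snowflake epoch.
    """
    return (tweet_id >> 22) + TWITTER_EPOCH_MS


def is_within_age(tweet_id, now_ms, max_age_ms):
    """True if the post is no older than max_age_ms (a stand-in for MAX_POST_AGE)."""
    return now_ms - snowflake_to_ms(tweet_id) <= max_age_ms
```

Because the timestamp is embedded in the ID, a post's age can be checked without any metadata lookup.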

Candidate & Query Features

Source: home-mixer/candidate_pipeline/candidate_features.rs, query_features.rs.

Post Candidate (PostCandidate struct)

tweet_id: i64
tweet_text: String
author_id: u64
in_reply_to_tweet_id: Option<u64>
ancestors: Vec<u64> // thread lineage
retweeted_tweet_id: Option<u64>
video_duration_ms: Option<i32>
subscription_author_id: Option<u64>
author_followers_count: Option<i32>
author_screen_name: Option<String>
in_network: Option<bool>
served_type: ServedType // ForYouInNetwork | ForYouPhoenixRetrieval
phoenix_scores: PhoenixScores // 19 action probabilities
weighted_score: Option<f64>
score: Option<f64> // final score after all scorers

User Query (ScoredPostsQuery struct)

user_id: i64
user_action_sequence: UserActionSequence
user_features: UserFeatures {
  muted_keywords: Vec<String>
  blocked_user_ids: Vec<i64>
  muted_user_ids: Vec<i64>
  followed_user_ids: Vec<i64>
  subscribed_user_ids: Vec<i64>
}
in_network_only: bool // disables Phoenix retrieval if true

What's Published vs. What's Not

In the open-source code

  • Full pipeline architecture and execution order
  • All 19 scored action types with proto names
  • Weighted scoring formula and normalization
  • Author diversity decay formula
  • OON penalty structure
  • Complete filter stack with ordering
  • Grok transformer architecture (attention, RoPE, RMSNorm)
  • Embedding configuration (hash counts, sequence lengths)
  • Candidate and query feature structs

NOT in the open-source code

  • Numeric weight values for all 19 actions (params.rs)
  • OON_WEIGHT_FACTOR value
  • AUTHOR_DIVERSITY_DECAY and FLOOR values
  • MIN_VIDEO_DURATION_MS threshold
  • MAX_POST_AGE threshold
  • PHOENIX_MAX_RESULTS and THUNDER_MAX_RESULTS
  • Top-K selection count
  • Trained model weights
  • WEIGHTS_SUM, NEGATIVE_WEIGHTS_SUM, NEGATIVE_SCORES_OFFSET

Source

Repository: github.com/xai-org/x-algorithm
Commit: aaa167b3de8a674587c53545a43c90eaad360010
Released: January 20, 2026
Promised update cadence: Every 4 weeks
Last update: January 20, 2026 (11 weeks before this analysis, about 7 weeks past the promised cadence)
Language: Rust (62.9%), Python (37.1%)
License: Apache 2.0
