X Algorithm — Technical Deep Dive

Source code analysis of xai-org/x-algorithm

Generated May 15, 2026 · commit e414c17

This is the technical companion to the content creator playbook. The May 15, 2026 commit (e414c17) is a 187-file rewrite — 18,263 line additions, 926 deletions. The three independent scorers (weighted, author-diversity, OON) have been collapsed into a single RankingScorer, with a new optional second-stage VMRanker (gRPC). A new BlenderSelector merges scored posts with ads, prompts, and Who-To-Follow modules into the final feed. A new Python module grox/ ships Grok-powered viral, safety, and spam classifiers. Per-post action prediction expanded from 19 to 22 signals — including a new negative not_dwelled penalty.

What Changed in This Release

Three scorers collapsed into one

The previous weighted_scorer.rs, author_diversity_scorer.rs, and oon_scorer.rs are deleted. A single RankingScorer now applies the weighted sum, score normalization, author diversity decay, and OON downweighting in one pass. Diversity is now computed against the full sorted batch (not just sequential pairs), and the OON factor branches into three values: topic-mode (TopicOonWeightFactor), new-user (NEW_USER_OON_WEIGHT_FACTOR), and default (OonWeightFactor).

home-mixer/scorers/ranking_scorer.rs

New optional second-stage VM Ranker

A new VMRanker scorer (feature-flagged via EnableVMRanker) sends the entire scored candidate set to an external gRPC service (xai_vm_ranker_proto) and replaces the score with the value-model output. Includes DPP (Determinantal Point Process) parameters: VMRankerDppTheta, VMRankerDppMaxSelectedRank. Value model selected via VMRankerValueModelId. The actual value-model implementation is NOT in the published code — only the client call site is visible.

home-mixer/scorers/vm_ranker.rs

Two-tier pipeline architecture

The single pipeline has been split into PhoenixCandidatePipeline (inner — produces scored posts) and ForYouCandidatePipeline (outer — assembles the final feed). The outer pipeline pulls scored posts from the inner one via ScoredPostsSource and combines them with ads, prompts, Who-To-Follow modules, and Push-To-Home posts.

candidate_pipeline/for_you_candidate_pipeline.rs
candidate_pipeline/phoenix_candidate_pipeline.rs

BlenderSelector + Ads module

Replaces the simple TopKScoreSelector for the For-You pipeline. Partitions candidates by item type (post / ad / WTF / prompt / push-to-home), then runs one of two ad-blending strategies (SafeGapAdsBlender or PartitionOrganicAdsBlender, selected via the AdsBlenderType param). Prompts insert at the front; WTF modules at WHO_TO_FOLLOW_POSITION; push-to-home pins at position 0.

home-mixer/selectors/blender_selector.rs
home-mixer/ads/{partition_organic_blender,safe_gap_blender,util}.rs

Three new scored action signals

The Phoenix scorer struct (PhoenixScores) now carries 22 fields, up from 19. New: quoted_vqv_score (video quality view inside a quote), click_dwell_time (continuous duration after click), and not_dwelled_score (probability the user scrolled past). The last of these is a new negative signal with a corresponding NotDwelledWeight in the params. Several previously aggregated signals (share_via_dm, share_via_copy_link, quoted_click) are now first-class scored actions in the ranking_scorer formula.

RankingScorer::ScoringWeights fields

Grox: a new Grok-powered content classifier subsystem

An entirely new top-level Python package grox/ ships 47 new files. Includes Grok-based classifiers for viral quality (BangerInitialScreen), comprehensive safety (PostSafetyScreenDeluxe, SafetyPtos), spam (SpamEapiLowFollowerClassifier), reply ranking (ReplyScorer), multimodal post embedding (v2 and v5 embedders), ASR for audio/video, and a plan/task/scheduler framework. Grox imports grox.config, grox.lm, grox.prompts, grok_sampler, monitor, and strato_http — none of which are in the repo. Grox cannot be run as published.

grox/ — 47 files, no entry point

9 new candidate sources

In addition to the original Thunder and Phoenix sources: ads_source, cached_posts_source, phoenix_moe_source (Mixture-of-Experts retrieval), phoenix_topics_source, prompts_source, push_to_home_source, scored_posts_source, tweet_mixer_source, who_to_follow_source.

home-mixer/sources/

17 new query hydrators

The query is now hydrated with substantially more context: blocked/muted/followed/subscribed user IDs, cached posts, followed and inferred Grok topics, followed starter packs, impressed posts, an impression bloom filter, IP, mutual-follow lists, past-request timestamps, retrieval/scoring sequences, served history, user demographics, and an inferred-gender feature (UserInferredGender).

home-mixer/query_hydrators/ (18 files)

14 new side effects

Logging and state-mutation now broken out into discrete side effects: ads-injection logging, client-events Kafka, For-You response stats, mutual-follow stats, Phoenix experiments, Phoenix request cache, publish-seen-ids Kafka, Redis post-candidate cache, reranking Kafka, scored-stats, served-candidates Kafka, truncate-served-history, update-past-request-timestamps, update-served-history.

home-mixer/side_effects/ (14 files)

New filters (topic-aware)

New filters: topic_ids_filter.rs (571 lines — major topic filtering logic), new_user_topic_ids_filter.rs, previously_seen_posts_backup_filter.rs, ancillary_vf_filter.rs, video_filter.rs. Many existing filters were also significantly modified.

home-mixer/filters/

11 new candidate hydrators

Including ads brand-safety (two variants), blocked_by, engagement_counts, filtered_topics, following_replied_users, has_media, language_code, mutual_follow_jaccard (Jaccard similarity over follow graphs), quote_hydrator, tweet_type_metrics.

home-mixer/candidate_hydrators/

Phoenix model artifacts now published

A new phoenix/artifacts/oss-phoenix-artifacts.zip Git LFS pointer (to a 3.1 GB archive) ships actual trained weights: retrieval transformer + candidate tower (~3 MB), 1M-entry hash embedding tables (~1.4 GB each for retrieval and ranker), a 537K-post sports-only retrieval corpus, and a config. This is a "mini" 4-layer / 128-dim model trained on real engagement data. It is explicitly NOT the production model, which is larger and trains continuously.

phoenix/artifacts/oss-phoenix-artifacts.zip

System Architecture (Updated)

The For You feed is now a two-pipeline system. The inner Phoenix pipeline produces ranked posts. The outer ForYou pipeline assembles the final feed by mixing posts with ads, prompts, and other modules. Retrieval is multi-source (Thunder + Phoenix + Phoenix-MoE + Phoenix-Topics + Tweet Mixer + Push-To-Home + cached posts).

Multi-Source Retrieval

Eleven sources are defined under home-mixer/sources/. Seven feed the post candidate pool: Thunder (in-network), Phoenix (out-of-network), Phoenix-MoE (Mixture-of-Experts retrieval), Phoenix-Topics (topic-based retrieval over followed/inferred Grok topics), Tweet Mixer, Cached Posts, and Push-To-Home. Ads, Who-To-Follow, and Prompts join later in the outer pipeline, and ScoredPostsSource hands the inner pipeline's scored posts to the outer one.

Hydration & Pre-Scoring Filters

Candidate hydrators enrich posts (engagement counts, language, media, quote ancestry, brand safety, etc.). Then the pre-scoring filter stack removes duplicates, old posts, blocked/muted, seen/served, paywalled, and posts failing core-data hydration.

Phoenix Inference (Grok Transformer)

The PhoenixScorer calls the Phoenix prediction service (gRPC). New-user requests can be routed to a separate cluster via PhoenixRankerNewUserInferenceClusterId when the user's action-sequence length is below PhoenixRankerNewUserHistoryThreshold. Returns 22 values per candidate: 20 per-action probabilities plus 2 continuous dwell signals.

RankingScorer (Unified)

Single scorer applies weighted sum, score normalization, author diversity decay, and OON downweight in one pass. Replaces three separate scorers from the previous release.

VMRanker (Optional, Feature-Flagged)

If EnableVMRanker is set, the scored candidate set is shipped via gRPC to a value-model service (VMRankerClient). The service applies a separate value-model + optional DPP-based diversity reranking and returns new scores. The proto/service implementation is NOT in the open-source release.

BlenderSelector

Partitions all candidates by type, runs the configured ads blender (safe_gap or partition_organic) to interleave ads, then inserts prompts (at the front), Who-To-Follow modules (at WHO_TO_FOLLOW_POSITION), and pins any Push-To-Home post at position 0.

Post-Selection Filtering & Side Effects

VFFilter (safety) and DedupConversationFilter run after selection. Then 14 side effects fire: Kafka publishes (client events, served candidates, seen IDs, reranking), Redis caching, served-history updates, response stats, Phoenix experiments logging, and timestamp/history maintenance.

2 PIPELINES (PHOENIX + FORYOU) · 11 SOURCES · 22 SCORED ACTIONS · 14 SIDE EFFECTS · UNIFIED RANKINGSCORER · OPTIONAL VMRANKER

Scoring Pipeline

Two scorers run in sequence. The first (PhoenixScorer) populates raw per-action probabilities. The second (RankingScorer) collapses them to a single score with diversity + OON adjustments. An optional third scorer (VMRanker) can replace the score with a value-model output.

PhoenixScorer

phoenix_scorer.rs

Calls the Phoenix prediction service (gRPC) and populates PhoenixScores on each candidate. Routes new users (action-sequence shorter than PhoenixRankerNewUserHistoryThreshold) to a separate inference cluster. An egress sidecar can be enabled via UseEgressSidecar with automatic fallback to the primary client on failure. Product surface routes between HomeTimelineRankedFollowing (in-network-only mode) and HomeTimelineRanking (full).
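The routing and fallback described above can be sketched as follows. This is a minimal illustration, not the published Rust: the function names, the callable-based client interface, and the cluster labels are hypothetical.

```python
from typing import Callable


def pick_cluster(action_sequence_len: int, new_user_history_threshold: int,
                 new_user_cluster: str, default_cluster: str) -> str:
    """Route short-history (new) users to the dedicated inference cluster."""
    if action_sequence_len < new_user_history_threshold:
        return new_user_cluster
    return default_cluster


def predict_with_fallback(request: dict,
                          primary: Callable[[dict], dict],
                          sidecar: Callable[[dict], dict],
                          use_egress_sidecar: bool) -> dict:
    """Try the egress sidecar when enabled; fall back to the primary client on failure."""
    if use_egress_sidecar:
        try:
            return sidecar(request)
        except Exception:
            pass  # assumption: any sidecar error triggers the fallback
    return primary(request)
```

Which errors actually trigger the fallback is not stated in this analysis; catching every exception is a simplification.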

PhoenixScores fields (22 total):
favorite_score              reply_score
retweet_score               photo_expand_score
click_score                 profile_click_score
vqv_score                   share_score
share_via_dm_score          share_via_copy_link_score
dwell_score                 quote_score
quoted_click_score          quoted_vqv_score          ← NEW
follow_author_score         not_interested_score
block_author_score          mute_author_score
report_score                not_dwelled_score         ← NEW
dwell_time (continuous)     click_dwell_time          ← NEW (continuous)

RankingScorer (Unified)

ranking_scorer.rs

Replaces the three previous scorers. Applies all four operations on each candidate in one pass:

weighted_score = Σ(weight_i × P(action_i))
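As a minimal sketch of this sum: the weight values below are hypothetical placeholders, since the real constants live in the unpublished params.rs, and the function and dict names are illustrative only.

```python
# Hypothetical weights for illustration; the real values are in params.rs,
# which is not published. Negative actions carry negative weights.
WEIGHTS = {
    "favorite": 1.0,
    "reply": 2.0,
    "not_interested": -5.0,
    "not_dwelled": -0.5,
}


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Sum weight_i * P(action_i) over every action that has a weight."""
    return sum(w * scores.get(action, 0.0) for action, w in weights.items())


example = {"favorite": 0.5, "reply": 0.25, "not_interested": 0.1, "not_dwelled": 0.5}
print(weighted_score(example, WEIGHTS))  # 0.5 + 0.5 - 0.5 - 0.25
```

With these made-up numbers the two negative predictions cancel most of the positive engagement, which is the practical effect of adding not_dwelled as a weighted term.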

Score offset normalization has three branches:

If combined_score < 0 (negative branch): offset_score = (combined_score + negative_sum) / total_sum × NEGATIVE_SCORES_OFFSET
If combined_score ≥ 0: offset_score = combined_score + NEGATIVE_SCORES_OFFSET
Fallback, if total_sum == 0: offset_score = max(combined_score, 0.0)
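The offset branches can be sketched as one function. The numeric arguments in the example are hypothetical, and checking the total_sum == 0 fallback first is an assumption made here to avoid dividing by zero.

```python
def offset_score(combined_score: float, negative_sum: float,
                 total_sum: float, negative_scores_offset: float) -> float:
    """Shift a weighted score so that negative totals stay ordered but compressed."""
    if total_sum == 0:
        # Fallback: no weights configured, so just clamp at zero.
        return max(combined_score, 0.0)
    if combined_score < 0:
        return (combined_score + negative_sum) / total_sum * negative_scores_offset
    return combined_score + negative_scores_offset


# Hypothetical constants: negative_sum=4.0, total_sum=10.0, offset=1.0
print(offset_score(-2.0, 4.0, 10.0, 1.0))
print(offset_score(0.5, 4.0, 10.0, 1.0))
```

With these example constants, negative scores land in the band from 0 up to 0.4, while every non-negative score starts at 1.0, so a net-negative post always ranks below a net-positive one.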

After weighting, each candidate's score is normalized via util::score_normalizer::normalize_score(c, raw) (implementation not in open source).

Author diversity then applies geometric decay over rank position per author:

multiplier(position) = (1 − floor) × decay^position + floor
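A minimal sketch of this decay, assuming candidates are visited in descending score order and that position counts an author's earlier appearances starting from 0. The tuple layout, function names, and the decay and floor values are illustrative, not taken from the repo.

```python
from collections import defaultdict


def diversity_multiplier(position: int, decay: float, floor: float) -> float:
    """(1 - floor) * decay^position + floor, as in the formula above."""
    return (1.0 - floor) * decay ** position + floor


def apply_author_diversity(candidates, decay: float, floor: float):
    """candidates: list of (post_id, author_id, score). Returns (post_id, adjusted)."""
    seen = defaultdict(int)  # how many posts by each author were already visited
    ordered = sorted(candidates, key=lambda c: c[2], reverse=True)
    adjusted = []
    for post_id, author_id, score in ordered:
        position = seen[author_id]  # assumption: first appearance is position 0
        adjusted.append((post_id, score * diversity_multiplier(position, decay, floor)))
        seen[author_id] += 1
    return adjusted


posts = [("a", 1, 10.0), ("b", 1, 8.0), ("c", 2, 6.0)]
print(apply_author_diversity(posts, decay=0.5, floor=0.25))
# [('a', 10.0), ('b', 5.0), ('c', 6.0)]
```

In this example the second post by author 1 drops from 8.0 to 5.0 and falls behind author 2's post, while the floor keeps the multiplier from ever reaching zero.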

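Candidates are first sorted by weighted score descending. Each successive appearance of the same author in that ordering is scaled by the multiplier at the next position, so the factor falls geometrically toward floor rather than to zero. Finally, out-of-network candidates are downweighted: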

final = diversity_adjusted × effective_oon_weight

Three branches for effective_oon_weight:

If query.topic_ids non-empty: TopicOonWeightFactor
If user is "eligible new user" (account age < NewUserAgeThresholdSecs AND followed count ≥ NEW_USER_MIN_FOLLOWING): NEW_USER_OON_WEIGHT_FACTOR
Otherwise: OonWeightFactor
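The three branches can be sketched as below. All numeric values are hypothetical (the real ones are in the unpublished params.rs), the dict-based params lookup is illustrative, and leaving in-network candidates unchanged is an assumption based on the factor being described as an out-of-network downweight.

```python
# Hypothetical values; only the parameter names come from the source analysis.
PARAMS = {
    "OonWeightFactor": 0.8,
    "TopicOonWeightFactor": 0.9,
    "NEW_USER_OON_WEIGHT_FACTOR": 0.6,
    "NewUserAgeThresholdSecs": 7 * 24 * 3600,
    "NEW_USER_MIN_FOLLOWING": 5,
}


def effective_oon_weight(topic_ids, account_age_secs, followed_count, params):
    """Pick the out-of-network factor: topic mode, eligible new user, or default."""
    if topic_ids:
        return params["TopicOonWeightFactor"]
    eligible_new_user = (account_age_secs < params["NewUserAgeThresholdSecs"]
                         and followed_count >= params["NEW_USER_MIN_FOLLOWING"])
    if eligible_new_user:
        return params["NEW_USER_OON_WEIGHT_FACTOR"]
    return params["OonWeightFactor"]


def final_score(diversity_adjusted, in_network, oon_weight):
    """Assumption: only out-of-network candidates are multiplied by the factor."""
    return diversity_adjusted if in_network else diversity_adjusted * oon_weight
```

Note that the topic branch is checked first, so a topic-mode request from a new user uses TopicOonWeightFactor.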

All weight and threshold constants (favorite, reply, retweet, photo_expand, click, profile_click, vqv, share, share_via_dm, share_via_copy_link, dwell, quote, quoted_click, quoted_vqv, cont_dwell_time, cont_click_dwell_time, follow_author, not_interested, block_author, mute_author, report, not_dwelled, AuthorDiversityDecay, AuthorDiversityFloor, OonWeightFactor, TopicOonWeightFactor, NEW_USER_OON_WEIGHT_FACTOR, NEGATIVE_SCORES_OFFSET, MinVideoDurationMs) are referenced from params.rs, which is NOT in the open-source release.

VMRanker (optional)

vm_ranker.rs

Feature-flagged via EnableVMRanker. When enabled, builds a RankRequest with all PhoenixScores + candidate metadata (tweet_id, author_id, in_network, is_retweet, is_reply, author_followers_count, vqv_ineligible, retweeted_tweet_id, current ranking_scorer score) and ships it via gRPC to VMRankerCluster resolved from VMRankerClusterId. The returned score replaces the ranking_scorer score; on missing entries the previous score is preserved. Includes optional DPP parameters: if either VMRankerDppTheta > 0 or VMRankerDppMaxSelectedRank > 0, sends DppParams. Value-model selection via VMRankerValueModelId. The remote service implementation is NOT in the open-source release — the DPP math, value-model architecture, and reranking logic all live behind the gRPC boundary.
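Two client-side behaviors described above, the DPP gate and the score merge, can be sketched as follows. The dict shapes and function names are hypothetical; the real request and response types come from the unpublished xai_vm_ranker_proto crate.

```python
from typing import Optional


def build_dpp_params(theta: float, max_selected_rank: int) -> Optional[dict]:
    """Attach DPP parameters only when at least one knob is positive."""
    if theta > 0 or max_selected_rank > 0:
        return {"theta": theta, "max_selected_rank": max_selected_rank}
    return None


def merge_vm_scores(current: dict[int, float], returned: dict[int, float]) -> dict[int, float]:
    """Replace each score with the value-model output; keep the old score if missing."""
    return {tweet_id: returned.get(tweet_id, score) for tweet_id, score in current.items()}


ranking_scores = {101: 0.7, 102: 0.4, 103: 0.9}
vm_response = {101: 0.2, 103: 1.5}  # 102 is absent from the response
print(merge_vm_scores(ranking_scores, vm_response))
# {101: 0.2, 102: 0.4, 103: 1.5}
```

One consequence of preserving the previous score is that, for any candidate missing from the response, a ranking_scorer value is compared directly against value-model outputs. Whether the two scales are comparable cannot be determined from the published code.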

All 22 Scored Action Types

These are the actions the ranking model predicts. Each has a probability (or continuous value) and a weight from params.rs. Three are new in this release.

Action	Proto / Field	Type	Description
favorite	favorite_score	+	User liked the post
reply	reply_score	+	User replied to the post
retweet	retweet_score	+	User reposted
photo_expand	photo_expand_score	+	User expanded an image
click	click_score	+	User clicked into thread/media
profile_click	profile_click_score	+	User visited author's profile
vqv	vqv_score	+*	Video quality view (conditional on `video_duration_ms > MinVideoDurationMs` via `util::candidates_util::vqv_weight`)
share	share_score	+	User shared the post
share_via_dm	share_via_dm_score	+	Sent via direct message
share_via_copy_link	share_via_copy_link_score	+	Copied link to clipboard
dwell	dwell_score	+	User dwelled on post (boolean threshold)
quote	quote_score	+	User quoted the post
quoted_click	quoted_click_score	+	User clicked into a quoted post
quoted_vqv NEW	quoted_vqv_score	+*	Video quality view inside a quote post (conditional on duration via `quoted_vqv_weight`, gated by `EnableQuotedVqvDurationCheck`)
cont_dwell_time	dwell_time (f64 seconds)	+ (continuous)	Continuous dwell duration. Scales linearly with watch/read time, unlike binary signals.
cont_click_dwell_time NEW	click_dwell_time (f64 seconds)	+ (continuous)	Continuous dwell duration after a click (e.g., time in thread or external article). Brand new continuous signal in this release.
follow_author	follow_author_score	+	User followed the author from the post
not_interested	not_interested_score	−	User marked "not interested"
block_author	block_author_score	−	User blocked the author
mute_author	mute_author_score	−	User muted the author
report	report_score	−	User reported the post
not_dwelled NEW	not_dwelled_score	−	Probability user scrolled past without dwelling. Brand-new negative signal in this release. Previously, no-dwell carried no weight; it is now an explicit downvote with weight `NotDwelledWeight`.

* VQV weight is conditional via util::candidates_util::vqv_weight which checks video_duration_ms against MinVideoDurationMs. quoted_vqv applies the same gate when EnableQuotedVqvDurationCheck is true.

** Two of the 22 are continuous (f64 seconds) rather than 0–1 probabilities: dwell_time and click_dwell_time. Both contribute linearly with attention duration.

*** The Phoenix demo model artifact published with this release was trained on 19 actions (per phoenix/test_recsys_model.py config); the 22-action expansion shows up in the Rust scoring code and reflects what production is using.
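The duration gate in the first footnote can be sketched as below. The real helpers live in the unpublished util::candidates_util, so the signatures here are hypothetical, and two behaviors are assumptions: an ineligible video contributes weight 0.0, and the quoted variant skips the check entirely when the flag is off.

```python
from typing import Optional


def vqv_weight(video_duration_ms: Optional[int], min_video_duration_ms: int,
               weight: float) -> float:
    """Apply the VQV weight only to videos longer than the minimum duration."""
    if video_duration_ms is not None and video_duration_ms > min_video_duration_ms:
        return weight
    return 0.0  # assumption: no video, or a too-short video, gets no VQV credit


def quoted_vqv_weight(quoted_duration_ms: Optional[int], min_video_duration_ms: int,
                      weight: float, enable_duration_check: bool) -> float:
    """Same gate for the quoted video, but only when the feature flag is on."""
    if not enable_duration_check:
        return weight  # assumption: flag off means the weight applies unconditionally
    return vqv_weight(quoted_duration_ms, min_video_duration_ms, weight)
```

The example threshold used in any call is a placeholder, since MinVideoDurationMs is not published.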

Blender Selector & Ads Pipeline

The new outer For-You pipeline assembles posts, ads, prompts, and Who-To-Follow modules into a single ordered feed. Source: home-mixer/selectors/blender_selector.rs, home-mixer/ads/.

FeedItem Variants

The new FeedItem proto wraps five distinct item types: Post(ScoredPost), Ad(AdIndexInfo), WhoToFollow(WhoToFollowModule), Prompt(Prompt), PushToHome(PushToHomePost).

Posts now compete with non-post items for slots.

Ads Blenders

Two implementations of the AdsBlender trait: SafeGapAdsBlender (enforces a minimum gap between ads) and PartitionOrganicAdsBlender (the default, which partitions organic and ad slots). The AdsBlenderType string parameter selects between them: "safe_gap" picks SafeGapAdsBlender, and any other value falls through to PartitionOrganicAdsBlender.
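A rough sketch of the selection and of what a minimum-gap rule could look like. Only the selection behavior is described in this analysis; the gap logic, the min_gap parameter, and both function names are hypothetical stand-ins rather than the published blender.

```python
def select_blender(ads_blender_type: str) -> str:
    """'safe_gap' picks the gap-enforcing blender; anything else uses the default."""
    if ads_blender_type == "safe_gap":
        return "SafeGapAdsBlender"
    return "PartitionOrganicAdsBlender"


def blend_with_min_gap(posts: list, ads: list, min_gap: int) -> list:
    """Hypothetical gap rule: place an ad only after min_gap organic posts."""
    feed = []
    ad_iter = iter(ads)
    next_ad = next(ad_iter, None)
    since_last_ad = 0
    for post in posts:
        feed.append(post)
        since_last_ad += 1
        if next_ad is not None and since_last_ad >= min_gap:
            feed.append(next_ad)
            next_ad = next(ad_iter, None)
            since_last_ad = 0
    return feed  # ads that never found a slot are simply left out


print(blend_with_min_gap(["p1", "p2", "p3", "p4", "p5"], ["a1", "a2", "a3"], min_gap=2))
# ['p1', 'p2', 'a1', 'p3', 'p4', 'a2', 'p5']
```

In the example the third ad never finds a slot, which mirrors the idea that unplaced ads are dropped rather than forced into the feed.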

Brand safety hydrators (ads_brand_safety_hydrator, ads_brand_safety_vf_hydrator) enrich ad candidates before blending.

Insertion Order

The selector runs in this order: (1) ads blender produces blended posts+ads, (2) prompts inserted at the front (positions 0..n by index), (3) one Who-To-Follow module inserted at index WHO_TO_FOLLOW_POSITION - 1 (the offset suggests the constant is a 1-based slot number), (4) Push-To-Home post pinned at position 0.

Push-To-Home always wins position 0 — opening a notification overrides organic ranking.
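The insertion order can be sketched with plain list operations. Items are represented as strings for brevity, the function name is hypothetical, the position value in the example is made up (the real WHO_TO_FOLLOW_POSITION is not published), and clamping the index to the feed length is an assumption.

```python
def assemble_feed(blended, prompts, wtf_module, push_to_home, wtf_position):
    """Apply the four insertion steps in order on a copy of the blended list."""
    feed = list(blended)                      # (1) output of the ads blender
    for index, prompt in enumerate(prompts):  # (2) prompts go to the front
        feed.insert(index, prompt)
    if wtf_module is not None:                # (3) one Who-To-Follow module
        feed.insert(min(wtf_position - 1, len(feed)), wtf_module)
    if push_to_home is not None:              # (4) pinned last, so it lands at 0
        feed.insert(0, push_to_home)
    return feed


print(assemble_feed(["p1", "ad1", "p2", "p3"], ["prompt"], "wtf", "pth", wtf_position=3))
# ['pth', 'prompt', 'p1', 'wtf', 'ad1', 'p2', 'p3']
```

Because the Push-To-Home post is inserted last, it shifts everything else down by one, including any prompt that was at the front.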

Non-Selected Placeholders

Dropped posts and ads (those not selected after blending) are emitted as non_selected placeholders in the SelectResult for downstream logging via side effects like scored_stats_side_effect and ads_injection_logging_side_effect.

Grox: Grok-Powered Content Classification

A new top-level Python package grox/ introduces a content-classification subsystem. It pre-processes posts before they enter the recommendation system. Source: grox/ (47 files, 6,000+ lines).

BangerInitialScreen

Vision-language model classifier (VLM_PRIMARY, temperature 1e-6) that scores posts on viral quality, slop_score, has_minor_score, and produces taxonomy categories + a description. Posts that don't clear the screen may be filtered before reaching the ranker.

Source: grox/classifiers/content/banger_initial_screen.py

PostSafetyScreenDeluxe + SafetyPtos

Two-stage safety pipeline. PostSafetyScreenDeluxe performs comprehensive screening; SafetyPtos (Policies and Terms of Service) classifies category and policy violations separately.

Source: grox/classifiers/content/post_safety_screen_deluxe.py, safety_ptos.py

SpamEapiLowFollowerClassifier

Spam detection targeting low-follower accounts via the EAPI surface. Feeds into rate-limiting decisions.

Source: grox/classifiers/content/spam.py

ReplyScorer

Grok-powered reply ranking. Replies are no longer sorted purely chronologically or by engagement — Grok scores them too.

Source: grox/classifiers/content/reply_ranking.py

Multimodal Post Embedders (v2 & v5)

Two generations of multimodal (image + text) post embedders. Embeddings are published to Kafka via task_write_mm_embedding_sink and feed downstream retrieval and ranking.

Source: grox/embedder/multimodal_post_embedder_v2.py, v5.py

ASR Processor

Automatic speech recognition for video posts. Audio is transcribed and the transcript becomes input to the embedder and summarizer for downstream classification.

Source: grox/data_loaders/asr_processor.py

Plans, Tasks, Schedules

A general execution framework: plans/ (initial_banger, master, post_embedding_v5, post_safety, reply_ranking, safety_ptos, spam_comment), tasks/ (24 task definitions including pub, rate_limit, ASR, banger_screen, embedding_pub, multimodal_post_embedding, post_safety, summarizer), schedules/, dispatcher.py, engine.py.

Heavy Internal Dependencies

Grox imports grox.config.config, grox.lm (post, user, convo), grox.prompts.template, grok_sampler.config, grok_sampler.vision_sampler, monitor.metrics, and strato_http.queries.grok_topics — none of which are in the published repo.

Grox cannot be run as published. Released as architectural disclosure, not as runnable code.

Phoenix Grok Transformer Architecture

The Phoenix ranking model. Source: phoenix/grok.py, phoenix/recsys_model.py, phoenix/recsys_retrieval_model.py.

Candidate Isolation Mask

Lower triangular causal mask for user + history. Candidates attend to user history and themselves, but NOT to other candidates. Each candidate's score is independent of batch composition — making it consistent and cacheable.

Scores are absolute, not relative to other candidates in the batch.
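A minimal sketch of such a mask, assuming the sequence is laid out as user and history tokens first, followed by the candidates. The function name and the boolean-matrix representation are illustrative; the actual implementation is in the Phoenix Python sources.

```python
def candidate_isolation_mask(history_len: int, num_candidates: int) -> list[list[bool]]:
    """mask[i][j] is True when position i may attend to position j."""
    total = history_len + num_candidates
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < history_len:
                # User/history rows: ordinary causal (lower-triangular) attention.
                mask[i][j] = j <= i
            else:
                # Candidate rows: all of the history, plus the candidate itself.
                mask[i][j] = j < history_len or j == i
    return mask


for row in candidate_isolation_mask(history_len=2, num_candidates=2):
    print("".join("1" if allowed else "0" for allowed in row))
# 1000
# 1100
# 1110
# 1101
```

The last two rows show the isolation: each candidate sees both history positions and itself, but the two candidates never see each other.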

Attention & Normalization

Multi-head grouped query attention with configurable num_q_heads and num_kv_heads. RMSNorm. Attention clipping via tanh at max_attn_val = 30.0. Masking value -1e30.

Same architecture family as Grok LLM (RMSNorm + RoPE + grouped query attention).
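The clipping and masking constants can be illustrated with a small pure-Python softmax. The form max_attn_val × tanh(logit / max_attn_val) and the order of operations (cap first, then mask) are assumptions based on how tanh soft-capping is commonly written; the helper names are hypothetical.

```python
import math

MAX_ATTN_VAL = 30.0
MASK_VALUE = -1e30


def soft_cap(logit: float) -> float:
    """Smoothly squash a logit into (-30, 30); near zero it is almost unchanged."""
    return MAX_ATTN_VAL * math.tanh(logit / MAX_ATTN_VAL)


def masked_softmax(logits: list[float], mask: list[bool]) -> list[float]:
    """Cap each logit, overwrite disallowed positions, then normalize."""
    capped = [soft_cap(x) if keep else MASK_VALUE for x, keep in zip(logits, mask)]
    peak = max(capped)
    exps = [math.exp(x - peak) for x in capped]  # masked entries underflow to 0.0
    total = sum(exps)
    return [e / total for e in exps]


weights = masked_softmax([100.0, 1.0, 5.0], [True, True, False])
print(weights)
```

The cap keeps a single very large logit from producing extreme values before the softmax, and the large negative mask value drives disallowed positions to exactly zero weight.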

Rotary Positional Embeddings

RoPE with base exponent 10,000. Encodes sequence position so the model can weight recent engagement differently from older history.

Position-aware. RoPE encodes relative order, so the model can learn to weight recent actions differently; the recency weighting itself is learned, not built in.
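A minimal sketch of rotary embeddings with base 10,000, written in pure Python. The interleaved pairing of dimensions is an assumption (some implementations split the vector into halves instead), and the function names are illustrative.

```python
import math


def rope(vec: list[float], position: int, base: float = 10000.0) -> list[float]:
    """Rotate each pair (vec[i], vec[i+1]) by the angle position * base^(-i/d)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
# Same offset (2) at different absolute positions gives the same attention logit.
print(dot(rope(q, 5), rope(k, 3)), dot(rope(q, 12), rope(k, 10)))
```

The two printed values agree up to floating-point error, which is the property that matters here: the query-key score depends on how far apart two positions are, not on where they sit in the sequence.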

Embedding Architecture

Hash-based embeddings: 2 hashes per user/item/author by default. Action embeddings via multi-hot to signed vector (2*action - 1) with learned projection. Product surface: categorical vocab size 16.

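Hash embeddings sidestep the out-of-vocabulary side of cold-start. A new entity maps to existing table rows immediately, though those rows are shared with colliding IDs and carry no entity-specific signal until trained.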
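The two encoding steps can be sketched as below. The hash function (BLAKE2b with a seed prefix) and the table size are stand-ins chosen for illustration, and how the looked-up rows are combined is not covered by this analysis, so the sketch stops at the row indices.

```python
import hashlib

NUM_HASHES = 2          # 2 hashes per user/item/author, per the text above
TABLE_SIZE = 1_000_000  # illustrative; matches the 1M-entry mini config


def hash_slots(entity_id: int, num_hashes: int = NUM_HASHES,
               table_size: int = TABLE_SIZE) -> list[int]:
    """Map any ID, seen or unseen, to num_hashes rows of a fixed-size table."""
    slots = []
    for seed in range(num_hashes):
        digest = hashlib.blake2b(f"{seed}:{entity_id}".encode(), digest_size=8).digest()
        slots.append(int.from_bytes(digest, "big") % table_size)
    return slots


def signed_actions(multi_hot: list[int]) -> list[int]:
    """Turn a 0/1 action vector into -1/+1 via 2*action - 1."""
    return [2 * a - 1 for a in multi_hot]


print(hash_slots(1234567890123))
print(signed_actions([1, 0, 0, 1]))  # [1, -1, -1, 1]
```

The signed form lets the learned projection distinguish an action that did not happen (-1) from an input that contributes nothing (0).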

Released Mini Config

The model artifact zip ships a smaller-than-production model: 128-dim embeddings, 4 transformer layers, 4 attention heads, key size 32, widening factor 2, history sequence length 127, candidate sequence length 64, 1M-entry user/item/author vocab, 19 action types.

Production uses a larger model with more layers and wider embeddings — explicitly stated in Phoenix README.

Two-Tower Retrieval

User tower: Grok transformer → average pool → L2-normalize. Candidate tower: 2-layer MLP with SiLU → L2-normalize. Dot-product similarity for top-K retrieval. EPS: 1e-12.

Retrieval is separate from ranking. A post must be retrieved before it can be scored.
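The retrieval step can be sketched without the towers themselves: given already-computed vectors, pool, normalize, and rank by dot product. Using EPS as a lower clamp on the norm is an assumption about where that constant applies, and the function names and toy vectors are illustrative.

```python
import math

EPS = 1e-12


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale to unit length; EPS guards against dividing by a zero norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, EPS) for x in vec]


def average_pool(rows: list[list[float]]) -> list[float]:
    """Mean over sequence positions, one value per embedding dimension."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def top_k(user_vec: list[float], candidates: dict[str, list[float]], k: int) -> list[str]:
    """Rank candidates by dot product of unit vectors (cosine similarity)."""
    u = l2_normalize(user_vec)
    scored = [(sum(a * b for a, b in zip(u, l2_normalize(vec))), cid)
              for cid, vec in candidates.items()]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]


user = average_pool([[1.0, 0.0], [1.0, 0.0]])
corpus = {"a": [2.0, 0.0], "b": [0.0, 3.0], "c": [1.0, 1.0]}
print(top_k(user, corpus, k=2))  # ['a', 'c']
```

Because both sides are normalized, vector length carries no weight in retrieval; only direction does.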

GROUPED QUERY ATTENTION · RMSNORM · ROPE BASE 10,000 · ATTN CLIP 30.0 · HISTORY: 127 · CANDIDATES: 64 · 2 HASHES/ENTITY · MINI: 4 LAYERS / 128 DIM

Query Hydrators (New)

Before any candidates are fetched, the query is hydrated with the user's full context. The May 15 release adds 17 new hydrators producing a much richer per-request feature set. Source: home-mixer/query_hydrators/.

QUERY HYDRATION FIELDS:
blocked_user_ids               muted_user_ids
followed_user_ids              subscribed_user_ids
followed_grok_topics           inferred_grok_topics       ← AI-derived topic graph
followed_starter_packs         filtered_topics
impressed_posts                impression_bloom_filter    ← compact impression dedup
cached_posts                   served_history
retrieval_sequence             scoring_sequence
mutual_follow                  mutual_follow_jaccard      ← Jaccard sim on follow graphs
past_request_timestamps        ip
user_demographics              user_inferred_gender       ← inferred from behavior

USER_INFERRED_GENDER: behavioral inference · GROK TOPICS: followed + inferred · JACCARD: mutual-follow similarity · BLOOM FILTER: impression dedup
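For reference, Jaccard similarity over two follow sets is the size of the intersection divided by the size of the union. Which sets the mutual_follow_jaccard feature actually compares is not spelled out here, so the inputs below are an assumption for illustration.

```python
def jaccard(a: set[int], b: set[int]) -> float:
    """|A intersect B| / |A union B|, defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


viewer_follows = {1, 2, 3, 4}
author_follows = {3, 4, 5}
print(jaccard(viewer_follows, author_follows))  # 2 shared / 5 total = 0.4
```

The value ranges from 0.0 (no overlap) to 1.0 (identical follow sets).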

Filter Stack (Updated)

Filters run before and after scoring. The May 15 release adds 5 new filters and modifies most of the existing ones. Source: home-mixer/filters/.

PRE-SCORING FILTERS:
 1. DropDuplicatesFilter             — Removes duplicate tweet IDs
 2. CoreDataHydrationFilter          — Removes posts that failed metadata hydration
 3. AgeFilter                        — Removes posts older than threshold
 4. SelfTweetFilter                  — Removes user's own posts
 5. RetweetDeduplicationFilter       — Dedupes reposts of the same underlying content
 6. IneligibleSubscriptionFilter     — Removes paywalled content user cannot access
 7. PreviouslySeenPostsFilter        — Removes posts the user has already seen
 8. PreviouslySeenPostsBackupFilter  — Bloom-filter backup for seen-post deduplication ← NEW
 9. PreviouslyServedPostsFilter      — Removes posts already shown in current session
10. MutedKeywordFilter               — Removes posts matching muted keywords
11. AuthorSocialgraphFilter          — Removes posts from blocked/muted authors
12. TopicIdsFilter                   — Topic-based filtering (571 lines)         ← NEW
13. NewUserTopicIdsFilter            — Topic filtering specific to new-user flow   ← NEW
14. VideoFilter                      — Filters specific to video candidates       ← NEW
15. AncillaryVfFilter                — Pre-scoring safety filter pass             ← NEW

    ─── PHOENIX SCORING RUNS HERE ───
    ─── RANKINGSCORER RUNS HERE ───
    ─── OPTIONAL VMRANKER RUNS HERE ───
    ─── TOP-K SELECTION RUNS HERE ───

POST-SELECTION FILTERS:
- VFFilter                          — Visibility Filtering (safety drop)
- DedupConversationFilter           — Deduplicates thread branches

VFFilter still runs AFTER scoring · Topic filtering is new and substantial · Bloom filter for seen-post dedup
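To show why a Bloom filter suits seen-post dedup, here is a generic textbook version. It is not the repo's implementation: the class, the bit count, the hash choice, and the filtering helper are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Compact set membership: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: int):
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item: int) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: int) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def drop_seen(candidate_ids: list[int], seen: BloomFilter) -> list[int]:
    """Keep only candidates the filter has definitely not recorded."""
    return [pid for pid in candidate_ids if pid not in seen]


seen_posts = BloomFilter(num_bits=1 << 16, num_hashes=3)
for post_id in (101, 103):
    seen_posts.add(post_id)
kept = drop_seen([101, 102, 103, 104], seen_posts)
```

The trade-off is one-sided: a post that was seen is always dropped, while an unseen post is occasionally dropped by a false positive, which is an acceptable cost for a backup dedup pass.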

Candidate & Query Features (Updated Structs)

Source: home-mixer/models/candidate.rs, candidate_features.rs, query.rs, user_features.rs.

PostCandidate (excerpt)

tweet_id: u64
author_id: u64
author_followers_count: Option<i32>
in_reply_to_tweet_id: Option<u64>
retweeted_tweet_id: Option<u64>
video_duration_ms: Option<i32>
subscription_author_id: Option<u64>
in_network: Option<bool>
phoenix_scores: PhoenixScores   // 22 fields
prediction_request_id: Option<Uuid>
last_scored_at_ms: i64
weighted_score: Option<f64>
score: Option<f64>              // final (post-VMRanker if enabled)

ScoredPostsQuery (excerpt)

user_id: i64
client_app_id: i32
country_code, language_code
seen_ids, served_ids
in_network_only: bool
is_bottom_request: bool
topic_ids: Vec<i64>
excluded_topic_ids: Vec<i64>
user_features: UserFeatures {
  followed_user_ids
  blocked_user_ids, muted_user_ids
  subscribed_user_ids
  muted_keywords
}
scoring_sequence: Option<ActionSequence>
retrieval_sequence: Option<ActionSequence>
subscription_level, age_in_years
push_to_home_post_id: Option<u64>
params: xai_feature_switches::Params
decider: Option<Decider>

What's Published vs. What's Not

The "incomplete" claim circulating on X is accurate. What was released is the architecture, formulas, and data flow — not a buildable or runnable system. Below is a precise audit.

In the open-source release

Full two-pipeline architecture (Phoenix + ForYou)
All 22 scored action types with field names
Weighted scoring formula and normalization branches
Author diversity decay formula and ordering
OON penalty structure with 3 branches (default / topic / new-user)
VMRanker client call site, request/response shape, DPP parameter struct
Complete filter stack with ordering
Grok transformer architecture (attention, RoPE, RMSNorm)
Phoenix mini model artifacts via Git LFS (~3 GB; 4-layer / 128-dim)
Ads blender selection logic + interface
Grox classifier class definitions (architecture only)
Full query hydrator list and feature names
14 side-effect modules (Kafka, Redis, history maintenance)

NOT in the open-source release

No Cargo.toml anywhere — Rust code cannot be built
No params.rs, so all 20+ weight constants and feature-switch keys are undefined
No util/ module — score_normalizer, candidates_util, phoenix_request
No clients/ module — VMRanker, Gizmoduck, AdIndex, Kafka, TES, ServedHistory, WhoToFollow, Prompts, PastRequestTimestamps
VMRanker remote service — value model, DPP logic, reranker math all behind gRPC
All internal xai_* crates: xai_home_mixer_proto, xai_recsys_proto, xai_vm_ranker_proto, xai_feature_switches, xai_decider, xai_x_rpc, xai_dark_traffic, xai_stringcenter, xai_profiling, xai_urt_thrift, xai_pipeline_tracing, xai_candidate_pipeline::component_library
Production Phoenix weights — only mini 4-layer / 128-dim checkpoint released
Production retrieval corpus — only 537K Sports posts from a 6-hour window
Continuous training: production retrains continuously, while the released checkpoint is a frozen snapshot
Grox internals: grox.config, grox.lm, grox.prompts, grok_sampler, monitor, strato_http — Grox cannot run
An updated top-level README: the published one is stale and still describes the 3-scorer architecture from January
All numeric thresholds: MinVideoDurationMs, NewUserAgeThresholdSecs, NEW_USER_MIN_FOLLOWING, NEGATIVE_SCORES_OFFSET, MAX_GRPC_MESSAGE_SIZE
PROMPTS_POSITION, WHO_TO_FOLLOW_POSITION constants
FOR_YOU_MAX_RESULT_SIZE (top-K count)

The honest read: xAI published the structure, the formulas, and a working-but-mini ML model — but not the configuration that would let an outside party reproduce production behavior, and not the runnable build system. This is enough for an architectural audit. It is not enough to run the algorithm.

Source

Repository: github.com/xai-org/x-algorithm
Commit: e414c171ed68266341193330bc4864bf3f3534e3
Previous commit: aaa167b3de8a674587c53545a43c90eaad360010 (Jan 20, 2026)
This update: May 15, 2026
Diff: 187 files changed, 18,263 insertions, 926 deletions
Time between updates: about 16 weeks (115 days)
Language mix: Rust (orchestration) + Python (ML + Grox classifiers)
License: Apache 2.0
