Numerai Submission Scores: From Day 1 to Resolution

Daily Numerai submission scores are noisy early and converge late. We measure how predictive day-one MMC is, and when rankings actually stabilize.

Numerai scores do not arrive all at once. They evolve day by day as market returns accumulate over the scoring window. A model that looks brilliant on day three can look mediocre by day twenty, and the reverse happens just as often. Reading the journey correctly is the difference between patient staking and chasing noise.

This post tracks daily MMC evolution using submission-level data you can also explore on any round detail page. For background on the metrics, see How Numerai Works and MMC vs Correlation.

How Scoring Works

Each round's predictions are scored against realized market returns over a multi-week window, and updated daily as new return data lands. Early scores reflect short-term moves; later scores incorporate longer-term patterns. The final score sets your payout.

The day-one score is based on a tiny slice of the return data that will ultimately determine payout. It is a partial signal: informative, but unreliable.

The Score Journey

How does the average daily score evolve from round open to resolution?

Per-round average daily MMC from day 0 to day 27, with a bold cross-round mean hovering near 0.003

Each thin line is one round's average daily MMC across all models; the bold line is the mean across rounds. In the first week, individual round averages swing between roughly -0.04 and +0.025 MMC, while the cross-round mean stays close to 0.003. Two rounds with nearly identical final MMC can take wildly different paths to get there.
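
As a rough illustration of how the thin and bold lines could be computed, here is a minimal sketch. The data layout (a dict of rounds, each mapping a model name to its list of daily MMC values) and the function names are assumptions made for this example, not the format used by the round detail pages, and the numbers are toy values.

```python
from statistics import mean

def round_daily_mean(scores_by_model):
    """Average daily MMC across models for one round (one thin line).

    scores_by_model: {model_name: [mmc_day0, mmc_day1, ...]} (hypothetical layout).
    """
    # Only use days that every model has a score for.
    n_days = min(len(s) for s in scores_by_model.values())
    return [mean(s[d] for s in scores_by_model.values()) for d in range(n_days)]

def cross_round_mean(rounds):
    """Mean of the per-round daily averages across rounds (the bold line)."""
    per_round = [round_daily_mean(models) for models in rounds.values()]
    n_days = min(len(p) for p in per_round)
    return [mean(p[d] for p in per_round) for d in range(n_days)]

# Toy data: two rounds, two models, three scoring days each.
rounds = {
    1200: {"model_a": [0.010, 0.004, 0.002], "model_b": [-0.020, 0.000, 0.004]},
    1201: {"model_a": [0.030, 0.010, 0.005], "model_b": [0.010, 0.002, 0.001]},
}
bold_line = cross_round_mean(rounds)
```

In this toy data both rounds end at the same average (0.003) after starting from very different day-0 values, which is the pattern the chart shows at larger scale.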

Do Early Scores Matter?

The cross-model correlation between day-one and final MMC tells you how much early information is worth.

Per-round correlation between day-one and final MMC across rounds ~720 to 1215, 10-round rolling average around 0.2

The 10-round rolling average sits near 0.2, with individual rounds swinging from about -0.4 to +0.6. Day-one scores carry some signal, but the relationship is far from deterministic: in linear terms, a correlation of 0.2 means day-one MMC explains only about 4% of the variance in final MMC. Plenty of strong day-one models regress by resolution, and plenty of weak starters recover.

Predictive power also drifts across rounds. Some stretches make early signals reasonably informative; others reduce them to coin flips.
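
For readers who want to reproduce this kind of measurement on their own data, the sketch below computes a per-round correlation between day-one and final MMC across models, then smooths it over rounds. It assumes a Pearson correlation and a trailing window that is allowed to be shorter at the start of the series; the post does not specify either choice, and the data layout and function names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)

def day_one_vs_final(scores_by_model):
    """Cross-model correlation between first and last daily MMC in one round."""
    first = [s[0] for s in scores_by_model.values()]
    final = [s[-1] for s in scores_by_model.values()]
    return pearson(first, final)

def rolling_mean(values, window=10):
    """Trailing mean; the first window-1 entries average whatever is available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Toy round: three models, day-one score followed by final score.
round_scores = {"a": [0.01, 0.03], "b": [0.02, 0.01], "c": [0.03, 0.02]}
r = day_one_vs_final(round_scores)  # approximately -0.5 for this toy data

# Smoothing a short series of per-round correlations with a 2-round window.
smoothed = rolling_mean([0.1, 0.3, 0.2], window=2)
```

With only a handful of models per round the per-round estimate is itself very noisy, so a real analysis would want many models per round before reading much into any single value.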

Volatility Across Rounds

Rounds differ sharply in how much daily scores bounce around.

Median per-model score standard deviation for rounds 1119 to 1214, rising from about 0.014 to a peak near 0.040 at round 1184

Across the rounds charted (1119 to 1214), the median per-model standard deviation of daily scores climbs from about 0.014 in the 1120s to peaks above 0.033, topping out near 0.040 at round 1184. High-volatility rounds line up with turbulent market conditions, visible in the broader trends view. Patience matters most in exactly these rounds, when daily swings are largest and least informative.
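
One way to compute a round-level volatility number like this is sketched below: take each model's standard deviation over its daily scores within the round, then take the median across models. The use of the sample standard deviation, the data layout, and the function name are assumptions for illustration; the chart may have been built differently.

```python
from statistics import median, stdev

def round_volatility(scores_by_model):
    """Median across models of each model's daily-score standard deviation.

    scores_by_model: {model_name: [mmc_day0, mmc_day1, ...]} (hypothetical layout).
    Uses the sample standard deviation, so each model needs at least two days.
    """
    return median(stdev(s) for s in scores_by_model.values())

# Toy round: one flat model and two models that drift steadily upward.
round_scores = {
    "a": [0.00, 0.02, 0.04],
    "b": [0.01, 0.01, 0.01],
    "c": [0.00, 0.03, 0.06],
}
vol = round_volatility(round_scores)
```

Using the median rather than the mean keeps a few extremely erratic models from dominating the round's number.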

When Rankings Stabilize

How quickly do model rankings converge to their final order?

Spearman correlation between daily model rankings and final rankings for five recent rounds, rising from roughly 0 to 1.0 across the 28-day scoring window

Spearman correlation between daily and final rankings shows clear convergence. Rankings start noisy (some rounds dip below 0 in the first few days), then climb steadily and cluster near 0.8-0.9 by day 14. By resolution every round reaches 1.0 by construction, since the last day's ranking is the final ranking.

Absolute scores stay unreliable early, but your relative position on the leaderboard stabilizes well before the round closes.
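
The convergence curve can be sketched as follows: for each day, rank the models by that day's score and correlate those ranks with the ranks of the final scores. This version computes Spearman correlation as a Pearson correlation on average ranks; the data layout and function names are hypothetical, and the three-model example is only there to show the mechanics.

```python
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation (nan if either sequence is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def rank_convergence(scores_by_model):
    """Spearman correlation of each day's scores with the final day's scores."""
    series = list(scores_by_model.values())
    n_days = min(len(s) for s in series)
    final = [s[n_days - 1] for s in series]
    return [spearman([s[d] for s in series], final) for d in range(n_days)]

# Toy round: model "a" leads on day 0 but finishes last.
scores = {
    "a": [0.03, 0.01, 0.01],
    "b": [0.01, 0.02, 0.02],
    "c": [0.02, 0.03, 0.03],
}
curve = rank_convergence(scores)
```

The last entry of the curve is always 1.0 because the final day is compared with itself; the interesting part is how early the curve gets close to that value.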

Takeaways

Early scores are noisy. Day-one MMC correlates with final MMC around 0.2 on average. Do not overreact to early performance.

Volatility is a round property, not a model property. Market conditions drive most of the daily swing, and recent rounds have been unusually volatile.

Rankings converge faster than scores. Focus on rank rather than raw score for mid-round assessment; the round detail timeline makes this convergence visible.

Patience is a competitive advantage. Participants who restake based on daily fluctuations are reacting mostly to noise, which puts them at a likely disadvantage against those who wait for the signal to stabilize. For a broader view of how payouts actually land, see Round Economics.