Return Persistence: Do Numerai's Winners Keep Winning?
The average autocorrelation of model returns is 0.15 at a 20-round lag and decays to near zero by 200 rounds. Top-100 models have a half-life of about 62 days, and rank quintile transition matrices show meaningful reversion to the middle.
Does past performance predict future results? If top models stay on top, the leaderboard reflects genuine skill. If performance mean-reverts, rank is mostly noise and chasing winners is a losing strategy.
Numerai makes this testable. Models stake real NMR, scores resolve after a fixed delay, and performance history is public.
Autocorrelation of Returns
The cleanest test of persistence: for each model, compute the correlation between its return in round N and its return in round N+lag, taken over every round N for which both values exist. Then average the result across all models with enough history to be meaningful.

At a 20-round lag, the average autocorrelation is 0.15 — positive and statistically significant across thousands of models, but modest. By 50 rounds the signal has decayed to 0.09, and by 200 rounds it is indistinguishable from zero at 0.02.
Good models tend to stay good short-term, but a strong quarter does not predict performance a year out. This matches model survival data — early performance is weakly predictive of long-term persistence, but far from deterministic.
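The lagged autocorrelation described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the chart: it assumes each model's returns arrive as a plain list with one entry per consecutive round (no gaps), and the names `pearson`, `lagged_autocorr`, `mean_autocorr` and the `min_pairs` cutoff are hypothetical choices made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / sqrt(sxx * syy)


def lagged_autocorr(returns, lag):
    """Correlation between a model's return in round N and in round N+lag."""
    if lag <= 0 or len(returns) <= lag:
        return None
    # Pair every round N with round N+lag by offsetting the same list.
    return pearson(returns[:-lag], returns[lag:])


def mean_autocorr(returns_by_model, lag, min_pairs=30):
    """Average per-model autocorrelation over models with enough history.

    min_pairs is an assumed cutoff for "enough history", not a value
    taken from the article.
    """
    values = []
    for returns in returns_by_model.values():
        if len(returns) - lag < min_pairs:
            continue
        r = lagged_autocorr(returns, lag)
        if r is not None:
            values.append(r)
    return sum(values) / len(values) if values else None
```

Calling `mean_autocorr(returns_by_model, lag)` for lags of 20, 50, and 200 would produce the kind of decay curve discussed above. Averaging per-model correlations weights every model equally regardless of stake or history length; that is one design choice among several.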
Rank Quintile Transitions
Bucket all models into rank quintiles on a given date, then check which quintile they occupy 90 days later. Perfect persistence would put 100% on the diagonal. Pure randomness would put 20% everywhere.

The diagonal runs between 28% and 35% — above the 20% baseline but nowhere near deterministic. The top quintile retains 35% of its members after 90 days. The bottom quintile shows 31% retention — poor performers also persist somewhat.
Transitions favor adjacent quintiles. A Q1 model is more likely to drop to Q2 (22%) than collapse to Q5 (10%). Extreme rank changes are rare, but the distribution pulls everyone toward the middle over time.
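A transition matrix of this kind can be built from two score snapshots. The sketch below is illustrative only and rests on assumptions not stated in the article: each snapshot is a dict mapping a model name to a score where higher means better rank, only models present on both dates are counted (so models that dropped out in between are ignored), and the function names are hypothetical.

```python
def quintiles(scores):
    """Map each model to a quintile 1..5, where 1 is the best-ranked fifth."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {model: (i * 5) // n + 1 for i, model in enumerate(ordered)}


def transition_matrix(start_scores, end_scores):
    """Row i, column j: share of start quintile i+1 that ends in quintile j+1."""
    # Assumption: restrict to models observed on both dates.
    common = set(start_scores) & set(end_scores)
    q_start = quintiles({m: start_scores[m] for m in common})
    q_end = quintiles({m: end_scores[m] for m in common})

    counts = [[0] * 5 for _ in range(5)]
    for model in common:
        counts[q_start[model] - 1][q_end[model] - 1] += 1

    # Normalise each row so it sums to 1 (empty rows stay at zero).
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

With perfect persistence every row puts all its weight on the diagonal; with purely random reshuffling of a large field each cell approaches 0.2. Dropping models that exit between the two dates tends to overstate persistence if exits are concentrated among poor performers, so a fuller treatment might add an explicit "exited" column.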
Top-100 Half-Life
For models that reach the top 100 by rank, how long do they stay? This survival curve tracks consecutive days in the top 100 from the moment of entry.

The half-life is approximately 62 days. Half of all models that enter the top 100 have dropped out within two months. The curve flattens after 180 days: the 15% that survive six months tend to be benchmark-beating veterans with large stakes.
The early steepness comes from models that spike into the top 100 during a favorable regime, only to fall back when conditions shift.
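The survival curve and half-life can be computed from the lengths of top-100 spells. The following is a simplified sketch under stated assumptions: membership is supplied as a daily sequence of booleans per model, a spell that is still running when the data ends is counted at its observed length (no correction for right-censoring), and the helper names are hypothetical.

```python
def spells(in_top100):
    """Lengths of consecutive runs of True in a daily membership sequence."""
    runs, current = [], 0
    for flag in in_top100:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        # Spell still running at the end of the data; kept at observed length.
        runs.append(current)
    return runs


def survival_curve(durations, horizon):
    """S[t] = share of spells lasting longer than t days, for t = 0..horizon."""
    n = len(durations)
    if n == 0:
        return []
    return [sum(1 for d in durations if d > t) / n for t in range(horizon + 1)]


def half_life(durations):
    """Smallest t (in days) at which at most half of the spells are still running."""
    if not durations:
        return None
    for t, share in enumerate(survival_curve(durations, max(durations))):
        if share <= 0.5:
            return t
    return None
```

Pooling `spells(...)` across all models and passing the result to `half_life` gives a single tenure estimate. A method that handles censored spells properly, such as a Kaplan-Meier estimator, would be the more careful choice when many spells are still open at the end of the sample.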
Rolling Persistence Over Tournament History
Is the tournament becoming more or less predictable? For each round, we compute the Spearman rank correlation between model returns in round N and round N+50, then track this rolling correlation over time.

Rolling persistence has drifted upward from about 0.08 in early rounds to roughly 0.14 recently — the tournament is slightly more predictable now, plausibly because marginal models have churned out and the remaining field is more stable.
Dips around rounds 750 and 1050 coincide with payout factor shifts and changes in market conditions. During regime changes, persistence collapses temporarily before re-establishing.
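The per-round persistence measure can be sketched as follows. This is an illustrative version rather than the code behind the chart: it assumes returns are stored as a dict of round number to a dict of model name to return, uses average ranks for ties, and the `min_models` cutoff and function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    n = len(xs)
    if n < 2:
        return None
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def persistence_series(returns_by_round, lag=50, min_models=30):
    """Map round N to the Spearman correlation of returns in N and N+lag."""
    series = {}
    for n in sorted(returns_by_round):
        later = returns_by_round.get(n + lag)
        if later is None:
            continue
        now = returns_by_round[n]
        # Only models with a return in both rounds can be compared.
        common = sorted(set(now) & set(later))
        if len(common) < min_models:  # assumed cutoff, not from the article
            continue
        rho = spearman([now[m] for m in common], [later[m] for m in common])
        if rho is not None:
            series[n] = rho
    return series
```

The raw series is noisy from round to round, so a trailing average over a window of rounds is a natural way to expose a drift like the one described above. The window length would be another assumption to state alongside the result.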
Takeaways
Past performance is weakly predictive, not deterministic. An autocorrelation of 0.15 at a 20-round lag means this round's returns explain only about 2% of the variance in returns 20 rounds later (0.15 squared is roughly 0.02). Enough to be real, not enough to be relied upon.
The leaderboard is more stable than random but less stable than it looks. Top-quintile models have a 35% chance of staying after 90 days — better than 20% random, but still a 65% chance of dropping.
Top-100 tenure is short. A 62-day half-life means the leaderboard you see today will look substantially different in two months. Chasing last month's winners is a weak strategy.
Persistence is slowly increasing. The maturing participant pool has made performance slightly more predictable over time, consistent with the stake-weighted age trend showing experienced models accumulating influence.
For stakers: 20 or more rounds of good performance is a mildly positive signal for the next 20 rounds, but the data do not support extrapolating one strong quarter to the year ahead. Consistent performance across market regimes is a far stronger indicator of skill than any short-term rank.
All charts on this page are generated from live tournament data tracked by nmrdash.