| Metric | Value | Interpretation |
|---|---|---|
| C2 Sharpe | 1.245 | Bottom-20 themes by λ₂, equal weight |
| B&H Sharpe | 0.935 | All-theme equal weight benchmark |
| Alpha (ann.) | 6.66% | After controlling for market beta |
| Market Beta | 0.956 | Near-full market exposure |
| Max Drawdown | -22.6% | vs B&H -23.8% |
| Turnover | 0.66 themes/day | 80% zero-change days |
| Fiedler Autocorr | 0.996 | Extremely persistent structural signal |
| Cross-sectional z | 4.28 | p<0.002, 500 permutations |
| Temporal z | 2.78 | p=0.006, date alignment matters |
| Membership z | 7.16 | Theme structure essential (30 trials) |
| FM t (20d) | 4.63 | Bottom-tail dummy, controlled |
| FM t (60d) | 7.07 | Significance grows with horizon |
| Mom. correlation | 0.005 | Orthogonal to momentum |
| Stress spread | +10.6 bps/day | Signal earns in stress periods |
| Calm spread | -0.1 bps/day | Zero spread in benign conditions |
| Capacity | $500M-$2B | Weekly rebalance, 225 US stocks (64% large/mega cap) |
Five years is the minimum we'd accept for a structural claim. We compensate with unusually extensive falsification:
Out-of-sample validation on non-US markets (KRX, Japan) is the most important next step, and parallel HyperGraph systems already exist.
No. The linear Fama-MacBeth (FM) regression fails because the signal is genuinely nonlinear: it is a tail/cliff effect. With the correct tail-dummy FM specification:
| Horizon | t-stat |
|---|---|
| 5-day | 2.36 |
| 20-day | 4.63 |
| 60-day | 7.07 |
The t-statistic increases with horizon — hallmark of a structural, low-frequency signal. Many commercially successful screens (Altman Z-score, Piotroski F-score) share this property.
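As a rough illustration of the tail-dummy specification above, the sketch below runs one cross-sectional regression per date of forward theme returns on a bottom-k dummy, then forms a t-statistic from the time series of dummy slopes. The function name, array layout, and the omission of control variables are assumptions for illustration. The source says the reported regressions are "controlled", and overlapping multi-day horizons would normally call for an autocorrelation-robust standard error, which this sketch does not include.

```python
import numpy as np

def tail_dummy_fm_tstat(signal, fwd_ret, k=20):
    """Plain Fama-MacBeth t-stat for a bottom-k dummy (hypothetical helper).

    signal, fwd_ret: arrays of shape (n_dates, n_themes); fwd_ret[t] is the
    return realized after signal[t] is observed.
    """
    signal = np.asarray(signal, dtype=float)
    fwd_ret = np.asarray(fwd_ret, dtype=float)
    n_dates, n_themes = signal.shape
    slopes = np.empty(n_dates)
    for t in range(n_dates):
        # Dummy = 1 for the k themes with the lowest signal on date t.
        dummy = np.zeros(n_themes)
        dummy[np.argsort(signal[t])[:k]] = 1.0
        # Cross-sectional OLS of returns on [1, dummy]; keep the dummy slope.
        X = np.column_stack([np.ones(n_themes), dummy])
        beta, *_ = np.linalg.lstsq(X, fwd_ret[t], rcond=None)
        slopes[t] = beta[1]
    # Mean slope divided by its time-series standard error (no overlap fix).
    return slopes.mean() / (slopes.std(ddof=1) / np.sqrt(n_dates))
```

With only an intercept and a dummy, each date's slope is simply the bottom-k mean return minus the mean return of the remaining themes, which is why this specification can pick up a cliff effect that a linear regressor misses.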
A proper linear factor, yes. But this is a tail selection effect. D1 consistently dominates, but D7-D9 also perform well. We are not claiming λ₂ is a linear pricing factor — it is a powerful threshold screen.
All 128 themes have 100% data coverage from Jan 2021 onward. The F3 membership-shuffle test (z = 7.16) shows the specific grouping structure matters — random stock groupings produce Sharpe 0.88 ± 0.05, while real C2 at 1.245 is 7+ sigma above.
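The membership-shuffle logic can be sketched as follows. This is a minimal illustration, not the procedure actually used: `random_groupings` and `shuffle_z` are hypothetical names, and drawing each random theme without replacement at the original theme size is an assumption about how the null groupings were built.

```python
import numpy as np

def random_groupings(theme_sizes, universe, rng):
    """Draw one random ticker set per theme, preserving each theme's size."""
    universe = np.asarray(universe)
    return [rng.choice(universe, size=n, replace=False) for n in theme_sizes]

def shuffle_z(real_stat, null_stats):
    """z-score of the real statistic against the shuffled-null distribution."""
    null_stats = np.asarray(null_stats, dtype=float)
    return (real_stat - null_stats.mean()) / null_stats.std(ddof=1)
```

Plugging in the rounded figures quoted above, (1.245 - 0.88) / 0.05 gives about 7.3, in line with the reported z of 7.16; the small gap is presumably rounding in the quoted mean and spread.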
The 67 experiments were exploratory. The final paper tests a single, pre-specified rule (C2: bottom-20, equal weight). The cross-sectional shuffle is the definitive guard: 0 out of 500 random permutations achieved Sharpe 1.245. Empirical p < 0.002, no adjustment needed.
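For reference, the bound p < 0.002 follows from zero exceedances in 500 permutations if the common add-one estimator is used. The source does not say which estimator was applied, so the helper below is an assumption for illustration.

```python
def empirical_p_value(null_stats, real_stat):
    """One-sided permutation p-value with the add-one correction."""
    n_exceed = sum(1 for s in null_stats if s >= real_stat)
    return (n_exceed + 1) / (len(null_stats) + 1)
```

With 0 of 500 null Sharpe ratios at or above 1.245 this gives 1/501, just under 0.002. More permutations would be needed to resolve a smaller p-value.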
No. Fiedler-momentum rank correlation: 0.005. Fiedler-volatility rank correlation: -0.114. D1 themes have 20.4% vol, close to D10's 21.4%. λ₂ captures an orthogonal structural dimension.
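A rank correlation of this kind can be computed as below. The sketch assumes the figure is a cross-sectional Spearman correlation averaged over dates; the source does not say whether it was pooled or averaged, and the function names are hypothetical. Ties are ignored, which is reasonable for continuous-valued signals.

```python
import numpy as np

def _ranks(x):
    """Ordinal ranks 0..n-1 (ties are broken by position, not averaged)."""
    order = np.argsort(x)
    ranks = np.empty(len(x))
    ranks[order] = np.arange(len(x))
    return ranks

def mean_rank_corr(a, b):
    """Average per-date Spearman correlation of two (n_dates, n_themes) panels."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    corrs = [np.corrcoef(_ranks(a[t]), _ranks(b[t]))[0, 1] for t in range(len(a))]
    return float(np.mean(corrs))
```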
Proposed mechanism:
Key evidence: the entire low-vs-high λ₂ spread is earned during market stress (+10.6 bps/day), with essentially zero spread in calm periods (-0.1 bps/day). This is precisely the crowding/fragility prediction.
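The stress/calm split reduces to a conditional mean of the daily low-minus-high return spread. The sketch below takes the stress flag as an input because the source does not state how stress days are defined; the function name and argument layout are likewise assumptions.

```python
import numpy as np

def conditional_spread_bps(low_ret, high_ret, stress):
    """Mean daily (low - high) return spread in bps on stress and calm days."""
    low_ret = np.asarray(low_ret, dtype=float)
    high_ret = np.asarray(high_ret, dtype=float)
    stress = np.asarray(stress, dtype=bool)   # True on days flagged as stress
    spread = (low_ret - high_ret) * 1e4       # decimal returns to basis points
    return float(spread[stress].mean()), float(spread[~stress].mean())
```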
Of six predictions from the synchronization-fragility hypothesis, four were confirmed:
| Prediction | Result |
|---|---|
| D10 worse CVaR | PASS (-2.89% vs -2.78%) |
| D1-D10 spread positive skew | PASS (+0.854) |
| Rising λ₂ predicts worse returns | PASS (Sharpe 0.609) |
| High λ₂ crashes harder in stress | PASS (+10.6 bps/day) |
| D10 worse skew | FAIL |
| D10 deeper MaxDD | FAIL |
Recommended framing: structural health screen with crash-avoidance properties.
Definitively not. Rank correlation with 20-day momentum: 0.005 (effectively zero). In the double-sort, the best cell is low-λ₂ / middle-momentum, not low-λ₂ / high-momentum.
The themes map to globally recognizable narratives: "Nuclear Energy," "Robotics," "Secondary Batteries," etc. The mapping includes 1,244 US equities covering all major large caps. The F3 test confirms groupings capture real structural information. Korean origin is irrelevant to the correlation structure.
225 unique US stocks, 64% large/mega cap. Total daily volume: $116B.
| Rebalance | Conservative | Moderate |
|---|---|---|
| Daily | $118M | $500M |
| Weekly | $589M | $2B |
Binding constraint: 2 micro-liquid stocks. Excluding them raises capacity materially. At weekly rebalance, realistic capacity is $500M–$2B.
| Frequency | Sharpe | vs Daily |
|---|---|---|
| Daily | 1.245 | — |
| Weekly | 1.115 | -0.130 |
| Biweekly | 1.109 | -0.136 |
| Monthly | 1.146 | -0.099 |
Monthly slightly outperforms both weekly and biweekly, consistent with a quarterly signal timescale. Cost difference between daily and weekly is minimal (80% zero-change days).
At institutional-grade execution (10-25 bps), the signal remains robust. Breaks down at 100 bps (unrealistic for this universe).
~225 positions. On a typical day, 0-1 themes rotate (~0-15 stock trades). On 80% of days there are zero changes. Mean streak: 29.8 days. Operationally trivial.
In persistent bear markets it loses money. However, it was positive for full-year 2022, performed best during regime transitions (2022 H2 Sharpe 1.458), and recovered from drawdown 2.8x faster (206d vs 581d).
Both fail. Persistence filter: k=5 Sharpe 1.196. Inverse weighting: 1/λ₂ Sharpe 1.154, MaxDD worsens. Smoothed rank: monotonically degrades. The signal is already maximally persistent (autocorr 0.996) and binary. Simplicity is the correct implementation.
| k | Sharpe | Return | MaxDD |
|---|---|---|---|
| 10 | 1.386 | 28.4% | -21.7% |
| 15 | 1.255 | 25.3% | -23.1% |
| 20 | 1.245 | 24.5% | -22.6% |
| 30 | 1.049 | 20.5% | -22.0% |
| 40 | 0.987 | 19.2% | -22.3% |
Sharpe decreases monotonically as k grows, consistent with a tail effect. k=20 balances signal strength and diversification.
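The bottom-k rule in the table can be sketched as a simple selection on the day's λ₂ values. Equal weighting across selected themes (rather than across the underlying stocks) is an assumption here, and `bottom_k_weights` is a hypothetical helper name.

```python
import numpy as np

def bottom_k_weights(lambda2, k=20):
    """Equal weight on the k themes with the lowest lambda_2, zero elsewhere."""
    lambda2 = np.asarray(lambda2, dtype=float)
    selected = np.argsort(lambda2, kind="stable")[:k]
    weights = np.zeros(len(lambda2))
    weights[selected] = 1.0 / k
    return weights
```

Sweeping `k` over 10, 15, 20, 30 and 40 with the same function is one way to reproduce a sensitivity table of this shape.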
All adjacency variants produce Sharpe > 1.0. Positive-correlation-only baseline performs best (1.113 on quarterly grid). Directionally stable.
Within-window shuffles have z = 0.48 — exact daily composition irrelevant. Quarterly-scale structural state is what matters. Robust to moderate window changes (40-90d).
Not tested. The unnormalized Laplacian captures both synchronization strength and density. The normalized Laplacian would remove the density effect. Open empirical question.
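To make the distinction concrete, the sketch below computes λ₂ for one theme under both Laplacians. It assumes a positive-correlation-only adjacency built from a window of daily returns; the function name and input layout are illustrative, and this is not presented as the production computation.

```python
import numpy as np

def fiedler_values(returns):
    """Return (unnormalized, normalized) lambda_2 for one theme.

    returns: array of shape (n_days, n_stocks), e.g. a 60-day window.
    """
    corr = np.corrcoef(returns, rowvar=False)
    adj = np.clip(corr, 0.0, None)        # keep positive correlations only
    np.fill_diagonal(adj, 0.0)            # no self-loops
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj              # unnormalized Laplacian L = D - A
    lam2_unnorm = np.linalg.eigvalsh(lap)[1]
    # Symmetric normalized Laplacian D^{-1/2} L D^{-1/2}; isolated nodes get 0.
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    lap_norm = inv_sqrt[:, None] * lap * inv_sqrt[None, :]
    lam2_norm = np.linalg.eigvalsh(lap_norm)[1]
    return float(lam2_unnorm), float(lam2_norm)
```

For a fully synchronized theme of n stocks (all pairwise correlations equal to 1), the unnormalized λ₂ equals n while the normalized value is n/(n-1). The first grows with theme size and the second stays near 1, which is the density effect the normalization would strip out.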
Factor crowding measures crowding at the stock/factor level. We measure at the theme level. Our contribution is the measurement tool (Fiedler eigenvalue), not a new theory. The stress-conditional result is directly consistent with crowding premium frameworks.
Conceptual rather than methodological. RMT uses full eigenvalue spectrum on full market matrix. We use a single eigenvalue (λ₂) on within-theme subgraphs.
High IR partly reflects low tracking error (4.88%). Most factors show 0.3-0.8 in-sample. We'd be comfortable with 50% decay live (IR ~0.7) — still commercially attractive.
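The IR itself is not quoted in this summary. Assuming it is defined as annualized alpha (6.66%) divided by tracking error (4.88%), the arithmetic behind the 50% decay figure is roughly:

```python
alpha = 0.0666            # annualized alpha, from the summary table
tracking_error = 0.0488   # annualized tracking error quoted above

ir_backtest = alpha / tracking_error   # about 1.36 (assumed definition of IR)
ir_live = 0.5 * ir_backtest            # about 0.68 after a 50% haircut
```

This is consistent with the "IR ~0.7" quoted above and shows how much of the headline ratio comes from the small denominator.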
Only two inputs: (1) Static theme-to-ticker mapping (~128 themes), updated quarterly at most. (2) Daily close prices for ~1,244 stocks. Computation takes ~2 minutes. No external signals, no macro data, no NLP, no alt data.
Lag-0 vs lag-1 difference is negligible (1.251 vs 1.245). Even 20-day-old data produces Sharpe 1.143. Zero latency requirement. End-of-day pricing sufficient.
| Metric | Threshold | Action |
|---|---|---|
| Fiedler autocorrelation | < 0.95 | Signal structure changing |
| Daily turnover | > 3 themes/day | Check data/structural break |
| Rolling 60d spread | Negative > 90d | Regime review |
| Jaccard similarity | < 0.85 | Signal unstable |
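The thresholds in the table can be wired into a simple daily check. In this sketch the Jaccard similarity is taken between two selected-theme sets (for example, consecutive rebalance dates), and "Negative > 90d" is read as the rolling 60-day spread having been negative for more than 90 days. Both readings, and the function names, are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of theme identifiers."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def monitoring_flags(fiedler_autocorr, daily_turnover, days_spread_negative, jaccard_sim):
    """Return the action for every threshold in the table that is breached."""
    flags = []
    if fiedler_autocorr < 0.95:
        flags.append("Signal structure changing")
    if daily_turnover > 3:
        flags.append("Check data/structural break")
    if days_spread_negative > 90:
        flags.append("Regime review")
    if jaccard_sim < 0.85:
        flags.append("Signal unstable")
    return flags
```

For example, swapping one theme out of a 20-theme portfolio gives a Jaccard similarity of 19/21, about 0.905, which stays above the 0.85 threshold.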
Worst realized: 2022 H1 (Sharpe -0.873). Worst structural: market microstructure changes where crowding evolves faster than the 60-day window detects.
Promising (spread Sharpe 0.609 for level signal, +10.6 bps/day stress advantage). But not backtested as long-short. High priority next step.
Parallel HyperGraph systems exist for KRX and Japan. Preliminary KRX results show similar structural health effect. Formal cross-market validation is the single most important extension.
Taxonomy: Requires 50+ groups with 5+ stocks each. GICS sectors (11) too few. GICS sub-industries (~160) might work.
ML: Evidence suggests no improvement. Composite scoring degrades to 0.961. Signal is a clean threshold.
Macro: C2 works best in high-return periods (Sharpe 3.306). Conditioning could help drawdowns but risks overfitting.
Change signal: Falling-λ₂ outperforms rising (spread Sharpe 0.609). Conceptually distinct from level-based C2.
F3 membership-shuffle: random groupings produce Sharpe 0.88 ± 0.05. Real C2 at 1.245 is 7.16 standard deviations above. The theme structure is essential — it organizes correlations into semantically coherent subgraphs.
Capacity: $500M-$2B. Quarterly timescale and low turnover resist HF arbitrage. Larger risk is signal degradation if many allocators use the same rule. Can absorb $1-2B before capacity binds.