HyperGraph Regime Gate — Empirical Research

C2 Sharpe: 1.245
Ann. Alpha: 6.66%
Info Ratio: 1.36
Shuffle z-score: 4.28
FM t-stat (60d): 7.07
Capacity: $0.5-2B

One-Sentence Summary: Low-Fiedler themes (weakest internal stock synchronization) deliver Sharpe 1.25 because structurally healthy themes with distributed leadership avoid crowding-driven crashes — a tail effect confirmed by permutation tests (z = 4.28), stress-conditional analysis (+10.6 bps/day spread during stress), and theme-membership shuffles (z = 7.16).

📊 Empirical Evidence

Decile Sharpe Ratios (D1 = Lowest λ₂)

D1 consistently dominates — tail selection, not linear factor

Falsification Tests

Permutation z-scores — all significant at p<0.01

Tail-Dummy Fama-MacBeth t-statistics

Signal strengthens with horizon — structural, not noise

Transaction Cost Sensitivity

Signal survives institutional-grade execution (10-25 bps)

Rebalance Frequency

Weekly/monthly retain most signal — low-frequency structure

Liquidity Tier Distribution

225 US stocks — 64% large/mega cap

Conditional Spread: Stress vs Calm Periods

Entire D1-D10 advantage earned during market stress (+10.6 bps/day) — zero spread in calm (-0.1 bps/day)

📋 Key Results Summary

Metric	Value	Interpretation
C2 Sharpe	1.245	Bottom-20 themes by λ₂, equal weight
B&H Sharpe	0.935	All-theme equal weight benchmark
Alpha (ann.)	6.66%	After controlling for market beta
Market Beta	0.956	Near-full market exposure
Max Drawdown	-22.6%	vs B&H -23.8%
Turnover	0.66 themes/day	80% zero-change days
Fiedler Autocorr	0.996	Extremely persistent structural signal
Cross-sectional z	4.28	p<0.002, 500 permutations
Temporal z	2.78	p=0.006, date alignment matters
Membership z	7.16	Theme structure essential (30 trials)
FM t (20d)	4.63	Bottom-tail dummy, controlled
FM t (60d)	7.07	Significance grows with horizon
Mom. correlation	0.005	Orthogonal to momentum
Stress spread	+10.6 bps/day	Signal earns in stress periods
Calm spread	-0.1 bps/day	Zero spread in benign conditions
Capacity	$500M-$2B	Weekly rebal, 225 US stocks (64% large/mega cap)

❓ Investor & Researcher Q&A

I. Signal Validity & Statistical Rigor ▼

Q1 Your backtest is only 5 years. How can you trust a Sharpe of 1.25 on such a short sample? ▼

Five years is the minimum we'd accept for a structural claim. We compensate with unusually extensive falsification:

Cross-sectional shuffle (500 trials): z = 4.28, p < 0.002
Temporal shuffle (500 trials): z = 2.78, p = 0.006
Membership shuffle (30 trials): z = 7.16, p reported as 0.000 (with only 30 trials, an empirical p-value cannot resolve below about 0.03)
Subperiod stability: C2 outperforms B&H in 7 of 10 half-year windows
Year-by-year: Positive Sharpe in every calendar year including 2022 (bear)

Out-of-sample validation on non-US markets (KRX, Japan) is the most important next step, and parallel HyperGraph systems already exist.

Q2 The Fama-MacBeth regression on rank(λ₂) produces t = -0.06. No signal? ▼

There is a signal, but it is not linear. The linear FM fails because the signal is a tail (cliff) effect. With a tail-dummy FM specification, the t-statistics are:

Horizon	t-stat
5-day	2.36
20-day	4.63
60-day	7.07

The t-statistic increases with horizon — hallmark of a structural, low-frequency signal. Many commercially successful screens (Altman Z-score, Piotroski F-score) share this property.
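
To make the tail-dummy specification concrete, here is a minimal sketch of a Fama-MacBeth pass with a bottom-k dummy. It is an illustration under stated assumptions, not the code behind the table: the function name and the `lam2` and `fwd_ret` inputs are hypothetical, the controls mentioned in the results summary are omitted, and the plain standard error ignores any serial correlation from overlapping forward-return windows.

```python
import numpy as np

def tail_dummy_fama_macbeth(lam2, fwd_ret, k=20):
    """Fama-MacBeth mean slope and t-stat for a bottom-k tail dummy.

    lam2, fwd_ret: arrays of shape (n_dates, n_themes); fwd_ret[t] is the
    forward return realized after lam2[t] is observed (hypothetical inputs).
    """
    lam2 = np.asarray(lam2, dtype=float)
    fwd_ret = np.asarray(fwd_ret, dtype=float)
    slopes = []
    for lam_t, ret_t in zip(lam2, fwd_ret):
        # Dummy = 1 for the k themes with the lowest lambda_2 on this date.
        dummy = np.zeros_like(lam_t)
        dummy[np.argsort(lam_t)[:k]] = 1.0
        # Cross-sectional OLS: ret = a + b * dummy (controls omitted here).
        X = np.column_stack([np.ones_like(dummy), dummy])
        coef, *_ = np.linalg.lstsq(X, ret_t, rcond=None)
        slopes.append(coef[1])
    slopes = np.array(slopes)
    # FM t-stat: mean slope divided by its time-series standard error.
    t_stat = slopes.mean() / (slopes.std(ddof=1) / np.sqrt(len(slopes)))
    return float(slopes.mean()), float(t_stat)
```

With a single dummy and an intercept, each date's slope equals the mean forward return of the bottom-k themes minus the mean of the rest, which is why this specification can detect a tail effect that a regression on rank(λ₂) misses.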

Q3 Decile monotonicity is weak (Spearman 0.12-0.36). Not a proper factor? ▼

Correct, it is not a proper linear factor; this is a tail-selection effect. D1 consistently dominates, but D7-D9 also perform well, so returns are not monotonic across deciles. We do not claim that λ₂ is a linear pricing factor. We treat it as a threshold screen.

Q4 Could this be survivorship bias? Theme taxonomy applied retroactively. ▼

All 128 themes have 100% data coverage from Jan 2021 onward. The F3 membership-shuffle test (z = 7.16) shows the specific grouping structure matters — random stock groupings produce Sharpe 0.88 ± 0.05, while real C2 at 1.245 is 7+ sigma above.
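
As a rough consistency check on the membership-shuffle figures, using only the rounded numbers quoted above (so the result is approximate):

```latex
z \approx \frac{1.245 - 0.88}{0.05} = 7.3
```

This is in line with the reported z = 7.16; the small gap is what one would expect if the mean and standard deviation of the shuffled Sharpe ratios were rounded for display.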

Q5 Multiple hypothesis corrections? 67 rescue experiments. ▼

The 67 experiments were exploratory. The final paper tests a single, pre-specified rule (C2: bottom-20, equal weight). The cross-sectional shuffle is the definitive guard: 0 out of 500 random permutations achieved Sharpe 1.245. Empirical p < 0.002, no adjustment needed.
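
The arithmetic behind "0 out of 500" and "p < 0.002" can be sketched as follows. This assumes the common (count + 1) / (trials + 1) convention for an empirical p-value; the function name and inputs are hypothetical, and the source does not state which convention was used.

```python
from statistics import mean, stdev

def permutation_summary(real_stat, null_stats):
    """z-score and empirical one-sided p-value against a permutation null."""
    mu, sigma = mean(null_stats), stdev(null_stats)
    z = (real_stat - mu) / sigma
    # Count null draws at least as large as the real statistic; the +1
    # terms keep the estimate away from an exact zero.
    n_extreme = sum(1 for s in null_stats if s >= real_stat)
    p = (n_extreme + 1) / (len(null_stats) + 1)
    return z, p
```

Under that convention, 500 permutations with none reaching the real Sharpe give p = 1/501, which is just under 0.002. The same rule applied to a 30-trial test cannot go below 1/31, about 0.032.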

Q6 Is Fiedler eigenvalue just a proxy for volatility or correlation? ▼

No. Fiedler-momentum rank correlation: 0.005. Fiedler-volatility rank correlation: -0.114. D1 themes have 20.4% vol, close to D10's 21.4%. λ₂ captures a largely orthogonal structural dimension.
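
For readers who want to reproduce a rank correlation of this kind, a minimal dependency-free sketch of Spearman's coefficient is shown below. The helper names are hypothetical, and whether the reported figures are pooled or averaged across dates is not stated in the source.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for m in range(i, j + 1):
            r[order[m]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

Here `x` would be the per-theme λ₂ values and `y` the per-theme momentum or volatility on the same date.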

II. Economic Mechanism ▼

Q7 What's the causal story? Why should low synchronization predict higher returns? ▼

Proposed mechanism:

Theme identity creates structured co-movement (F3: z = 7.16)
λ₂ measures synchronization intensity
Low sync = distributed leadership = less crowding
Less crowding = more persistent returns

Key evidence: The entire low-vs-high λ₂ spread is earned during market stress (+10.6 bps/day), with zero spread in calm (-0.1 bps/day) — precisely the crowding/fragility prediction.
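
The stress-versus-calm comparison can be sketched as a conditional mean of the daily D1 minus D10 return. The stress flag is passed in as an input because the source does not define how stress days are identified; the function and argument names are hypothetical.

```python
def conditional_spread_bps(d1_ret, d10_ret, is_stress):
    """Mean daily D1-minus-D10 return in bps, split by a stress flag."""
    stress, calm = [], []
    for r1, r10, flag in zip(d1_ret, d10_ret, is_stress):
        # Convert the daily return difference to basis points.
        (stress if flag else calm).append((r1 - r10) * 1e4)

    def avg(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return avg(stress), avg(calm)
```

The first element of the returned pair corresponds to the stress spread and the second to the calm spread, in the same units as the +10.6 and -0.1 bps/day figures.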

Q8 D10 doesn't have worse skew or drawdown. Contradicts crowding story? ▼

Only partly. Four of the six predictions from the synchronization-fragility hypothesis are confirmed:

Prediction	Result
D10 worse CVaR	PASS (-2.89% vs -2.78%)
D1-D10 spread positive skew	PASS (+0.854)
Rising λ₂ predicts worse returns	PASS (Sharpe 0.609)
High λ₂ crashes harder in stress	PASS (+10.6 bps/day)
D10 worse skew	FAIL
D10 deeper MaxDD	FAIL

Recommended framing: structural health screen with crash-avoidance properties.

Q9 Is this just momentum in disguise? ▼

Definitively not. Rank correlation with 20-day momentum: 0.005 (effectively zero). In the double-sort, the best cell is low-λ₂ / middle-momentum, not low-λ₂ / high-momentum.

Q10 Korean-defined themes working for US stocks? ▼

The themes map to globally recognizable narratives: "Nuclear Energy," "Robotics," "Secondary Batteries," etc. The mapping includes 1,244 US equities covering all major large caps. The F3 test confirms groupings capture real structural information. Korean origin is irrelevant to the correlation structure.

III. Implementation & Capacity ▼

Q11 What's the realistic fund capacity? ▼

225 unique US stocks, 64% large/mega cap. Total daily volume: $116B.

Rebalance	Conservative	Moderate
Daily	$118M	$500M
Weekly	$589M	$2B

Binding constraint: 2 micro-liquid stocks. Excluding them raises capacity materially. At weekly rebalance, realistic capacity is $500M–$2B.
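
The source does not state its capacity model. One common approach, shown here purely as an assumption, caps each position at a fixed share of the stock's average daily dollar volume and lets the least liquid name bind. The function name, the 5% participation default, and the `trade_days` parameter are all hypothetical.

```python
def strategy_capacity(weights, adv_usd, participation=0.05, trade_days=1):
    """Largest AUM such that no position exceeds a share of traded volume.

    weights: portfolio weight per stock (summing to 1).
    adv_usd: average daily dollar volume per stock.
    participation: max fraction of daily volume traded (assumed value).
    trade_days: days available to work a full position
                (1 = daily rebalance, 5 = weekly rebalance).
    """
    per_stock = [participation * adv * trade_days / w
                 for w, adv in zip(weights, adv_usd) if w > 0]
    # The least liquid holding relative to its weight sets the limit.
    return min(per_stock)
```

In a model like this, a handful of illiquid names determines the result, which matches the note that two micro-liquid stocks are the binding constraint. It also scales linearly with the trading horizon, which is consistent with the conservative weekly figure ($589M) being about five times the daily one ($118M).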

Q12 Daily rebalancing impractical. Weekly/monthly loss? ▼

Frequency	Sharpe	vs Daily
Daily	1.245	—
Weekly	1.115	-0.130
Biweekly	1.109	-0.136
Monthly	1.146	-0.099

Monthly slightly outperforms both weekly and biweekly, consistent with a quarterly signal timescale. The cost difference between daily and weekly rebalancing is minimal because 80% of days have zero changes.

Q13 Transaction costs — does the signal survive? ▼

At institutional-grade execution costs (10-25 bps), the signal remains robust. It breaks down at 100 bps, which is unrealistic for this universe.
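
A cost-sensitivity check of this kind can be sketched by charging a per-trade cost on each day's traded fraction and recomputing the Sharpe ratio. This is a minimal sketch assuming a zero risk-free rate and 252 trading days per year; the function name and inputs are hypothetical, and the source does not describe its exact cost model.

```python
from statistics import mean, stdev

def net_sharpe(gross_daily_ret, daily_turnover, cost_bps, periods=252):
    """Annualized Sharpe after charging cost_bps on each day's traded fraction."""
    net = [r - to * cost_bps / 1e4
           for r, to in zip(gross_daily_ret, daily_turnover)]
    return mean(net) / stdev(net) * periods ** 0.5
```

Sweeping `cost_bps` over values such as 0, 10, 25, and 100 produces a sensitivity curve of the kind summarized above. Because most days have zero turnover, the cost drag is concentrated on the few days when themes rotate.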

Q14 How many positions? Operationally manageable? ▼

About 225 positions. On a typical day, 0-1 themes rotate (roughly 0-15 stock trades), and 80% of days have zero changes. The mean streak is 29.8 days. Operationally, this is trivial.

Q15 Long-only beta ~0.96. Bear market performance? ▼

In persistent bear markets it loses money. However, it was positive for full-year 2022, performed best during regime transitions (2022 H2 Sharpe 1.458), and recovered from drawdown 2.8x faster (206d vs 581d).

IV. Robustness & Sensitivity ▼

Q16 Persistence filters or inverse-λ₂ weighting improvements? ▼

Both fail. Persistence filter: k=5 gives Sharpe 1.196. Inverse weighting: 1/λ₂ gives Sharpe 1.154, and MaxDD worsens. Smoothed rank: performance degrades monotonically. The signal is already highly persistent (autocorr 0.996) and effectively binary, so the simple rule is the right implementation.

Q17 Selection depth k sensitivity? ▼

k	Sharpe	Return	MaxDD
10	1.386	28.4%	-21.7%
15	1.255	25.3%	-23.1%
20	1.245	24.5%	-22.6%
30	1.049	20.5%	-22.0%
40	0.987	19.2%	-22.3%

Monotonically decreasing — consistent with tail effect. k=20 balances signal strength and diversification.
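
The selection rule itself (bottom-k themes by λ₂, equal weight) is short enough to sketch directly. The function name is hypothetical and the λ₂ values in the usage example are made up for illustration.

```python
def select_c2(lam2_by_theme, k=20):
    """Equal-weight the k themes with the lowest lambda_2."""
    ranked = sorted(lam2_by_theme, key=lam2_by_theme.get)
    chosen = ranked[:k]
    return {theme: 1.0 / len(chosen) for theme in chosen}

# Usage with made-up lambda_2 values (not real data):
example = {"Nuclear Energy": 0.30, "Robotics": 0.10,
           "Secondary Batteries": 0.20, "Other": 0.90}
weights = select_c2(example, k=2)
```

Varying `k` in a function like this is all that the depth-sensitivity table requires; no weighting or smoothing parameter is involved.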

Q18 Different graph constructions? ▼

All adjacency variants produce Sharpe > 1.0. Positive-correlation-only baseline performs best (1.113 on quarterly grid). Directionally stable.
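
For reference, a minimal sketch of the baseline construction as described across these answers: a 60-day correlation window, positive correlations only, and the unnormalized Laplacian. Details such as how missing data and the diagonal are handled are assumptions here, and the function name is hypothetical.

```python
import numpy as np

def fiedler_value(returns, window=60):
    """lambda_2 of the unnormalized Laplacian built from positive correlations.

    returns: array (n_days, n_stocks) of daily returns for one theme.
    """
    recent = np.asarray(returns, dtype=float)[-window:]
    corr = np.corrcoef(recent, rowvar=False)
    adj = np.clip(corr, 0.0, None)        # keep positive correlations only
    np.fill_diagonal(adj, 0.0)            # no self-loops (assumption)
    lap = np.diag(adj.sum(axis=1)) - adj  # L = D - A
    eigvals = np.linalg.eigvalsh(lap)     # ascending order; L is symmetric
    return float(eigvals[1])              # second-smallest eigenvalue
```

For a theme whose stocks move in lockstep, every pairwise correlation is 1 and λ₂ equals the number of stocks, its maximum for this construction. Weakly synchronized themes sit closer to zero.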

Q19 Correlation window sensitivity (60 days)? ▼

Within-window shuffles have z = 0.48 — exact daily composition irrelevant. Quarterly-scale structural state is what matters. Robust to moderate window changes (40-90d).

Q20 Unnormalized vs normalized Laplacian? ▼

Not tested. The unnormalized Laplacian captures both synchronization strength and density, while the normalized Laplacian would remove the density effect. This remains an open empirical question.
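
To make the distinction concrete, the sketch below computes λ₂ under both definitions for a given weighted adjacency matrix. It uses the symmetric normalization L_sym = D^{-1/2} (D - A) D^{-1/2}, which is one standard choice and an assumption here; the function name is hypothetical.

```python
import numpy as np

def lambda2_both(adj):
    """Return (unnormalized, symmetric-normalized) lambda_2 for a weighted graph."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj                  # L = D - A
    # L_sym = D^{-1/2} L D^{-1/2}; assumes every node has positive degree.
    inv_sqrt = 1.0 / np.sqrt(deg)
    lap_sym = lap * inv_sqrt[:, None] * inv_sqrt[None, :]
    return (float(np.linalg.eigvalsh(lap)[1]),
            float(np.linalg.eigvalsh(lap_sym)[1]))
```

On a fully connected graph of n nodes with uniform weight w, the unnormalized value is n·w while the normalized value is n/(n-1) regardless of w. That shows how dividing by node degree strips out overall edge strength. How the two versions compare on real theme graphs is, as noted, untested.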

V. Literature & Comparison ▼

Q21 Relation to factor crowding literature? ▼

The factor-crowding literature measures crowding at the stock/factor level. We measure it at the theme level. Our contribution is the measurement tool (the Fiedler eigenvalue), not a new theory. The stress-conditional result is directly consistent with crowding-premium frameworks.

Q22 Random matrix theory connection? ▼

The connection is conceptual rather than methodological. RMT uses the full eigenvalue spectrum of the full market matrix. We use a single eigenvalue (λ₂) on within-theme subgraphs.

Q23 IR of 1.36 seems very high. Comparable numbers? ▼

High IR partly reflects low tracking error (4.88%). Most factors show 0.3-0.8 in-sample. We'd be comfortable with 50% decay live (IR ~0.7) — still commercially attractive.
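
The quoted figures are mutually consistent if the IR is computed as annualized alpha over tracking error (an assumption about the definition used):

```latex
\mathrm{IR} = \frac{\alpha}{\mathrm{TE}} = \frac{6.66\%}{4.88\%} \approx 1.36,
\qquad
0.5 \times 1.36 \approx 0.68
```

The second expression is the 50% live-decay case, which rounds to the IR of about 0.7 mentioned above.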

VI. Practical Deployment ▼

Q24 What data is needed for live implementation? ▼

Only two inputs: (1) Static theme-to-ticker mapping (~128 themes), updated quarterly at most. (2) Daily close prices for ~1,244 stocks. Computation takes ~2 minutes. No external signals, no macro data, no NLP, no alt data.

Q25 Latency requirement? T+1 pricing OK? ▼

The lag-0 vs lag-1 difference is negligible (Sharpe 1.251 vs 1.245). Even 20-day-old data produces Sharpe 1.143. There is no low-latency requirement; end-of-day pricing is sufficient.

Q26 Product structure options? ▼

Thematic Selection Overlay: Alpha signal for thematic ETF allocators
Enhanced Thematic ETF: Equal-weight 225 stocks, rebalanced weekly
Systematic Thematic Fund: λ₂ as primary allocation signal

Q27 Live monitoring metrics? ▼

Metric	Threshold	Action
Fiedler autocorrelation	< 0.95	Signal structure changing
Daily turnover	> 3 themes/day	Check data/structural break
Rolling 60d spread	Negative > 90d	Regime review
Jaccard similarity	< 0.85	Signal unstable
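
A daily check against the thresholds in this table might look like the sketch below. It covers three of the four rows; the rolling 60d spread rule needs return history and is left out. Comparing today's selection with yesterday's, and counting newly entered themes as turnover, are both assumptions, and all names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of selected themes."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def monitoring_flags(prev_themes, curr_themes, fiedler_autocorr):
    """Return the names of the checks breached today (thresholds from the table)."""
    flags = []
    if fiedler_autocorr < 0.95:
        flags.append("fiedler_autocorrelation")
    # Themes entering the selection today, used as the turnover count (assumption).
    if len(set(curr_themes) - set(prev_themes)) > 3:
        flags.append("daily_turnover")
    if jaccard(prev_themes, curr_themes) < 0.85:
        flags.append("jaccard_similarity")
    return flags
```

With 20 selected themes, swapping a single theme gives a Jaccard similarity of 19/21, about 0.90, which stays above the 0.85 threshold.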

Q28 Worst-case scenario? ▼

Worst realized: 2022 H1 (Sharpe -0.873). Worst structural: market microstructure changes where crowding evolves faster than the 60-day window detects.

VII. Extensions & Future Work ▼

Q29 Can you short the high-λ₂ themes? ▼

Promising: the falling-vs-rising λ₂ spread has Sharpe 0.609, and the low-vs-high λ₂ spread earns +10.6 bps/day in stress. However, a long-short version has not been backtested. This is a high-priority next step.

Q30 Does this work in other markets? ▼

Parallel HyperGraph systems exist for KRX and Japan. Preliminary KRX results show similar structural health effect. Formal cross-market validation is the single most important extension.

Q31-34 Alternative taxonomies, ML, macro conditioning, λ₂ change signal? ▼

Taxonomy: Requires 50+ groups with 5+ stocks each. GICS sectors (11) too few. GICS sub-industries (~160) might work.

ML: Evidence suggests no improvement. Composite scoring degrades to 0.961. Signal is a clean threshold.

Macro: C2 works best in high-return periods (Sharpe 3.306). Conditioning could help drawdowns but risks overfitting.

Change signal: Falling-λ₂ outperforms rising (spread Sharpe 0.609). Conceptually distinct from level-based C2.

VIII. Honest Limitations ▼

Q35 What are you most worried about? ▼

5-year sample: Unusual conditions (COVID, meme stocks, AI boom, rate hikes)
Retroactive taxonomy: Can't test historically removed themes
Single taxonomy: Dependent on Naver grouping quality
No OOS test: No walk-forward or live paper trading yet
Nonlinear = harder to hedge: Can't replicate with factor exposures

Q36 Why not just buy diversified stocks directly? ▼

F3 membership-shuffle: random groupings produce Sharpe 0.88 ± 0.05. Real C2 at 1.245 is 7.16 standard deviations above. The theme structure is essential — it organizes correlations into semantically coherent subgraphs.

Q37 If widely known, arbitraged away? ▼

Capacity is $500M-$2B. The quarterly timescale and low turnover resist HF arbitrage. The larger risk is signal degradation if many allocators use the same rule. At the upper end, the strategy can absorb $1-2B before capacity binds.