PM→MOO v2.2 — Variant Performance Comparison

Sharpe ranking + honest OOS estimate

Variant	Trades	Buys/Sells	PnL (hedged)	Best day	Worst day	Backtest Sharpe	Win rate (hedged)	Max DD	$/trade	OOS Sharpe estimate*
1 · Baseline v2.2 (current)	1,440	59/1381	$+2,672	$+510	$-545	2.52	51%	$-1,334	$+1.86	2.0 – 2.8
2 · C6 (pass1 winner): +VWAP_BUY +CumRet +Vol≥20K	1,635	385/1250	$+4,421	$+471	$-482	4.26	53%	$-1,061	$+2.70	3.1 – 4.2
3 · MEGA2 (pass2): C6 + Mom5m_DN BUY	1,667	823/844	$+5,045	$+518	$-482	4.84	58%	$-914	$+3.03	3.2 – 4.3
4 · S8 (pass3): MEGA2 NO CumRet, VWAP_BUY<-1.5	1,739	770/969	$+6,159	$+757	$-482	5.19	60%	$-539	$+3.54	3.2 – 4.3
5 · ULT2 (pass3 champion): S1 + RAW_BB + MAX_POS=10	854	385/469	$+4,088	$+451	$-321	5.21	61%	$-564	$+4.79	3.1 – 4.2

*OOS estimate = backtest Sharpe after the selection-bias haircut only. No slippage is modeled (an MOO order fills at the single auction clearing price). The BUY/SELL asymmetry is kept, since it is normal PM microstructure: forced sellers, retail bottom-fishing, MM inventory positioning. Haircuts: baseline ×0.95 (no selection), C6 ×0.85 (3 directional tests), MEGA2 ×0.78 (4 directional tests incl. the mirror), S8/ULT2 ×0.70 (most tuned).
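The haircut arithmetic in the footnote can be reproduced with a minimal sketch. The Sharpe values and multipliers come from the table and footnote; the function name and dictionary layout are illustrative only. The footnote does not say how the width of each OOS range was chosen, so the sketch computes only the haircut point estimate, which falls inside the tabulated range for every variant.

```python
# Backtest Sharpe and selection-bias haircut per variant (from the table and footnote).
VARIANTS = {
    "Baseline v2.2": (2.52, 0.95),
    "C6": (4.26, 0.85),
    "MEGA2": (4.84, 0.78),
    "S8": (5.19, 0.70),
    "ULT2": (5.21, 0.70),
}


def haircut_sharpe(backtest_sharpe: float, haircut: float) -> float:
    """Point estimate: backtest Sharpe scaled by the selection-bias haircut."""
    return backtest_sharpe * haircut


# Hypothetical helper output: one rounded point estimate per variant.
point_estimates = {
    name: round(haircut_sharpe(sharpe, haircut), 2)
    for name, (sharpe, haircut) in VARIANTS.items()
}

for name, estimate in point_estimates.items():
    print(f"{name:14s} {estimate:.2f}")
```

On these inputs S8 and ULT2 land close to C6 after the haircut, which matches the near-identical OOS ranges in the table.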

Cumulative P&L (hedged)

Drawdown (underwater)

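Underwater = current cum_PnL minus its running peak. The curve sits at 0 while at a peak and dips below as a drawdown develops; a lower line means a bigger drawdown. Per the MaxDD column in the table, the most tuned variants have the shallowest worst drawdowns in absolute terms (S8 $-539, ULT2 $-564), while baseline has the deepest ($-1,334).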
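The underwater definition above can be sketched in a few lines. The daily P&L numbers below are made up for illustration and are not taken from the backtest. Starting the peak at 0, so that a loss on the first day already counts as drawdown, is an assumption of this sketch.

```python
from itertools import accumulate


def underwater(daily_pnl):
    """Return cum_PnL minus its running peak for each day (always <= 0)."""
    cum_pnl = list(accumulate(daily_pnl))
    # Running peak of cumulative P&L; initial=0.0 treats the flat start as the first peak.
    peaks = list(accumulate(cum_pnl, max, initial=0.0))[1:]
    return [c - p for c, p in zip(cum_pnl, peaks)]


# Illustrative daily P&L in dollars (hypothetical values).
example_pnl = [100, -50, -80, 200, -30]
curve = underwater(example_pnl)
max_drawdown = min(curve)

print(curve)
print(max_drawdown)
```

The minimum of the curve is the MaxDD figure reported in the table for each variant.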

Honest interpretation of these numbers

What we know for sure: Baseline Sharpe 2.52 is the solid lower bound. It was the starting config before any of these 70 tests, so it carries no selection bias. The 09:05 bar fix makes the features honest (no look-ahead). With those alone, the edge is real.

What we re-discovered: Three features we dropped during dev, VWAP_BUY (v2.1), the CumRet filter (v2.2), and the PMVol threshold relaxation, have measurable positive edge. C6 combines all three: Sharpe 4.26, +1.74 vs the baseline's 2.52. After the haircut for selection bias: expected OOS ≈ 3.0-3.8.

New finding (theory-supported): Mom5m_DN BUY (mirror of working mom_5m_sell) lifts Sharpe further. MEGA2 (C6 + Mom5m_DN BUY) backtests at 4.84; S8 pushes to 5.19. The 2× BUY-side advantage is NORMAL PM microstructure: forced sellers exhausting, retail bottom-fishing, MM inventory positioning all favor PM-dump fades. Realistic OOS for MEGA2: 3.5-4.2. For S8: 3.5-4.5.

What's the real catch? Two concerns remain:

Selection from 70 variants: picking the best inflates Sharpe by ~15-30%. The search was targeted rather than random, so the haircut is smaller than a Bonferroni correction would imply.
Same window for tuning + scoring: a proper walk-forward CV with embargo is needed before live deploy, followed by 21+ days of paper trading.

What's NOT a concern: Slippage (MOO = auction, no slippage). BUY/SELL asymmetry (normal microstructure). Look-ahead in Mom5m feature (audited clean — bar 09:05 closes at 09:09:59, fully past at 09:10:00 decision).
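The look-ahead audit reduces to a timestamp check, sketched below. It assumes bars are labelled by their start time and are 5 minutes long, which is consistent with the 09:05 bar closing at 09:09:59; the function name and the calendar date are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed bar length: the 09:05 bar covers 09:05:00 up to, but not including, 09:10:00.
BAR_LENGTH = timedelta(minutes=5)


def bar_is_closed(bar_start: datetime, decision_time: datetime) -> bool:
    """True if the bar labelled bar_start has fully ended by decision_time."""
    return bar_start + BAR_LENGTH <= decision_time


# Arbitrary illustrative date; only the clock times matter here.
decision = datetime(2024, 1, 2, 9, 10, 0)
bar_0905 = datetime(2024, 1, 2, 9, 5, 0)
bar_0910 = datetime(2024, 1, 2, 9, 10, 0)

print(bar_is_closed(bar_0905, decision))  # the 09:05 bar is usable at 09:10:00
print(bar_is_closed(bar_0910, decision))  # the 09:10 bar would be look-ahead
```

Running a check of this kind over every feature's source bar is one way to confirm that nothing dated at or after the decision time enters the signal.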

Bottom line: All variants beat baseline. C6 is the safest first deploy — 3 directional changes, modest selection bias, OOS 3.0-3.8. MEGA2 is the recommended target — adds Mom5m_DN BUY discovery (theory-supported mirror), OOS 3.5-4.2. S8/ULT2 push further but are more selection-inflated. Roll out C6 first; add Mom5m_DN BUY after 21d paper validation.

Before live deploy: walk-forward CV (anchored, 5 folds with embargo) on the chosen config. 21+ days paper trade. Verify Datum 09:05 bar matches at decision time. Confirm earnings/news filter triggers. Verify auction imbalance feed agrees with signal direction.
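One way the anchored walk-forward split with embargo could be laid out is sketched below. The source specifies only "anchored, 5 folds with embargo"; the 120-day sample, the 2-day embargo, the equal block sizes, and the function name are all assumptions for illustration.

```python
def anchored_walk_forward(n_days: int, n_folds: int = 5, embargo: int = 1):
    """Yield (train_days, test_days) index ranges.

    Anchored: every training window starts at day 0 and grows fold by fold.
    Embargo: the last `embargo` days before each test block are dropped from training.
    """
    block = n_days // (n_folds + 1)  # first block is train-only, the rest are test blocks
    if block <= embargo:
        raise ValueError("not enough days for this fold count and embargo")
    for k in range(1, n_folds + 1):
        test_start = k * block
        # The final fold absorbs any leftover days.
        test_end = n_days if k == n_folds else test_start + block
        train_end = test_start - embargo
        yield range(0, train_end), range(test_start, test_end)


# Hypothetical sample: 120 trading days, 5 folds, 2-day embargo.
folds = list(anchored_walk_forward(120, n_folds=5, embargo=2))
for train, test in folds:
    print(f"train {train.start}-{train.stop - 1}  test {test.start}-{test.stop - 1}")
```

Each fold would re-tune the chosen config on its training range only and score it on the test range, so that no day is used for both tuning and scoring within a fold.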

Daily P&L distribution

Daily position count (B/S split)
