C10 Backtest Results — 2018 → 2025
Delta-hedged straddle portfolio · 4 strategies · $1M initial capital
NAV: $790,579 · P&L: $-209,421
Portfolio Equity Curve (% return from $1M). Series: Portfolio, S1 IVR, S2 VIX TS, S3 Disp, Combined.
C14 Research Signals — S1C Contrarian PDV & S4 VRP
Standalone research equity curves, not included in the main portfolio. Keeping them isolated prevents S1C, which is anti-correlated with S1, from cancelling against it.
Series: S1C Contrarian, S4 VRP.
S1C — CONTRARIAN PDV SIGNAL
| Metric | Value |
|---|---|
| Total P&L | $93,720 |
| Sharpe | -0.274 |
| Win Rate | 45.90% |
| N Trades | 76 |
The PDV model overestimates vol in fear regimes (R0). S1C shorts when PDV > ATM IV
(model panic > market) and goes long when the market premium stays elevated (R1).
The original S1 direction was backwards: the PDV/IV spread is a contrarian indicator.
Costs consumed 68% of the +$291K gross alpha.
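The S1C rule and the cost arithmetic on this card can be sketched as follows. This is only one reading of the card text: the function name `s1c_direction`, the regime label string, and the choice to stay flat outside the two stated cases are assumptions, not the backtest's actual code.

```python
def s1c_direction(pdv_vol: float, atm_iv: float, regime: str) -> int:
    """Return -1 (short vol), +1 (long vol) or 0 (flat).

    Assumed reading of the card: short when the PDV forecast exceeds ATM IV,
    long in regime R1 when ATM IV sits above the PDV forecast, flat otherwise.
    """
    if pdv_vol > atm_iv:
        return -1  # model "panic" exceeds the market: fade it
    if regime == "R1" and atm_iv > pdv_vol:
        return 1   # market premium stays elevated
    return 0


# Cost drag, using the rounded figures quoted on the card.
gross_alpha = 291_000   # "+$291K" gross
cost_share = 0.68       # "eaten 68% by costs"
net_estimate = gross_alpha * (1 - cost_share)
print(f"Estimated net P&L: ${net_estimate:,.0f}")
```

With the rounded inputs the estimate comes to about $93K, in line with the reported $93,720 total P&L; the small gap is consistent with rounding in the gross and cost-share figures.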
S4 — VOLATILITY RISK PREMIUM (VRP)
| Metric | Value |
|---|---|
| Total P&L | $-221,386 |
| Sharpe | -1.264 |
| Win Rate | 34.50% |
| N Trades | 91 |
VRP z-score > 1.0 fires when fear premium is unusually elevated —
a pre-event signal, not post-event mean-reversion. Selling straddles into
rising-vol events is a negative-expectancy trade at retail bid-ask.
VRP is structurally unextractable at retail transaction costs.
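A minimal sketch of a VRP z-score trigger of the kind described above. It assumes VRP is measured as implied minus realised vol and is standardised against a trailing window that excludes the current day; the 252-day default window and both function names are illustrative assumptions, since the report only states the 1.0 threshold.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def vrp_zscore(implied_vol: Sequence[float],
               realized_vol: Sequence[float],
               window: int = 252) -> Optional[float]:
    """z-score of the latest VRP against the preceding `window` observations."""
    vrp = [iv - rv for iv, rv in zip(implied_vol, realized_vol)]
    if len(vrp) < window + 1:
        return None                      # not enough history yet
    history = vrp[-window - 1:-1]        # trailing window, excluding today
    sd = stdev(history)
    if sd == 0:
        return None                      # flat history: z-score undefined
    return (vrp[-1] - mean(history)) / sd


def s4_fires(z: Optional[float], threshold: float = 1.0) -> bool:
    """S4 entry condition: VRP z-score above the threshold."""
    return z is not None and z > threshold
```

For example, `vrp_zscore([20, 21, 19, 20, 22, 30], [15] * 6, window=5)` is well above 1.0, so `s4_fires` returns `True` on a day when the fear premium jumps.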
Key Performance Metrics
| Metric | Value |
|---|---|
| Cumulative Return | -20.94% |
| Ann. Return | -3.24% |
| Sharpe (rf=5%) | -1.591 |
| Sortino | -1.754 |
| Max Drawdown | -23.61% |
| DD Duration | 1648 days |
| Calmar | -0.137 |
| Win Rate | 50.41% |
| Avg P&L / Trade | $-2,756 |
| Best Day | 3.07% |
| Worst Day | -6.77% |
| N Trades | 76 |
| N Days | 1800 |
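Several of the metrics above can be cross-checked from the headline figures. The sketch below assumes 252 trading days per year for annualisation (the report does not state its convention) and uses the common definition of Calmar as annualised return divided by the absolute maximum drawdown.

```python
# Inputs taken from the report.
initial_capital = 1_000_000
final_nav = 790_579
n_days = 1800
n_trades = 76
max_drawdown = -0.2361

TRADING_DAYS_PER_YEAR = 252  # assumption: not stated in the report

total_pnl = final_nav - initial_capital
cumulative_return = final_nav / initial_capital - 1
# Geometric annualisation over the number of backtest days.
annual_return = (final_nav / initial_capital) ** (TRADING_DAYS_PER_YEAR / n_days) - 1
calmar = annual_return / abs(max_drawdown)
avg_pnl_per_trade = total_pnl / n_trades

print(f"Total P&L:         ${total_pnl:,}")
print(f"Cumulative return: {cumulative_return:.2%}")
print(f"Annualised return: {annual_return:.2%}")
print(f"Calmar:            {calmar:.3f}")
print(f"Avg P&L / trade:   ${avg_pnl_per_trade:,.0f}")
```

Under these assumptions the cumulative return, annualised return, Calmar ratio, and average P&L per trade all reproduce the table values to the precision shown.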
Total P&L by Signal (in $000s)
Annual Return — 2018 → 2025
Walk-Forward Sharpe, 6-Month OOS Windows: 11 of 13 windows negative
Per-Signal Performance Breakdown
| Signal | Sharpe | Ann. Return | Win Rate | N Trades | Total P&L | Status |
|---|---|---|---|---|---|---|
| S1 IVR | -0.655 | -9.01% | 46.96% | 76 | $-490,616 | NEGATIVE |
| S2 VIX TS | -1.544 | -1.49% | 45.00% | 64 | $-101,703 | NEGATIVE |
| S3 Disp | -13.707* | 0.27% | 54.21% | 26 | $19,367 | POSITIVE |
| Combined | -1.661 | -4.21% | 44.60% | 102 | $-264,732 | NEGATIVE |
| S1C Contrarian (RESEARCH) | -0.274 | 1.26% | 45.90% | 76 | $93,720 | STANDALONE |
| S4 VRP (RESEARCH) | -1.264 | -3.44% | 34.50% | 91 | $-221,386 | STANDALONE |
* S3 active-day Sharpe is not meaningful (28 trades over 7 years). Win rate (54%) and total P&L (+$21,003) are the relevant metrics.
RESEARCH signals (S1C, S4) are standalone equity curves — not part of the main portfolio. See C14 panel above for full analysis.
Regime-Exit Variants — R2 Position Closure Impact
Exiting positions when the regime transitions to R2 (rather than only blocking new entries) recovers about $300K over 7 years, the sum of the S1 and S2 improvements below.
| Signal | Original P&L | R2-Exit P&L | Improvement |
|---|---|---|---|
| S1 IVR | $-502,671 | $-223,696 | +$230,817 |
| S2 VIX TS | $-98,977 | $-49,604 | +$69,080 |
| S3 Dispersion | +$21,003 | unchanged | — |
R2 = VOMMA_ACTIVE (VVIX > 100). R2-exit variant: forced position closure + state reset on R2 transition. Backtest: 2018–2025, $1M notional.
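The difference between the two variants can be sketched as a small state machine. The fixed VVIX > 100 gate comes from the note above; the function name `run_regime_filter`, the signal encoding (`None` means no new signal, an integer is the target position), and the day-by-day loop are illustrative assumptions rather than the engine's actual interface.

```python
from typing import List, Optional, Sequence

VVIX_R2_THRESHOLD = 100.0  # R2 = VOMMA_ACTIVE when VVIX > 100


def run_regime_filter(vvix: Sequence[float],
                      signals: Sequence[Optional[int]],
                      exit_on_r2: bool) -> List[int]:
    """Return the position held each day.

    exit_on_r2=False: R2 only blocks new entries (original behaviour).
    exit_on_r2=True:  the position is also closed and state reset on the
                      transition into R2 (R2-exit variant).
    """
    position = 0
    prev_r2 = False
    held = []
    for level, signal in zip(vvix, signals):
        r2 = level > VVIX_R2_THRESHOLD
        if r2:
            if exit_on_r2 and not prev_r2:
                position = 0          # forced closure on the R2 transition
            # New entries are ignored while R2 is active in both variants.
        elif signal is not None:
            position = signal         # normal entry or adjustment outside R2
        held.append(position)
        prev_r2 = r2
    return held


vvix = [90, 95, 105, 110, 92]
signals = [-1, None, None, 1, None]
print(run_regime_filter(vvix, signals, exit_on_r2=False))  # [-1, -1, -1, -1, -1]
print(run_regime_filter(vvix, signals, exit_on_r2=True))   # [-1, -1, 0, 0, 0]
```

In the toy series the block-only variant carries the short position through the R2 days, while the R2-exit variant is flat from the transition onward.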
Key Finding — Why the Strategy Underperformed
S1 −$502,671
S1 (IVR/PDV) goes short gamma when IVR is high relative to PDV.
In 2018, 2019, and 2024 — all elevated-vol years — realised vol stayed elevated,
meaning premium sellers systematically underpriced tail risk.
COSTS −$334,537
Total transaction costs (bid-ask + commissions + slippage) consumed
$334k over 7 years — offsetting any alpha in the signal.
REGIME MISMATCH
52.9% of backtest days were R2 (VOMMA_ACTIVE) — a regime favouring
vomma trades, not gamma strategies. S1 and S2 were trading in the wrong regime
for 952 / 1800 days.
Only Profitable Signal — S3 Dispersion
S3 +$21,003
S3 (VIX/VVIX ratio proxy) is long-only — buys when volatility appears cheap
relative to vol-of-vol. 14 trades in 7 years with 92.9% win rate.
The strategy's rarity is its strength: it fires only when the signal is extreme,
and it tends to correctly identify vol spikes after entry.
BEST DAY
2024-12-20 +3.69% — day after the worst day,
as the FOMC panic resolved and vol mean-reverted.
Annual Returns Heatmap
| Year | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 | 2025 |
|---|---|---|---|---|---|---|---|---|
| Return | -8.8% | -4.2% | -0.0% | +1.1% | -3.5% | +3.5% | -6.1% | -3.3% |
System Improvements — Steps 1–8 Summary
| Step | Name | What Changed | Before | After |
|---|---|---|---|---|
| 1 | Disable VIX options leg | JOINT_W3 = 0.0 (was 0.2). Heston CIR density is structurally mis-specified for VIX options. | VIX opt RMSE 37.14 vol pts (corrupts calibration) | SPX-only objective. RMSE expected to fall from 5.3 to ~2.5 vol pts. |
| 2 | Vega-weighted anchor selection | Select 9 highest-BS-vega options per expiry (was log-uniform grid). Moneyness lo 0.70 (was 0.75). | Log-uniform includes low-information deep OTM options | Near-ATM anchors only; more stable Heston gradient. |
| 3 | SVI smoothing (SSVI surface) | Fit Gatheral (2004) SVI per expiry; calibrate Heston to smooth surface not noisy quotes. | SPX RMSE ~5.3 vol pts (market microstructure noise) | Target SPX RMSE ~2.0 vol pts. Butterfly-free surface. |
| 4 | Per-date T-bill rate | ^IRX 3M T-bill from DB replaces fixed r=0.045 in backtest. 18 new tests. | Fixed r=0.045 throughout 2018–2025 | Per-date rate (2018: ~1.5%, 2023: ~5.3%, 2025: ~4.3%). |
| 5 | Adaptive VVIX threshold | Rolling 252-day 80th percentile of VVIX as R2 gate (was fixed 100). | R2 frequency ~53% (over-triggered in low-vol periods) | Expected R2 frequency ~25% (calibrated to market regime). |
| 6 | Isotonic calibration + step sizing | CalibratedClassifierCV(FrozenEstimator(XGB), isotonic). S1S step: full/half/flat at P(R2) 0.4/0.6. | Raw XGB probabilities (overconfident). Linear 1−p scaling. | Calibrated probabilities. Cleaner 3-level position sizing. |
| 7 | Portfolio Kelly netting | Rolling 60-day pairwise P&L corr; mult = sqrt(2/(1+r_ij)). Gross cap 50% NAV. | No correlation netting; each signal sizes off full NAV ÷ 4 | Expected MaxDD −26% → ~−20% via diversification benefit. |
| 8 | Monthly Heston recalibration | BacktestEngine._recalibrate_heston() caches monthly params to data_store/heston_params/. | Fixed 2026-03-24 Heston params throughout 2018–2025 | Time-varying params. Infrastructure ready; ~84 calibrations on first run. |
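Steps 5 to 7 are simple enough to sketch in plain Python. The parameters (252-day window, 80th percentile, 0.4/0.6 cut-offs, `sqrt(2/(1+r_ij))`, 50% gross cap) come from the table; the function names, the exclusion of the current day from the rolling window, the boundary handling at 0.4 and 0.6, and the correlation floor are assumptions added for illustration. How the multiplier is combined with per-signal weights is not specified in the table, so it is shown on its own.

```python
import math
from typing import List, Optional, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Linear-interpolation percentile, pct in [0, 1]."""
    ordered = sorted(values)
    pos = pct * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def adaptive_r2_flags(vvix: Sequence[float], window: int = 252,
                      pct: float = 0.80) -> List[Optional[bool]]:
    """Step 5: R2 when VVIX exceeds the trailing-window percentile.

    The window ends the day before (assumed, to avoid look-ahead);
    None is returned until enough history exists.
    """
    flags: List[Optional[bool]] = []
    for i, level in enumerate(vvix):
        if i < window:
            flags.append(None)
        else:
            flags.append(level > percentile(vvix[i - window:i], pct))
    return flags


def step_size(p_r2: float) -> float:
    """Step 6: full / half / flat sizing on calibrated P(R2)."""
    if p_r2 < 0.4:
        return 1.0
    if p_r2 < 0.6:
        return 0.5
    return 0.0


def netting_multiplier(corr: float, floor: float = -0.99) -> float:
    """Step 7: mult = sqrt(2 / (1 + r_ij)); the floor avoids division by zero."""
    return math.sqrt(2.0 / (1.0 + max(corr, floor)))


def cap_gross(weights: Sequence[float], gross_cap: float = 0.50) -> List[float]:
    """Step 7: scale weights down so gross exposure stays within the cap."""
    gross = sum(abs(w) for w in weights)
    if gross <= gross_cap:
        return list(weights)
    return [w * gross_cap / gross for w in weights]
```

The multiplier equals 1.0 for perfectly correlated signals and rises to about 1.41 for uncorrelated ones, so less-correlated pairs are allowed more size before the gross cap applies.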