What Was Broken
The first extended run used a dataset in which the external BTC spot feed failed to load. That left the key BTC distance and volatility features effectively zero, so the extended retry30/retry15 comparison did not have feature parity with the old M144 run.
How It Was Fixed
I built an analysis-only spot backfill DB and rebuilt the May 2-May 7 BTC dataset with --external-db. Historical Coinbase ticks are used where available; the May 4-May 8 tail is filled from Coinbase 60s candles. The production history DB was not mutated.
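This report does not say how the 60s candles were turned into ticks. As a rough illustration only, the sketch below shows one hypothetical scheme: each candle becomes four synthetic ticks in open, high, low, close order. The `Candle` type, the `candle_to_synthetic_ticks` helper, the tick ordering, and the timestamp offsets are all assumptions for illustration, not the backfill's actual logic.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candle:
    start_ts: int  # candle open time, Unix seconds
    open: float
    high: float
    low: float
    close: float


def candle_to_synthetic_ticks(c: Candle, width_s: int = 60) -> List[Tuple[int, float]]:
    """Expand one candle into four (timestamp, price) ticks.

    Hypothetical scheme: the true order of high and low inside the
    candle is unknown, so open -> high -> low -> close is an assumption.
    """
    offsets = (0, width_s // 3, 2 * width_s // 3, width_s - 1)
    prices = (c.open, c.high, c.low, c.close)
    return [(c.start_ts + off, price) for off, price in zip(offsets, prices)]


# Toy candle, not real market data.
ticks = candle_to_synthetic_ticks(Candle(1000, 100.0, 105.0, 95.0, 102.0))
print(ticks)
```

Whatever scheme is used, the intra-minute path is invented, so any feature with a horizon close to 60 seconds is only approximate on the backfilled tail.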
Feature Parity Check
| Item | Broken Extended Dataset | Fixedspot Dataset | Read |
|---|---|---|---|
| External spot rows loaded | BTC=0, ETH=0, SOL=0 | BTC=1,309,046; ETH=759,621; SOL=397,489 | Spot feed now exists for feature generation. |
| btc_dist_to_strike | all zero | mean -2.77, min -1,151.93, max 948.84 | Distance-to-strike signal restored. |
| btc_vol_60s | all zero | mean 0.0001266, near-zero rate 0.01% | Short-horizon volatility restored. |
| btc_dist_normalized_vol | all zero | mean -0.3565, zero-rate 0.00% | Sigma-style distance signal restored. |
Caveat: the fixedspot tail uses synthetic ticks derived from 60-second candles, so this is analysis-grade parity, not tick-exact replay parity for the missing tail.
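A parity check like the table above can be automated by flagging feature columns that are almost entirely zero before any training run. The sketch below is a minimal stdlib version. The helper names, the `eps` and `max_zero_rate` thresholds, and the toy values are assumptions; only the feature names come from this report.

```python
def zero_rate(values, eps=1e-12):
    """Fraction of values whose magnitude is at most eps."""
    values = list(values)
    if not values:
        raise ValueError("empty feature column")
    return sum(1 for v in values if abs(v) <= eps) / len(values)


def dead_features(columns, max_zero_rate=0.99):
    """Names of feature columns whose zero rate reaches max_zero_rate."""
    return [name for name, vals in columns.items()
            if zero_rate(vals) >= max_zero_rate]


# Toy columns (illustrative values only, not taken from the datasets).
broken = {
    "btc_dist_to_strike": [0.0, 0.0, 0.0, 0.0],
    "btc_vol_60s": [0.0, 0.0, 0.0, 0.0],
}
fixed = {
    "btc_dist_to_strike": [-2.5, 10.0, -40.0, 3.0],
    "btc_vol_60s": [1e-4, 2e-4, 0.0, 1.5e-4],
}

print(dead_features(broken))  # both columns are flagged
print(dead_features(fixed))   # nothing is flagged
```

Running such a check as a gate on the dataset build would have caught the missing spot feed before the first extended run was trained.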
Training Sample
| Sample | Rows | Flips | Flip Rate | Meaning |
|---|---|---|---|---|
| Full current-style train | 63,342 | 1,330 | 2.10% | All eligible current-style rows before cutoff. |
| Old M144 retry30 train | 6,923 | 316 | 4.56% | Original M144 sequential attempt sample. |
| Fixedspot E30 retry30 train | 7,860 | 366 | 4.66% | Extended to May 7 with fixed spot features. |
| Fixedspot E15 retry15 train | 8,883 | 507 | 5.71% | More frequent retry attempts; more adverse rows. |
Strict OOS: May 7 UTC Sanity Test
| Model | AUC | dAUC | PR AUC | dPR | Orders | Static PnL/Contract | dPnL | Seq15 PnL | Seq30 PnL | Seq60 PnL |
|---|---|---|---|---|---|---|---|---|---|---|
| Current primary | 0.539 | 0.000 | 0.068 | 0.000 | 278 | -3.067 | 0.000 | 0.898 | 0.788 | -0.314 |
| Old M144 retry30 | 0.721 | +0.182 | 0.083 | +0.015 | 148 | 2.916 | +5.983 | 1.018 | 0.876 | 0.625 |
| Fixedspot E30 retry30 | 0.626 | +0.088 | 0.064 | -0.004 | 167 | 2.919 | +5.986 | 0.809 | 0.850 | 0.640 |
| Fixedspot E15 retry15 | 0.560 | +0.021 | 0.056 | -0.011 | 131 | 3.899 | +6.966 | 2.270 | 1.269 | 1.059 |
This OOS slice is only 405 rows and 19 flips (a 4.69% flip rate). Delta columns (dAUC, dPR, dPnL) are measured against Current primary. The slice is useful as a sanity check after the feature-parity fix, but the deployment read comes from the live trace roleplay below.
Live Trace Roleplay: Current-Fire Veto Effect
Filter Delta is the actual-dollar impact of the veto on trades that current would already fire, i.e. Kept PnL minus Current PnL. A good veto should block losing filled trades without sacrificing too much winner PnL.
| Model | Period | Current W/L | Current PnL | Kept W/L | Kept PnL | Blocked W/L | Filter Delta |
|---|---|---|---|---|---|---|---|
| Old M144 retry30 | 120h | 267 / 15 | -208.73 | 130 / 0 | 384.90 | 137 / 15 | +593.64 |
| Old M144 retry30 | 144h | 321 / 21 | -298.93 | 152 / 1 | 403.19 | 169 / 20 | +702.12 |
| Fixedspot E30 retry30 | 120h | 267 / 15 | -208.73 | 195 / 12 | -351.82 | 72 / 3 | -143.09 |
| Fixedspot E30 retry30 | 144h | 321 / 21 | -298.93 | 237 / 18 | -498.85 | 84 / 3 | -199.92 |
| Fixedspot E15 retry15 | 120h | 267 / 15 | -208.73 | 188 / 13 | -314.85 | 79 / 2 | -106.12 |
| Fixedspot E15 retry15 | 144h | 321 / 21 | -298.93 | 228 / 19 | -451.81 | 93 / 2 | -152.88 |
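
The veto table can be sanity-checked mechanically. Kept and Blocked W/L should sum to Current W/L, and Filter Delta should equal Kept PnL minus Current PnL up to rounding in the reported cents. The sketch below copies the table rows and checks both identities. The type and function names and the 0.015 tolerance are illustrative choices.

```python
from typing import NamedTuple, Tuple


class VetoRow(NamedTuple):
    model: str
    period: str
    current_wl: Tuple[int, int]
    current_pnl: float
    kept_wl: Tuple[int, int]
    kept_pnl: float
    blocked_wl: Tuple[int, int]
    filter_delta: float


# Values copied from the Current-Fire Veto Effect table.
VETO_ROWS = [
    VetoRow("Old M144 retry30", "120h", (267, 15), -208.73, (130, 0), 384.90, (137, 15), 593.64),
    VetoRow("Old M144 retry30", "144h", (321, 21), -298.93, (152, 1), 403.19, (169, 20), 702.12),
    VetoRow("Fixedspot E30 retry30", "120h", (267, 15), -208.73, (195, 12), -351.82, (72, 3), -143.09),
    VetoRow("Fixedspot E30 retry30", "144h", (321, 21), -298.93, (237, 18), -498.85, (84, 3), -199.92),
    VetoRow("Fixedspot E15 retry15", "120h", (267, 15), -208.73, (188, 13), -314.85, (79, 2), -106.12),
    VetoRow("Fixedspot E15 retry15", "144h", (321, 21), -298.93, (228, 19), -451.81, (93, 2), -152.88),
]


def is_consistent(r: VetoRow, tol: float = 0.015) -> bool:
    """Check W/L partition and Filter Delta = Kept PnL - Current PnL."""
    wins = r.kept_wl[0] + r.blocked_wl[0]
    losses = r.kept_wl[1] + r.blocked_wl[1]
    counts_ok = (wins, losses) == r.current_wl
    delta_ok = abs((r.kept_pnl - r.current_pnl) - r.filter_delta) <= tol
    return counts_ok and delta_ok


def loss_block_rate(r: VetoRow) -> float:
    """Share of current-fire losing trades that the veto blocks."""
    return r.blocked_wl[1] / r.current_wl[1]


for r in VETO_ROWS:
    print(f"{r.model} {r.period}: consistent={is_consistent(r)}, "
          f"losses blocked={r.blocked_wl[1]}/{r.current_wl[1]} "
          f"({loss_block_rate(r):.0%})")
```

The tolerance absorbs one-cent differences that arise because the table values were rounded independently.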
Live Trace Roleplay: Primary/Add-On View
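Add-on OB PnL is the PnL of the current-reject universe replayed through orderbook fill logic, counting only trades the tested model would newly fire. Primary PnL treats the tested model as primary: filter-kept current-fire PnL plus add-on OB PnL. Primary Delta is Primary PnL minus Current PnL from the veto table above.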
| Model | Period | Model AUC | PR AUC | Add-on W/L | Add-on OB PnL | Primary W/L | Primary PnL | Primary Delta |
|---|---|---|---|---|---|---|---|---|
| Old M144 retry30 | 120h | 0.871 | 0.053 | 19 / 1 | -114.26 | 149 / 1 | 270.64 | +479.38 |
| Old M144 retry30 | 144h | 0.851 | 0.057 | 23 / 3 | -423.17 | 175 / 4 | -19.99 | +278.94 |
| Fixedspot E30 retry30 | 120h | 0.849 | 0.047 | 25 / 3 | -174.79 | 220 / 15 | -526.62 | -317.88 |
| Fixedspot E30 retry30 | 144h | 0.835 | 0.051 | 31 / 7 | -499.05 | 268 / 25 | -997.90 | -698.97 |
| Fixedspot E15 retry15 | 120h | 0.820 | 0.039 | 27 / 2 | +389.31 | 215 / 15 | 74.46 | +283.19 |
| Fixedspot E15 retry15 | 144h | 0.821 | 0.050 | 34 / 6 | +243.72 | 262 / 25 | -208.09 | +90.84 |
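
The Primary PnL and Primary Delta columns follow from two sums: kept PnL plus add-on OB PnL, then that total minus Current PnL. The sketch below recomputes both from the two roleplay tables and compares them with the reported values. The function names and the 0.015 rounding tolerance are illustrative.

```python
# (model, period, current_pnl, kept_pnl, addon_ob_pnl, primary_pnl, primary_delta)
# Values copied from the two Live Trace Roleplay tables.
PRIMARY_ROWS = [
    ("Old M144 retry30", "120h", -208.73, 384.90, -114.26, 270.64, 479.38),
    ("Old M144 retry30", "144h", -298.93, 403.19, -423.17, -19.99, 278.94),
    ("Fixedspot E30 retry30", "120h", -208.73, -351.82, -174.79, -526.62, -317.88),
    ("Fixedspot E30 retry30", "144h", -298.93, -498.85, -499.05, -997.90, -698.97),
    ("Fixedspot E15 retry15", "120h", -208.73, -314.85, 389.31, 74.46, 283.19),
    ("Fixedspot E15 retry15", "144h", -298.93, -451.81, 243.72, -208.09, 90.84),
]


def recompute_primary(current_pnl, kept_pnl, addon_ob_pnl):
    """Return (primary_pnl, primary_delta) from the component PnLs."""
    primary_pnl = kept_pnl + addon_ob_pnl
    return primary_pnl, primary_pnl - current_pnl


def matches_report(row, tol=0.015):
    """True if the recomputed values agree with the reported columns."""
    _, _, current, kept, addon, reported_pnl, reported_delta = row
    pnl, delta = recompute_primary(current, kept, addon)
    return abs(pnl - reported_pnl) <= tol and abs(delta - reported_delta) <= tol


for row in PRIMARY_ROWS:
    pnl, delta = recompute_primary(row[2], row[3], row[4])
    print(f"{row[0]} {row[1]}: primary={pnl:.2f}, delta={delta:+.2f}, "
          f"matches={matches_report(row)}")
```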
Readout
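The feature-parity bug was real, and fixing it improves the extended models versus the broken extended run. But after rerunning with parity, old M144 is still the strongest veto: it blocks 15/15 losses in 120h and 20/21 losses in 144h, with a positive Filter Delta in both windows. Fixedspot E30 and E15 do not block enough losses as veto policies: E30 blocks 3/15 and 3/21, E15 blocks 2/15 and 2/21, and both have a negative Filter Delta in both windows.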
Recommendation
Do not replace the existing M144 veto with fixedspot E30 or E15. Keep old M144 for veto validation. Fixedspot E15 is worth researching as a primary/add-on signal because it has positive add-on OB PnL in both the 120h and 144h windows (+389.31 and +243.72), but it should not be deployed as a veto on the basis of this run.
Artifacts
Fixedspot DB: data_collection/analysis/m144_attempt15_extended_20260513/spot_backfill_20260502_0508.db. Fixed dataset: dataset_btc_5s_one_strike_20260502_0508_fixedspot.pkl. Models: fixedspot/E30_seq_attempt_retry30_train_to_20260507.pkl and fixedspot/E15_seq_attempt_retry15_train_to_20260507.pkl. Comparison CSV: fixedspot/live_trace_fixedspot_comparison.csv.