Sample engagement · Verdict: KILL

Capital-Safety Scan: Rysk RFQ option markdowns — verdict, and the math behind it


Verdict

Scan target: Rysk RFQ option “markdowns” (clearing prices below Deribit fair value). The project is a read-only measurement asking whether a market-neutral, Deribit-hedged capture is worth measuring at full scale.
Validation date: 2026-06-23
Verdict: KILL (structural, for a read-only operator)

One-line reason: the observed “markdown” is the maker’s own foregone spread, so a read-only operator captures none of it; the residual mispricing is sign-ambiguous noise centered near zero; and the round-trip hedge cost exceeds any plausible edge.

Scope of this verdict: this is a validation-pass verdict, produced before committing days of runtime to the full measurement. The governing question was whether the measurement would produce a trustworthy answer, and the answer terminated the project. The full evidence-backed run was never performed. The proposed cheap closure step, a confidence-bounded repricing of the ~560 existing priced trades, was recommended but not executed. Everything below reports what the validation pass itself observed.

Three kill-criteria fired:

  1. Capturability. All 293 live BTC/ETH trades observed were taker-buys against exactly 2 maker counterparties. The edge, where it exists, is earned by the maker. A read-only operator earns none of it. (A taker who buys below fair value and hedges could in principle capture it. That path fails on criteria 2 and 3 rather than on this one.)
  2. Edge size. The live correlation between Deribit’s own mark and the Rysk cleared price is 0.979. The markdown is sign-ambiguous noise centered near zero.
  3. Hedge economics. Live Deribit option half-spread is approximately 320bp median / 760bp mean of premium. Any markdown smaller than a few hundred bp is net-negative after the hedge.

A note on provenance: I ran this scan on my own project, a read-only measurement thesis about this venue, using the same fixed-scope method I offer to clients. The method returned KILL before days of runtime were committed. I publish the result because the method’s credibility rests on its willingness to return KILL when KILL is the answer, including against my own work.


What the Scan asked

A Capital-Safety Scan works through 3–5 kill-criteria: falsifiable questions where a NO answer terminates the thesis. The job of the scan is then to make each one fail. Whether the venue is interesting is beside the point. This engagement was a validation pass, run before committing days of runtime to the full measurement; its job was to establish whether the measurement would produce a trustworthy answer at all. For this target, the criteria were:

1. Capturability: who actually earns the observed edge? An on-chain price discrepancy is only an edge for you if your seat at the table can collect it. If every observed trade clears against a small set of professional makers, the “discount” in the print is simply the maker’s quoted spread, and no free money is left for a third party. Falsification test: classify every live fill by side and counterparty. If the flow is one-sided taker flow against a concentrated maker set, the criterion fails.

2. Edge size: does the mispricing survive independent repricing? Screenshots of individual “cheap” fills prove nothing. The test is to reprice every observed trade with an independent model, from independent inputs, and look at the full two-sided distribution, covering both the trades that cleared below fair value and the trades that cleared above it. If the distribution is centered near zero, the apparent edge is selection on noise.

3. Hedge economics: does the round-trip hedge cost exceed the edge? A market-neutral capture requires hedging every fill on a reference venue. The hedge is not free: the operator crosses the reference venue’s spread on entry and again on exit. If the measured half-spread on the hedge leg exceeds the measured markdown, the strategy loses money on every trade even when the markdown is real. The spread here is measured from live order books rather than assumed.

Each criterion was evaluated adversarially: the working assumption throughout is that the thesis is wrong and the burden of proof is on the data. I run these passes with heavy AI orchestration and independent recomputation at every step; every number still ships with a source you can check.


Findings: the three kill-criteria

1. Capturability: the markdown belongs to the maker

Observed: all 293 live BTC/ETH trades in the observation window were taker-buys, and they cleared against exactly 2 maker counterparties. The cleared price in every observed trade is the maker’s ask.

Why it moves real money: when a maker’s ask sits below Deribit fair value, the positive “markdown” in the data is the spread the maker chose to forgo in order to win the fill. It is income to the maker, by construction. A read-only operator observing the trade does not collect it, and no refinement of the data changes who is on which side of the trade. That is why, for the project as scoped, this criterion cannot be fixed by better measurement. A taker buying below fair value and hedging on Deribit could in principle capture a real markdown; that path fails on the next two criteria (the residual edge is approximately zero, and the hedge consumes more than any plausible markdown) rather than on this one. Monetizing the flow from the other side requires becoming the maker: capital at risk, a live RFQ quoting and hedging system, and winning fills against two entrenched incumbents. That is a fundamentally different project from the one scanned.
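
The falsification test for this criterion is mechanical, and a minimal sketch of it follows. It assumes each fill has already been reduced to a record with two hypothetical fields, taker_side and maker; those names, and the cut-off of 3 makers used to call a maker set "concentrated", are illustrative assumptions rather than the venue's schema or this scan's tooling.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Fill(NamedTuple):
    # Hypothetical record layout; the field names are assumptions for illustration.
    taker_side: str  # "buy" or "sell", from the taker's point of view
    maker: str       # identifier of the maker counterparty


def capturability_fails(fills: Iterable[Fill], max_makers: int = 3) -> bool:
    """True when the flow is one-sided taker flow against a concentrated maker set."""
    fills = list(fills)
    if not fills:
        raise ValueError("no fills to classify")
    sides = Counter(f.taker_side for f in fills)
    makers = {f.maker for f in fills}
    one_sided = len(sides) == 1               # every fill is on the same taker side
    concentrated = len(makers) <= max_makers  # assumed cut-off, not a measured one
    return one_sided and concentrated


# Toy input shaped like the observed window: 293 taker-buys against 2 makers.
sample = [Fill("buy", "maker_a" if i % 2 else "maker_b") for i in range(293)]
```

On a window shaped like the one observed here the function returns True, meaning the criterion fails. A single taker-sell in the window would be enough to make the flow two-sided under this rule.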

2. Edge size: the mispricing does not survive independent repricing

Measured: each observed fill was repriced with an independent Black–Scholes implementation using Deribit mark_iv on the matching instrument as the sole volatility input. The pricer was validated first; it reproduces Deribit’s own mark to within ~1–2%. Separately, the live correlation between Deribit’s own mark and the Rysk cleared price is 0.979.

Why it moves real money: the makers quote extremely tight to Deribit. The residual markdown distribution is two-sided, with no stable sign, and centered near zero; if anything, Rysk clears slightly rich to Deribit. There is no fat systematic mispricing to harvest. Any strategy built on the assumption that this venue leaks hundreds of basis points per trade is selecting the positive tail of a noise distribution, which is selection bias.
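
For readers who want to see the shape of the repricing, here is a minimal sketch under stated assumptions: a textbook Black–Scholes formula with a zero interest rate by default, volatility passed as an annualised decimal, and the markdown normalised by fair value. The function names and the 2% validation tolerance are hypothetical choices for illustration. This is not the scan's own pricer, and it does not address the spot-versus-forward issue disclosed later in this document.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(spot: float, strike: float, t_years: float, sigma: float,
             is_call: bool, rate: float = 0.0) -> float:
    """Textbook Black-Scholes value; sigma is an annualised decimal (0.60 = 60%)."""
    if t_years <= 0 or sigma <= 0:
        raise ValueError("t_years and sigma must be positive")
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * t_years) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    disc = math.exp(-rate * t_years)
    if is_call:
        return spot * norm_cdf(d1) - strike * disc * norm_cdf(d2)
    return strike * disc * norm_cdf(-d2) - spot * norm_cdf(-d1)


def markdown_bps(fair_value: float, cleared_price: float) -> float:
    """(fair - cleared) / fair in basis points; positive means cleared below fair value.

    The fair-value denominator is an assumed convention; hold whichever one you pick fixed.
    """
    return 1e4 * (fair_value - cleared_price) / fair_value


def pricer_validated(model_price: float, venue_mark: float, tol: float = 0.02) -> bool:
    """True when the model reproduces the reference mark within a relative tolerance."""
    return abs(model_price - venue_mark) <= tol * venue_mark
```

The validation helper exists so that the pricer is checked against the reference mark before any markdown is computed; a pricer that cannot reproduce the mark should not be used to judge the cleared price.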

3. Hedge economics: the round trip costs more than the edge

Measured: the live Deribit option half-spread on the relevant instruments is approximately 320bp median / 760bp mean of premium.

Why it moves real money: a market-neutral capture must cross that spread to establish the hedge and cross it again to unwind. Even granting, counterfactually, a genuine markdown of a few hundred basis points, the hedge leg alone consumes more than that on the median trade, and more still on the mean, given the measured skew of the spread distribution (320bp median vs 760bp mean). Net of hedge, the strategy’s expected value per trade is zero to negative.
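
The arithmetic behind that statement is short enough to write out. In the sketch below the 300bp markdown is a hypothetical stand-in for "a few hundred basis points", granted only for the sake of argument; the half-spreads are the measured median and mean quoted above. The function name is an illustrative assumption.

```python
def net_edge_bps(markdown_bps: float, half_spread_bps: float, crossings: int = 2) -> float:
    """Markdown left after paying the hedge half-spread once per crossing."""
    return markdown_bps - crossings * half_spread_bps


HYPOTHETICAL_MARKDOWN_BPS = 300.0  # counterfactual, not a measured value

median_case = net_edge_bps(HYPOTHETICAL_MARKDOWN_BPS, 320.0)                    # round trip at the median half-spread
mean_case = net_edge_bps(HYPOTHETICAL_MARKDOWN_BPS, 760.0)                      # round trip at the mean half-spread
single_crossing = net_edge_bps(HYPOTHETICAL_MARKDOWN_BPS, 320.0, crossings=1)   # entry only
```

Under these inputs the round trip nets -340bp at the median half-spread and -1220bp at the mean. Even a single crossing at the median leaves -20bp, so the sign does not depend on whether the hedge is unwound.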

For completeness: the underlying data quality was adequate for the measurement. 96% of observed Rysk trades were priceable against a matching Deribit instrument, at roughly 42 BTC/ETH trades per day. The KILL is not a data-availability artifact: the measurement had the data it needed and found no edge.


What the method caught in its own tooling

This scan’s internal tooling contained four flaws. All four were caught by the adversarial review passes that are part of the standard method, and all four are disclosed here, since a diligence product earns trust in part by publishing its own errors.

1. The revenue panel was rigged by construction. The first version of the revenue estimate summed only the positive markdowns, multiplied them by an adjustable fill-share assumption, and reported the result gross of hedge costs. With a true edge near zero and wide noise, selecting only the positive tail of a symmetric distribution produces a strongly positive number automatically; the panel was printing fake revenue for a dead venue. This is the single most common failure mode in crypto strategy pitches, and the scan’s own tooling reproduced it. It was caught and replaced with an honest statistic: the two-sided mean of all markdowns, minus the 320bp hedge half-spread. That number is approximately zero to negative.

2. A pricing bias was found and logged. The fair-value engine priced options off the Deribit index price (spot) rather than the forward. This injects a coherent tilt of roughly 400–520bp into the markdown estimate on out-of-the-money trades. It is a real bug. It is logged and, as of this writing, unfixed. Fixing it (pricing off the forward, or subtracting the basis) matters for a fully defensible confidence-bounded repricing, which is exactly the closure step that was proposed and not performed. No direction is claimed for the tilt here; trade-level numbers on OTM instruments simply carry it.

3. The confidence interval was flagged as over-confident. The confidence-interval calculation used a standard iid interval (1.64·sd/√N). But the observed flow comes from only 2 makers across 4 expiries; the trades are heavily clustered, and the effective sample size is far below the nominal count. (The nominal count here is the ~560 existing priced trades on the venue that the proposed confidence-bounded repricing would cover, a larger population than the 293 live BTC/ETH fills observed in this window.) A defensible interval must be cluster-robust, keyed on (maker, expiry, strike). No such interval was computed in this pass; the iid interval is reported here as what it is: too narrow.

4. The distribution’s width was polling drift rather than signal. Trade and mark data were polled on a 600-second cycle. Non-simultaneity between the Rysk fill and the Deribit mark leaves a random residual of roughly 150–350bp per trade, so the width of the markdown distribution reflects polling drift rather than any property of the venue. Early reads that treated trade-level markdowns as meaningful were downgraded accordingly: only the mean of the distribution survives the noise.
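
Flaw 1 is easy to reproduce on toy data, which is part of why it is so common. The sketch below builds a perfectly symmetric set of markdowns with a mean of exactly zero, then compares the rigged statistic (positive tail only, scaled by a fill share, gross of hedge) with the honest one (two-sided mean minus the hedge half-spread). The toy values and the 0.5 fill share are assumptions chosen for illustration and are not data from this scan.

```python
from statistics import fmean
from typing import Sequence


def rigged_panel(markdowns_bps: Sequence[float], fill_share: float) -> float:
    """The flawed statistic: sum of positive markdowns only, gross of hedge cost."""
    return fill_share * sum(m for m in markdowns_bps if m > 0)


def honest_statistic(markdowns_bps: Sequence[float], hedge_half_spread_bps: float) -> float:
    """Two-sided mean of all markdowns, net of the hedge half-spread."""
    return fmean(markdowns_bps) - hedge_half_spread_bps


# Symmetric toy distribution: -350, -300, ..., 0, ..., 300, 350 bp. True edge is zero.
toy_markdowns = [float(x) for x in range(-350, 351, 50)]

rigged = rigged_panel(toy_markdowns, fill_share=0.5)
honest = honest_statistic(toy_markdowns, hedge_half_spread_bps=320.0)
```

On this input the rigged panel reports +700bp of cumulative "revenue" from a distribution whose true edge is zero, while the honest statistic is -320bp per trade. Widening the noise makes the rigged number larger, which is the opposite of what a trustworthy estimate should do.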

Disclosure of the tooling’s own flaws is part of the standard deliverable.


Limitations and unknowns (fail-closed)

Everything below is stated so that the reader cannot mistake the boundaries of this measurement. Where something was not measured, it is UNKNOWN — not assumed zero, not assumed safe.

  • Timing residual. The 600-second polling cycle leaves a random residual of roughly 150–350bp per trade between the Rysk fill and the Deribit mark (flaw 4 above). Individual trade-level markdowns are therefore not meaningful; only the mean of the distribution survives the noise.
  • Spot-vs-forward bias. The ~400–520bp OTM pricing tilt described above is logged but unfixed. Trade-level numbers on OTM instruments carry this bias.
  • Confidence-bounded repricing not performed. The cluster-robust, CI-bounded repricing of the ~560 existing priced trades, the cheap closure step this pass recommended, was not executed. The edge-size finding rests on a point estimate plus the structural findings, without a computed interval.
  • Maker behavior under size. How the 2 makers would re-quote against a new, persistent counterparty is UNKNOWN. Nothing in this scan measures it.
  • Regime dependence. The observation window is a single period. Whether markdown behavior differs in a volatility shock is UNKNOWN.
  • Anything not listed above and not measured in the findings is UNKNOWN.

None of these unknowns rescues the thesis. The three kill-criteria fired on what was measured, and the first (capturability, for the project as scoped) cannot be fixed by measurement at all.


Reproduction appendix

No privileged data was used; every input is public. One caution up front: the specific numbers in this document cannot be re-derived after the fact, because Deribit mark_iv at past fill timestamps is not retrievable from the public API. A re-run is necessarily a fresh prospective collection, over your own observation window, and will produce its own numbers. What you can reproduce is the procedure and the verdict logic rather than this window’s figures.

  1. Collect fills. Pull live Rysk RFQ fills for BTC and ETH options over a stated observation window: instrument, side, counterparty, cleared price, timestamp. State the rule used to classify the taker side and to count distinct maker counterparties from fills. (Observed in this pass’s window: 293 BTC/ETH trades, all taker-buys, against a concentrated maker set of 2 counterparties.)
  2. Match instruments. For each fill, identify the matching Deribit instrument (same underlying, expiry, strike, type). Record the match rate. (Measured: 96% priceable.)
  3. Reprice independently. For each matched fill, record Deribit mark_iv for the instrument at the nearest timestamp (collected live, at observation time; it cannot be pulled retroactively) and compute an independent Black–Scholes fair value. Validate the pricer first by reproducing Deribit’s own mark (expected agreement: ~1–2%). Do not use Rysk’s own pricing as an input anywhere.
  4. Compute the two-sided markdown distribution. For every fill, favorable or otherwise, compute the per-trade markdown in basis points: independent fair value minus cleared price, normalized by premium. State your denominator convention (cleared premium vs. fair value) explicitly and hold it fixed; absolute bps values depend on that convention and may differ across implementations. Report the full distribution: mean, sign balance, and the correlation between Deribit mark and Rysk cleared. (Measured: corr = 0.979; distribution centered near zero.)
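  5. Subtract the hedge. Pull the live Deribit order book for the matched instruments and measure the half-spread as a fraction of premium. (Measured: ~320bp median, ~760bp mean.) Subtract the median half-spread from the mean markdown. This charges a single crossing; a hedge that is unwound before expiry crosses the spread twice, so a single half-spread is the lenient version of the test.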
  6. Read the result against the pass condition. PASS requires a mean markdown, net of hedge, that is positive, with a cluster-robust confidence interval keyed on (maker, expiry, strike) that excludes zero. Do not use an iid interval; with 2 makers and 4 expiries the effective sample size is far below the trade count. That cluster-robust interval was not computed in this pass; it is the recommended closure step for anyone who needs a fully defensible bound. What this pass has is the point estimate: the honest net-of-hedge statistic is approximately zero to negative, and the verdict is KILL on structure and the point estimate, without a computed interval.
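
Step 6 can be sketched as follows. This is one simple way to build a cluster-aware interval, offered as an assumption rather than as the method this pass used (no such interval was computed here): collapse the net-of-hedge markdowns to one mean per (maker, expiry, strike) cluster and form the interval from the spread of those cluster means. It reuses the 1.64 multiplier quoted above; with only a handful of clusters a Student-t quantile would be wider still, and with unequal cluster sizes the mean of cluster means differs from the pooled mean. All names and toy values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import fmean, stdev
from typing import Hashable, Optional, Sequence, Tuple

Z = 1.64  # same multiplier as the iid interval quoted in this document


def iid_half_width(values: Sequence[float]) -> float:
    """Half-width of the naive iid interval: Z * sd / sqrt(N)."""
    return Z * stdev(values) / math.sqrt(len(values))


def cluster_interval(values: Sequence[float], keys: Sequence[Hashable]) -> Tuple[float, float]:
    """Interval on the mean of cluster means, one cluster per distinct key."""
    groups = defaultdict(list)
    for value, key in zip(values, keys):
        groups[key].append(value)
    means = [fmean(group) for group in groups.values()]
    if len(means) < 2:
        raise ValueError("need at least two clusters")
    centre = fmean(means)
    half = Z * stdev(means) / math.sqrt(len(means))
    return centre - half, centre + half


def passes(net_mean_bps: float, cluster_ci_bps: Optional[Tuple[float, float]]) -> bool:
    """PASS needs a positive net mean and a cluster interval that excludes zero."""
    if cluster_ci_bps is None:
        return False  # fail-closed: no interval computed means no PASS
    low, _high = cluster_ci_bps
    return net_mean_bps > 0 and low > 0


# Toy data: 4 clusters of 50 trades; most of the variation sits between clusters.
offsets = {
    ("maker_a", "expiry_1", 60000): -200.0,
    ("maker_a", "expiry_2", 65000): -100.0,
    ("maker_b", "expiry_1", 60000): 100.0,
    ("maker_b", "expiry_2", 65000): 200.0,
}
toy_values, toy_keys = [], []
for key, offset in offsets.items():
    for j in range(50):
        toy_values.append(offset + 10.0 * (j % 5 - 2))  # small within-cluster wiggle
        toy_keys.append(key)
```

On the toy data the cluster-based half-width is several times the iid one, because 200 trades in 4 clusters carry far less independent information than 200 independent trades. Passing None for the interval returns False by design, which mirrors the fail-closed stance of this report.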

Three cautions for anyone re-running this: (a) if your tooling prices off spot rather than the forward, your OTM markdowns carry a coherent ~400–520bp tilt; correct for it or discard OTM trades. (b) If your polling interval is coarse, individual trade markdowns are dominated by a 150–350bp timing residual, and only the mean is informative. (c) On counts: the 293 trades in step 1 are the live BTC/ETH fills from this pass’s polling window, and steps 1–2 as written yield roughly 281 priceable trades. The ~560 cited in the confidence-interval discussion is the larger population of existing priced trades on the venue that the proposed cluster-robust repricing would cover. The two figures describe different samples, so there is no internal inconsistency.
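
Caution (a) can be checked on your own tooling with a sketch like the one below. It assumes zero interest rates and no discounting, so the only difference between the two prices is the level fed into the formula: spot or forward. The function names and inputs are hypothetical. The sketch makes no claim about the 400–520bp magnitude reported above; the sign and size of the tilt depend on the sign of the basis, the option type, and how far out of the money the strike is.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def undiscounted_value(level: float, strike: float, t_years: float,
                       sigma: float, is_call: bool) -> float:
    """Black-style option value with zero rates, priced off the given level."""
    if t_years <= 0 or sigma <= 0:
        raise ValueError("t_years and sigma must be positive")
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(level / strike) + 0.5 * sigma ** 2 * t_years) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    if is_call:
        return level * norm_cdf(d1) - strike * norm_cdf(d2)
    return strike * norm_cdf(-d2) - level * norm_cdf(-d1)


def spot_pricing_tilt_bps(spot: float, forward: float, strike: float,
                          t_years: float, sigma: float, is_call: bool) -> float:
    """Fair value priced off spot relative to fair value priced off the forward, in bp."""
    from_spot = undiscounted_value(spot, strike, t_years, sigma, is_call)
    from_forward = undiscounted_value(forward, strike, t_years, sigma, is_call)
    return 1e4 * (from_spot - from_forward) / from_forward
```

Feeding your own spot, forward, strike, expiry and volatility into spot_pricing_tilt_bps gives the tilt your markdowns would carry on that instrument. Because the result is expressed relative to premium, a small basis can produce a large tilt on low-premium OTM options.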


About the Capital-Safety Scan

A Capital-Safety Scan is a fixed-scope diligence engagement for allocators and operators considering deploying capital to a specific crypto venue or strategy. Each Scan covers: the custody and withdrawal path; settlement and funding mechanics reconciled against the venue’s documented promises; and 3–5 named kill-criteria, defined in advance and checked adversarially, with reproducible artifacts for every number. (This published engagement was scoped to the three kill-criteria reported above; the custody, withdrawal-path, and settlement-mechanics checks were not performed for this venue and are UNKNOWN.) The deliverable is a written verdict (GO, GO-WITH-CONDITIONS, or KILL) in 5 business days, at a fixed price of $1,500 for one venue. Larger engagements: Diligence Report, $6,000–$9,000, 2 weeks; Standing Retainer, $3,500/month, 3-month minimum, including 2 Scans per month.

A Scan is not an audit and does not replace one. It is a protocol risk review that complements formal code audits by testing the economic and operational promises an audit does not cover. I (Skylar) run every Scan; my background is in telecom inter-carrier billing reconciliation, an industry where the entire job is reconciling what was promised against what was settled, to the last unit.

The Scan is paid to say no when no is the right answer, and this engagement was one of those cases.

Skylar · https://skylarsabo.com

Want this on your venue?

One venue, five business days, a clean GO / GO-WITH-CONDITIONS / KILL with the reproducible work behind it. Fixed price: $1,500.
