IX Dossier

ZRR0 IX — The Pricing / Valuation Data Engine

"A Lighthouse for cars" — unbiased mass-measurement, max-bid, cross-market arbitrage, and a closed feedback loop

Status: Design spec, v1.0, 2026-06-06 (pre-ops)
Owner (to be): the ANALYTICS / DATA hire (this document defines the role's entire mandate)
Cost flags: every dollar figure is tagged [RESEARCHED] (sourced, current-2026), [RESEARCHED-PRIOR] (sourced in a prior brief, re-verify at contract), or [ESTIMATE].

This is the centerpiece of ZRR0 IX. Sourcing, logistics, and licensing are table-stakes that any importer can copy. The durable moat — the thing a gut-number flipper cannot replicate — is a data-driven valuation engine with a feedback loop that prices every buy and sale against the full distribution of real comps, computes a disciplined max-bid, and gets smarter with every car. It is the Lighthouse trading system, re-pointed at cars.


0. The one-paragraph thesis

You already built this once. The Lighthouse measures a market exhaustively, labels outcomes, lets signatures emerge from the data (not from gut hypotheses), gates everything against a hard utility floor, and re-fits as new outcomes land. The car business is the same machine with different inputs. Each car-configuration is a "trade." The auction is the order book. The landed-cost stack is slippage + fees. The cross-market spread (US resale comp minus Japan-landed cost) is the edge. The max-bid is the utility function made executable. The "villain map" — specs that look good but reliably miss the margin floor (R/RA grades, rust-belt titles, value-destroying mods, sub-anniversary build months) — becomes a hard compliance + economics veto. You measure the terrain once and query it infinitely, and every closed unit writes its result back to re-fit the comps, the mod-uplift coefficients, and the variance.


1. Explicit mapping: Lighthouse → ZRR0 IX engine

Lighthouse (trading) | ZRR0 IX engine (cars)
Atlas (per-bar measured terrain) | Every car-config measured across market / region / season / grade / mileage / mod-stack
A Query | One valuation: "what is this car worth, and what is my max bid, right now?"
Chimera / decorrelation | Cross-market niche portfolio: JDM / USDM-export / EURO / Korean are independent supply sleeves; the combination smooths margin
Utility function profit = freq × (WR × R_win − (1−WR)) | Max-bid engine: max_bid = E[net_resale] − landed_overhead − target_margin − risk_buffer; deal_score is the gate
R:R floor (hard) | Landed-margin floor (hard): never bid above target_sell − floor; treat exactly like the 2:1 R:R floor
Frequency floor / ideal cadence | Auction cadence × win-rate; ideal 2–8 turns/month, never let inventory sit (carry cost = drawdown)
Max-consecutive-loss | Cars stuck in inventory / build cost-overruns: the working-capital "drawdown" analogue
Villain map / anti-edge (free veto) | "Deal-killer" flags: R/RA grade, salvage/branded title, value-destroying mod, sub-25-yr build month, swap-voided EPA exemption
Measure-once-query-many | Build the comp store once; cache every pull; re-query the local store, never re-pay the vendor
Atlas update (re-fit on new bars) | Nightly re-fit on every won/lost bid + every realized sale: the closed feedback loop
Regime variable (vol regime) | FX (JPY/USD), the duty regime, and season are measured regime variables, not constants
Out-of-sample validation before deploying capital | Validate max-bid against realized sales before trusting the curve; don't deploy capital on an unvalidated model

The discipline that transfers most directly and protects the most capital: store the whole distribution, never a single number, and never store a volatile input as a constant.


2. The load-bearing number you must NOT hard-code: DUTY

This is the single biggest correction to the whole pricing premise, and it is why "anchor to a data engine, not a gut number" is itself dangerous if the engine stores duty wrong.

Current state, verified June 6 2026:

  • The Supreme Court struck down the IEEPA "reciprocal" tariffs on Feb 20 2026 (Learning Resources, Inc. v. Trump). The administration replaced them on Feb 24 2026 with a Section 122 global surcharge of 10% under the Trade Act of 1974.
  • Section 122 is statutorily capped at 15% and 150 days — the current order sunsets July 24 2026 unless extended by Congress.
  • The Court of International Trade then struck down Section 122 itself (May 2026); that ruling is on appeal, so the 10% is still being collected pending the appeal.
  • For a 25+ year car under HTS 9903.94.04: it is exempt from the 25% Section 232 auto tariff, and Section 232 goods do not stack with Section 122 — BUT classic cars classified under 9903.94.04 are currently being assessed 2.5% base + 10% Section 122 = ~12.5% all-in as of June 2026.

So the duty rate moved 2.5% → ~15% → ~12.5% in roughly ten months, and it has TWO known break-points inside the likely first-run window: the July 24 2026 sunset and the pending appeal ruling. Three of the planning briefs hard-coded 2.5% and one hard-coded 15%; none matches today's ~12.5%, and any hard-coded value will be wrong again after the next break-point.

Hard rules for the engine

  1. duty_rate is a per-shipment, broker-confirmed input — NEVER a stored constant. It lives in a small duty_regime table keyed by (origin_country, entry_date, hts_code), with an as_of date and a source (broker email / CBP ruling). The engine reads the current row at bid time and flags it stale after 14 days.
  2. No bid may be placed on a spread thinner than the duty-volatility band. Treat duty exactly like an FX regime variable. Set duty_band_floor = 15% for Japan-origin worst-case planning (the Section 122 statutory cap), and require the spread to survive that worst case even when today's rate is 12.5%.
  3. Re-verify before every entry, and specifically re-check status the week of July 24 2026 and whenever the CIT appeal rules.

Duty scenario (Japan-origin 25-yr car) | All-in duty | Status
Pre-2025 baseline | ~2.5% | historical
Aug 2025 US-Japan reciprocal | ~15% (incl. base) | superseded
June 2026 (current) | ~12.5% (2.5% base + 10% Sec 122) | [RESEARCHED], verify per shipment
Sec 122 sunset / appeal voids it | back toward ~2.5% | possible after Jul 24 2026
Worst-case planning band | 15% | use this as the bid floor
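
The hard rules above can be sketched in Python as a lookup that never returns a rate without also saying whether it is stale. This is a minimal illustration, not a confirmed design: the `effective_from` column, the function names, and the choice to apply the 15% band to the hammer price in USD are all assumptions; the real dutiable value should come from the broker.

```python
from dataclasses import dataclass
from datetime import date, timedelta

STALE_AFTER = timedelta(days=14)   # stale-flag window from rule 1
DUTY_BAND_FLOOR = 0.15             # worst-case planning band from rule 2


@dataclass(frozen=True)
class DutyRegimeRow:
    origin_country: str
    hts_code: str
    effective_from: date   # assumed column: first entry_date the row covers
    duty_rate: float       # all-in rate as a fraction, e.g. 0.125
    as_of: date            # date the broker / CBP source confirmed the rate
    source: str            # e.g. "broker email", "CBP ruling"


def current_duty(rows, origin_country, hts_code, entry_date, today):
    """Return (row, is_stale) for the newest row covering entry_date.

    Returns (None, True) when nothing matches, so a missing rate is
    treated the same way as a stale one.
    """
    matches = [
        r for r in rows
        if r.origin_country == origin_country
        and r.hts_code == hts_code
        and r.effective_from <= entry_date
    ]
    if not matches:
        return None, True
    row = max(matches, key=lambda r: r.effective_from)
    return row, (today - row.as_of) > STALE_AFTER


def spread_survives_worst_case(us_resale, jp_hammer_usd, other_landed_costs,
                               target_margin):
    """True if the margin still clears target with duty charged at the 15% band.

    Assumption: duty is applied to the hammer price in USD.
    """
    worst_case_duty = DUTY_BAND_FLOOR * jp_hammer_usd
    net = us_resale - jp_hammer_usd - other_landed_costs - worst_case_duty
    return net >= target_margin
```

A bid path would call `current_duty` first and refuse to proceed on `is_stale`, then run `spread_survives_worst_case` regardless of how low today's rate is.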

Sources: Skadden — CIT strikes down Section 122; Nat'l Law Review — SCOTUS invalidates IEEPA, Section 122 applies; Peacock — Section 122 guide 2026; WCShipping — duties on classic car imports today; CBP Section 232 auto FAQs.


3. Data sources — acquisition + legality

The engine has five data layers mirroring the Atlas. Two principles govern all of them: prefer official/licensed APIs over scraping for anything load-bearing (so one ToS change can't kill the engine), and cache every pull (per-call fees compound fast, so measure-once-query-many literally saves money here).

3.1 US wholesale "truth" layer — license-gated

Source | What it gives | Access | Cost | Notes
Manheim MMR / MUVVI | Wholesale auction truth (10M+ txns/yr) + macro index | API via Cox (manheim.data@coxautoinc.com / developer.manheim.com); basic VIN-scan MMR complimentary with a Manheim dealer account | Not public; quote required [RESEARCHED-PRIOR] | Requires the WA dealer license. This is the most important buy-side number for US-side cars.
Black Book | Wholesale/auction-weighted book, best buy-side | Dealer subscription + API | Quote required [RESEARCHED-PRIOR] | Buy-side ceiling cross-check
J.D. Power (NADA) | Retail/trade/loan book (bank-accepted) | API | Quote | Retail ceiling + financing reference

Sequencing dependency (critical): MMR and Black Book are gated behind a dealer license. So the WA dealer license is a prerequisite for the engine's most important layer, not just for auction access. Until it clears, the engine runs on the listings/enthusiast/JP layers below — which is enough to start, but the wholesale truth layer fills in only after licensing.

3.2 US listings + sold comps — the workable backbone

Source | What it gives | Access | Cost | Legality
Marketcheck | Every US/Canada dealer active + sold listing | REST API | Tiered plans + per-call data fees (inventory search ~$0.002, auction ~$0.008, MarketCheck-Price ~$0.07–0.13/call); APIs/datasets advertised "from $8"; bulk = Enterprise [RESEARCHED-PRIOR — re-verify tier $ at marketcheck.com/apis/pricing] | Official API, clean
eBay Motors | Official sold comps (Marketplace-Insights / Browse) | Official API, free tier | $0 tier | Clean
classic.com | Aggregates sold from BaT/Cars&Bids/PCarmarket/Hagerty/Collecting Cars; CMB benchmark per make/model/generation | No public API → partner inquiry or careful public-data scrape | Scrape cost only | Check ToS; treat as supplemental
Hagerty Valuation Tool | 40,000 collector-car values + condition tiers | Web tool; data-license inquiry | Quote | Best enthusiast condition-curve reference
BaT / Cars & Bids / PCarmarket | Enthusiast + modded sold truth (prices ZRR0/HER0/MUTT builds) | No official API; Apify actors exist (~$0.13–0.20/CU + ~$8/GB residential proxy) | Apify ~$49–200/mo [RESEARCHED-PRIOR] | Public sold results only, throttle, robots.txt, supplemental
Copart / IAAI | Salvage floor (HER0/MUTT donor pipeline) | Unified API / Apify (168 fields/lot, ~3-yr sold history) | Per the scraper/API | Per-state access rules vary

3.3 Japan source-side — the JDM "MMR"

Source | What it gives | Access | Cost | Notes
USS / TAA auction result history | The source-side wholesale truth | No open API; access via your auction agent's exporter login or a reseller | via agent | Foreigners can't log in directly; the agent is the data conduit
auctiondatasearch.jp (ADS) | Crawls major JP auctions, ~3-month sold stats, EN-translated | Membership tiers (View / Advance / Premium) | Premium / Advance each require ≥¥100,000 deposit; ¥5,000/mo usage fee if inactive [RESEARCHED] | Deposit is the bidding float, not a pure data fee
auctionsheetjp.com | ~140 auctions, sold stats, Goo-net one-price, FOB calc, API included | API | from ~$7+ [RESEARCHED-PRIOR] | Cheapest structured JP sold-data + grades
JapanStat / J-PN | Auction history + analytics; J-PN syncs every ~60s via API | API / membership | Quote | Good for cadence/volume signals
Goo-net / Yahoo Auctions JP (Yafuoku) | Retail asking + private sold | Apify actors for Yafuoku closed/sold | Apify | Supplemental retail-ask layer

The auction sheet is a PRIMARY feature, not a footnote. Overall grade (S/6/5/4.5/4/3.5/3/R/RA) + exterior letter (A–E) + interior letter (A–E) + the damage map (A1–A3 scratch sizing, etc.) + mileage + options + chassis code. R and RA = repaired/accident = the villain-map hard reject for resale stock. Parse and store every field.
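
One way to "parse and store every field" is a small validated record type. The sketch below is an assumption about shape, not the agent's actual feed format: the input keys, the `AuctionSheet` and `parse_sheet` names, and the decision to store damage codes verbatim are all hypothetical.

```python
from dataclasses import dataclass

OVERALL_GRADES = ("S", "6", "5", "4.5", "4", "3.5", "3", "R", "RA")
LETTER_GRADES = ("A", "B", "C", "D", "E")


@dataclass(frozen=True)
class AuctionSheet:
    chassis_code: str
    overall_grade: str
    ext_grade: str
    int_grade: str
    mileage_km: int
    damage_map: tuple = ()   # damage codes stored verbatim, e.g. ("A1", "A2")
    options: tuple = ()

    @property
    def r_ra_flag(self) -> bool:
        """True for repaired/accident grades, the villain-map hard reject."""
        return self.overall_grade in ("R", "RA")


def _letter(value, field):
    """Normalise an A-E letter grade or raise on anything unexpected."""
    letter = str(value).strip().upper()
    if letter not in LETTER_GRADES:
        raise ValueError(f"unknown {field}: {value!r}")
    return letter


def parse_sheet(raw: dict) -> AuctionSheet:
    """Validate a raw sheet dict (hypothetical keys) into an AuctionSheet."""
    overall = str(raw["overall_grade"]).strip().upper()
    if overall not in OVERALL_GRADES:
        raise ValueError(f"unknown overall_grade: {raw['overall_grade']!r}")
    mileage = int(str(raw["mileage_km"]).replace(",", "").strip())
    return AuctionSheet(
        chassis_code=str(raw["chassis_code"]).strip().upper(),
        overall_grade=overall,
        ext_grade=_letter(raw["ext_grade"], "ext_grade"),
        int_grade=_letter(raw["int_grade"], "int_grade"),
        mileage_km=mileage,
        damage_map=tuple(raw.get("damage_map", ())),
        options=tuple(raw.get("options", ())),
    )
```

Rejecting unknown grades at parse time keeps a mistranslated or mistyped sheet from silently passing the R/RA veto.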

3.4 Identity + history (join keys)

Source | What | Cost
NHTSA vPIC | FREE VIN decode, 100+ fields; the base join key | $0 [RESEARCHED]
DataOne | Exact trim + installed options (premium) | Quote
VinAudit | Mid-price specs + value + history | Mid
Carfax/AutoCheck (US) | Title/accident history | Per-report; for JDM use the auction sheet + export certificate instead

3.5 Mod-value signals (prices the builds)

No clean API exists for "what did this specific mod stack add to resale." This is built by tagging modded sold cars (BaT/Cars&Bids modded results, forum build threads, YouTube build logs) with a structured mod stack and attributing the resale delta. This NLP/tagging work is the analytics hire's standing job, starting from hand-coded priors and converging to a regression as your own sold-build data accumulates.

3.6 Legality posture (one rule)

Build the core on licensed/official APIs (eBay, Marketcheck, MMR, Black Book, auctionsheetjp, vPIC). Treat scraped enthusiast data (BaT/Yahoo-JP/classic.com) as supplemental — throttle, respect robots.txt, collect only publicly-visible sold results — so a single takedown or ToS change degrades but never kills the engine.


4. The measurement schema (one normalized row per car-config)

Store the whole distribution of comps per config, never a single book number. "Value" is a query against the distribution.

4.1 Identity (join key block)

vin · chassis_code (e.g. BNR34, FD3S, JZX100) · year · make · model · trim · engine · trans · drivetrain · build_month (load-bearing for the 25-yr rule — see §8) · market ∈ {JDM, USDM, EURO, KR}

4.2 Condition

  • JDM: overall_grade · ext_grade (A–E) · int_grade (A–E) · damage_map (structured) · mileage_km · auction_house · r_ra_flag (bool, villain-map)
  • USDM/other: title_status (clean/branded/salvage) · mileage_mi · condition_tier (Hagerty-style) · rust_flag (rust-belt villain-map)

4.3 Mods (prices the builds)

mod_stack = list of {category, part, brand, est_cost, labor_hrs, reversible(bool)} · mod_stack_hash (so identical builds collapse to one config) · build_line ∈ {ZRR0, HER0, MUTT, IX-stock}
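
A possible way to compute `mod_stack_hash` so that identical builds collapse to one config is to hash a canonical, order-independent form of the stack. Which fields count as "identity" is an assumption here: this sketch uses category, part, and brand, and deliberately leaves out est_cost and labor_hrs because those vary between otherwise identical builds.

```python
import hashlib
import json


def mod_stack_hash(mod_stack):
    """Return a short stable hash for a list of mod dicts.

    Assumption: two builds are "identical" when they share the same set of
    (category, part, brand) entries, ignoring order, case, and cost fields.
    """
    canonical = sorted(
        (
            m["category"].strip().lower(),
            m["part"].strip().lower(),
            m["brand"].strip().lower(),
        )
        for m in mod_stack
    )
    payload = json.dumps(canonical, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

Sorting before hashing means the order in which mods were entered does not create spurious new configs.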

4.4 Context (regime variables)

region · season_month · channel (auction / dealer / private / salvage) · fx_jpy_usd (at bid + at sale) · duty_regime_id (FK to §2 table) · listed_date

4.5 Outcome labels (fill over time — the feedback)

our_hammer · landed_cost_total · list_price · sold_price · sold_date · days_to_sell · realized_margin · realized_margin_pct · predicted_max_bid · predicted_vs_actual · bid_won(bool) · lost_bid_hammer (calibrates whether max-bid is systematically too low)

Comp aggregates stored per (chassis_code, grade_band, mileage_band, market): n_comps, median_sold, iqr, p10, p90, last_90d_count, slope_90d (appreciating vs depreciating).


5. The models

5.1 Depreciation / appreciation curve (per config)

Fit value-vs-age per config from the comp store. Classic appreciators invert the curve — detect via the slope_90d / classic.com CMB benchmark; a positive slope flips the model from "discount for age" to "premium for scarcity." Output: an expected resale anchor + a confidence interval that widens with low n_comps.

5.2 Mod-value uplift attribution (prices ZRR0/HER0/MUTT)

sold_built = base_comp + Σ uplift(mod_category) − haircut(over-modded / taste-specific) + f(grade, mileage).

  • Phase 0–1: hand-coded priors per mod category (e.g. tasteful coilover/wheel/widebody add; engine swap on the right chassis adds; over-cammed taste-specific subtracts).
  • Phase 2+: replace priors with a regression on sold_price ~ base_comp + mod_category_dummies + mileage + grade once you have tens of comparable modded sales per category.
  • Honest caveat: uplift is data-starved at launch (few of your sales yet). Run on priors + public modded comps, and treat early build max-bids with a wider risk buffer. The MUTT/HER0 margin lives or dies on accurate repair/build-cost data — this is the max-consecutive-loss analogue, so the villain-map veto on bad bases is non-negotiable.
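
For Phase 0–1, the uplift formula can be evaluated directly from a priors table. The sketch below only illustrates the arithmetic; the function name and the idea of refusing unknown categories are assumptions, and any dollar values passed in are placeholders to be hand-coded by the analytics hire, not measured coefficients.

```python
def predict_sold_built(base_comp, mod_categories, priors,
                       haircut=0.0, grade_mileage_adj=0.0):
    """sold_built = base_comp + sum(uplift) - haircut + f(grade, mileage).

    priors maps mod_category -> hand-coded uplift in USD (may be negative).
    Unknown categories raise instead of silently counting as zero uplift.
    """
    unknown = [c for c in mod_categories if c not in priors]
    if unknown:
        raise KeyError(f"no prior for mod categories: {unknown}")
    uplift = sum(priors[c] for c in mod_categories)
    return base_comp + uplift - haircut + grade_mileage_adj
```

Raising on an unpriced category forces a human decision at launch, when the priors table is still incomplete.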

5.3 Cross-market arbitrage spread (the edge)

arb_spread = US_resale_comp − (JP_auction_comp + landed_cost), ranked by arb_spread × turn_velocity (a thin spread that sells in 2 weeks beats a fat spread that sits 6 months — frequency × edge, straight from the utility function). Inbound JDM rides the 25-yr unlock; outbound USDM muscle sells dear into Japan (0% JP duty) / Australia. Compute the spread NET of the current duty regime (§2), freight, insurance, broker — never gross.
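
The ranking rule can be sketched as follows. The text does not define turn_velocity, so this example assumes turns per month, computed as 30 divided by expected days to sell; the candidate dict keys are also hypothetical.

```python
def arb_spread(us_resale_comp, jp_auction_comp, landed_cost):
    """Net spread: US resale comp minus Japan hammer plus full landed cost."""
    return us_resale_comp - (jp_auction_comp + landed_cost)


def rank_by_edge(candidates):
    """Return candidate names ordered by arb_spread x turn_velocity, best first.

    Each candidate is a dict with: name, us_resale_comp, jp_auction_comp,
    landed_cost, expected_days_to_sell.
    Assumption: turn_velocity = 30 / expected_days_to_sell (turns per month).
    """
    scored = []
    for c in candidates:
        spread = arb_spread(c["us_resale_comp"], c["jp_auction_comp"],
                            c["landed_cost"])
        turn_velocity = 30.0 / c["expected_days_to_sell"]
        scored.append((spread * turn_velocity, c["name"]))
    return [name for _, name in sorted(scored, reverse=True)]
```

Under this definition a 3,000 spread that turns in 14 days outranks a 9,000 spread that sits for 180 days, which matches the frequency × edge logic in the text.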

5.4 The max-bid + deal-score engine (the utility function)

landed_overhead = jp_inland + agent_fee + ocean_freight + marine_insurance
                + duty(current_regime)            # §2 — volatile, per-shipment
                + mpf + hmf + customs_broker
                + us_port_handling + trucking
                + recon_reserve                   # do NOT omit — see risks
                + (build_cost if build_line != IX-stock)

E_net_resale   = median_sold(config, last_90d, grade_band, mileage_band)
                 adjusted for mod uplift (§5.2) and season

risk_buffer    = base_buffer
                + comp_variance_term(iqr)         # wider distribution → wider buffer
                + thin_comp_penalty(n_comps<10)   # rare chassis → overbid risk
                + grade_uncertainty(R/RA, unverified sheet)
                + fx_volatility_term(jpy_usd)
                + season_term

max_bid        = E_net_resale − landed_overhead − target_margin − risk_buffer
deal_score     = (E_net_resale − total_cost_at_actual_bid) / E_net_resale

RULE: bid only if deal_score >= floor (start ~0.18–0.22, the landed-margin floor)
RULE: never bid above max_bid
RULE: reject if any villain-map flag is set (§4.2 / §8)

This is the direct analog of profit = frequency × (WR × R_win − (1−WR)) with a hard floor. target_margin is the R:R floor; risk_buffer is what stops you overbidding a thin-comp rarity into a loss.
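
The formulas and rules above translate almost line for line into Python. Two things in this sketch are assumptions rather than part of the spec: total_cost_at_actual_bid is taken to mean actual_bid + landed_overhead, and duty is passed in as a dollar amount already computed from the current regime (in practice it scales with the bid, so it would need recomputing at the final bid).

```python
REQUIRED_COST_LINES = (
    "jp_inland", "agent_fee", "ocean_freight", "marine_insurance",
    "duty", "mpf", "hmf", "customs_broker",
    "us_port_handling", "trucking", "recon_reserve",
)


def landed_overhead(costs, build_line="IX-stock"):
    """Sum the landed stack; refuses to run if any line is missing."""
    missing = [k for k in REQUIRED_COST_LINES if k not in costs]
    if missing:
        raise ValueError(f"missing cost lines: {missing}")
    total = sum(costs[k] for k in REQUIRED_COST_LINES)
    if build_line != "IX-stock":
        total += costs["build_cost"]   # builds must also carry build_cost
    return total


def max_bid(e_net_resale, overhead, target_margin, risk_buffer):
    return e_net_resale - overhead - target_margin - risk_buffer


def deal_score(e_net_resale, actual_bid, overhead):
    # Assumption: total cost at the actual bid = bid + landed overhead.
    total_cost = actual_bid + overhead
    return (e_net_resale - total_cost) / e_net_resale


def should_bid(e_net_resale, actual_bid, overhead, target_margin,
               risk_buffer, villain_flags=(), floor=0.20):
    """Apply the three RULE lines: veto first, then ceiling, then floor."""
    if villain_flags:
        return False
    if actual_bid > max_bid(e_net_resale, overhead, target_margin, risk_buffer):
        return False
    return deal_score(e_net_resale, actual_bid, overhead) >= floor
```

Making recon_reserve a required key means the landed stack cannot be computed at all if someone forgets it, which is stricter than a comment.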

5.5 Villain map (free veto, compliance + economics firewall)

Hard rejects, evaluated before any positive scoring (a free veto is as valuable as a positive signal):

  • Sub-anniversary build month (sub-25-yr by exact build plate — seizure risk, §8)
  • R / RA auction grade for resale stock (repaired/accident)
  • Salvage / branded / rust-belt title for resale stock
  • EPA-exemption-voiding engine swap on a sub-25-yr import (§8)
  • Spread thinner than the duty-volatility band (§2)
  • Unverified auction sheet (no per-lot result feedback from the agent)
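
The veto list above can be encoded as a single function that runs before any scoring and returns the reasons for rejection. The dict keys, flag names, and the `resale_stock` switch are hypothetical; the rules themselves are the six bullets above.

```python
def villain_flags(cfg: dict, duty_band: float = 0.15) -> list[str]:
    """Return hard-reject reasons for one candidate; empty list means no veto.

    cfg keys (all assumed names): under_25_at_entry, overall_grade,
    title_status, rust_flag, non_original_engine, spread_pct,
    sheet_verified, resale_stock.
    """
    flags = []
    resale = cfg.get("resale_stock", True)   # donor cars skip resale-only rules

    if cfg.get("under_25_at_entry", False):
        flags.append("sub_25yr_build_month")
        if cfg.get("non_original_engine", False):
            flags.append("epa_voiding_swap")
    if resale and cfg.get("overall_grade") in ("R", "RA"):
        flags.append("r_ra_grade")
    if resale and cfg.get("title_status", "clean") != "clean":
        flags.append("branded_or_salvage_title")
    if resale and cfg.get("rust_flag", False):
        flags.append("rust_belt")
    if cfg.get("spread_pct", 0.0) < duty_band:
        flags.append("spread_below_duty_band")
    if not cfg.get("sheet_verified", False):
        flags.append("unverified_auction_sheet")
    return flags
```

Defaults are chosen so that missing information fails closed: an absent spread or an unverified sheet produces a flag rather than a pass.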

6. The feedback loop (the Atlas update step)

Every won bid, every lost bid, and every sale writes back:

  1. Won + sold → predicted_vs_actual, days_to_sell, and realized_margin update the comp distribution and the depreciation/uplift coefficients.
  2. Lost bids (lost_bid_hammer) → calibrate whether max_bid is systematically too low (you keep losing winnable cars) or correctly disciplined (you only lose overpriced ones). This is the buy-side equivalent of measuring fill rate.
  3. Nightly job re-fits curves, uplift coefficients, and risk_buffer variance terms; flags configs whose realized margin is drifting below floor (regime cooldown detection).
  4. Out-of-sample discipline (the trading lesson): a model fit during a hot JDM cycle will overvalue in a cooldown. Validate max-bid against realized sales before trusting it; never deploy capital on an unvalidated curve. Hold out recent sales as a check set the same way you OOS-validate a trading book.
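
Two of the checks in this loop reduce to simple statistics. The sketch below is one possible reading, with hypothetical function names: lost-bid calibration is measured as the median fractional gap between the winning hammer and our max bid, and the out-of-sample check is a mean absolute percentage error on the most recent held-out sales.

```python
import statistics


def lost_bid_overshoot(lost_bids):
    """Median of (lost_bid_hammer - predicted_max_bid) / predicted_max_bid.

    lost_bids: list of (predicted_max_bid, lost_bid_hammer) pairs.
    A small median gap across many losses hints that max_bid may be
    systematically low; a large gap suggests the losses were overpriced cars.
    """
    gaps = [(hammer - bid) / bid for bid, hammer in lost_bids]
    return statistics.median(gaps)


def holdout_mape(sales, holdout_n):
    """Mean absolute percentage error on the most recent holdout_n sales.

    sales: list of (sold_date, predicted_price, sold_price); sold_date may be
    a date or an ISO string, anything that sorts chronologically.
    """
    recent = sorted(sales, key=lambda s: s[0])[-holdout_n:]
    errors = [abs(pred - sold) / sold for _, pred, sold in recent]
    return statistics.mean(errors)
```

What counts as "small" or "acceptable" for either number is a judgment call to be set once real bids accumulate; nothing in the source fixes a threshold.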

7. Phased build path (spreadsheet → scripts → DB/dashboard)

Phase | When | Stack | What it does | Cost/mo
Phase 0 | First JDM run (pre-license) | Google Sheet: one row/candidate, manual comps (classic.com free + Marketcheck free tier + auctionsheetjp), hand-coded max-bid + deal-score formula | Validate the schema + the max-bid/deal-score formula on real bids before writing any code | ~$0–300 [ESTIMATE]
Phase 1 | After schema proven | Python + DuckDB (or Postgres); vPIC (free) + Marketcheck Basic + auctionsheetjp API + eBay sold API; Apify scheduled scrapers (BaT/Yahoo-JP). Script comp-query, max-bid, arb-spread, the duty_regime table | First automated max-bids; cache everything | ~$300–800 [ESTIMATE/RESEARCHED]
Phase 2 | After first profitable cycles | Postgres + dbt models + Metabase/Streamlit dashboard: deal-score leaderboard, arb heatmap (JP→US by chassis), per-car max-bid calculator. Add Black Book → MMR once licensed. | Same DB back-ends the SEO/data web property | ~$1k–2.5k + license costs [ESTIMATE]

Golden rule across all phases: cache aggressively. Persist every comp pull; re-query the local store, not the vendor. Measure-once-query-many saves real money when Marketcheck/MMR/Black Book charge per call.
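
A minimal cache layer for this rule might look like the sketch below, using SQLite from the standard library. Everything here is illustrative: the table layout, the function names, and the `fetch` callback (which stands in for whatever vendor client is eventually used) are assumptions, and there is no expiry logic.

```python
import hashlib
import json
import sqlite3
import time


def open_cache(path=":memory:"):
    """Open (or create) the local pull store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pulls ("
        "key TEXT PRIMARY KEY, source TEXT, params TEXT, "
        "payload TEXT, fetched_at REAL)"
    )
    return conn


def cached_pull(conn, source, params, fetch):
    """Return the stored payload for (source, params), calling fetch once.

    fetch is a callable taking params and returning JSON-serialisable data;
    it is only invoked when the local store has no matching row.
    """
    params_json = json.dumps(params, sort_keys=True)
    key = hashlib.sha256(f"{source}|{params_json}".encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT payload FROM pulls WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    payload = fetch(params)   # the only line that can cost money
    conn.execute(
        "INSERT INTO pulls VALUES (?, ?, ?, ?, ?)",
        (key, source, params_json, json.dumps(payload), time.time()),
    )
    conn.commit()
    return payload
```

Storing `fetched_at` alongside each payload leaves room to add a freshness policy later without re-pulling what is already on disk.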


8. Compliance gates the engine must enforce as hard vetoes

These are economics-killing or car-seizing, so they belong inside the engine, not in a human's head:

  1. 25-year rule is per-build-MONTH, not model-year. A Feb-2001 R34 imported in Jan-2026 is an illegal entry → CBP detention/seizure/forfeiture, not a fixable duty bill. Gate on the build-plate month; never let the exporter self-clear. (The R34/Skyline-heavy basket is the highest-exposure car for this.) Source: NHTSA importation FAQs; projectjdm 2026 guide.
  2. Engine-swap / EPA voiding. A non-original engine voids the 21-yr EPA emissions exemption (Form 3520-1, Code E) unless the replacement is EPA-certified equal-or-newer — which a JDM/performance engine almost never is. Hard rule: emissions-relevant swaps happen ONLY on cars already 25+ and already on US soil. Japan-side builds are limited to non-emissions cosmetic/chassis/suspension work. The engine must flag any imported config whose mod stack includes a non-original drivetrain on a sub-25 entry. Source: EPA importing vehicles forms.
  3. Duty regime check (§2) — stale-flag after 14 days, broker-confirm per shipment.
  4. USDM-export spine: AES/EEI filing + title-to-carrier ≥72h pre-departure (19 CFR 192.2); penalties to $10,000/violation (willful: criminal). The engine should not green-light an outbound USDM unit until the AES/EEI workflow is confirmed complete. Source: CBP exporting used vehicles FAQs.
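
The build-month gate in item 1 is simple month arithmetic. The sketch below assumes only the build month and year are known from the plate, and adds a conservative mode (an assumption of this example, not a stated rule) that waits until the anniversary month has fully passed because the exact build day is unknown.

```python
from datetime import date


def is_25_years_old(build_year: int, build_month: int, entry_date: date,
                    conservative: bool = True) -> bool:
    """Gate on build-plate month, not model year.

    Age is counted in whole calendar months between the build month and the
    entry month. 25 years = 300 months.
    conservative=True requires the anniversary month to be over (age > 300),
    since the build day within the month is unknown.
    """
    build_idx = build_year * 12 + build_month
    entry_idx = entry_date.year * 12 + entry_date.month
    age_months = entry_idx - build_idx
    return age_months > 300 if conservative else age_months >= 300
```

For the example in item 1, a Feb-2001 build entering in Jan-2026 is 299 months old and fails in either mode.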

9. What the ANALYTICS / DATA hire owns

This is a real data-engineering + light-ML role (Python / SQL / dbt / API integration / data-licensing compliance), not a spreadsheet jockey — and it is the operator's stated highest-leverage hire because it owns the moat.

Mandate:

  1. Ingestion + cleaning — wire vPIC, Marketcheck, eBay, auctionsheetjp/ADS, Copart/IAAI, and (post-license) MMR/Black Book; build the cache layer.
  2. The schema + comp store (§4) — normalize one row per config, store full distributions.
  3. The models (§5) — depreciation/appreciation, mod-uplift, arb-spread, max-bid + deal-score.
  4. The feedback re-fit (§6) — the nightly job; lost-bid calibration; OOS validation.
  5. The dashboard (§7 Phase 2) — leaderboard, arb heatmap, max-bid calculator.
  6. Data-licensing compliance — keep the core on official APIs; police scraping ToS.
  7. The villain-map firewall (§5.5 / §8) — encode every hard veto.

Sequencing fix (from the stress-tests): engage this person as a contractor in PRE-OPS to stand up engine v0 (Phase 0/1) BEFORE run 1, so car #1's max-bid is data-set, not the gut number the operator explicitly wants to eliminate. Productize and convert to full-time only after the feedback data accumulates.


10. Open questions (resolve at contract / first run)

  • Manheim MMR + Black Book API pricing for a single-location wholesale dealer is not public — quote directly (manheim.data@coxautoinc.com; Black Book sales) once licensed.
  • Does your chosen JDM agent give raw USS/TAA sold-history export, or only a curated bidding view? Determines whether you also need ADS/JapanStat for the Atlas. Resolve when selecting the agent.
  • classic.com data-license vs scrape-only — worth a direct inquiry; it's the single best enthusiast-comp aggregator.
  • Marketcheck current tier $ — re-verify at marketcheck.com/apis/pricing (prior brief cited $299/$749; search now shows "from $8" + per-call fees — confirm at contract).
  • Mod-uplift convergence — how many comparable modded sales before the regression beats priors (likely tens per category).
  • Korean / EURO sold-data sources are thinner than JP/US and need their own source survey at those phases.

11. Risks (honest)

Risk | Mitigation
Duty stored as a constant → systematic mispricing (the worst failure mode) | Per-shipment broker confirm; 15% worst-case bid floor; stale-flag at 14 days (§2)
License dependency: MMR/Black Book + auctions blocked until WA license clears | Start on listings/JP/enthusiast layers; sequence the license early
Thin-comp false precision on rare chassis (R34, niche EURO) | risk_buffer widens with low n_comps + high IQR; never overbid a rarity
Per-call fees / proxy fees compound (Marketcheck price APIs $0.07–0.13, residential proxies ~$8/GB) | Cache everything; re-query local store
JP grade/odometer fraud | Weight verified sheets + arrival QA; flag unverified; R/RA hard reject
Regime overfit (hot-cycle model overvalues in cooldown) | OOS-validate against realized sales before trusting the curve
Scraping ToS/takedown | Core on official APIs; scraped data is supplemental, throttled, public-only
FX swings erase a thin spread (3–5 business-day JP payment window) | FX is a measured regime variable + a risk_buffer term; pre-position JPY with the agent
Recon cost omitted from landed stack | recon_reserve is a mandatory line in landed_overhead (§5.4)

Bottom line

The "Lighthouse for cars" is buildable almost 1:1 and is the operator's durable edge over gut-number flippers. Build the schema and the max-bid formula in a spreadsheet on run 1, prove them on real bids, then graduate to Python+DuckDB and a Postgres+dashboard warehouse the analytics hire owns. The two disciplines that protect the most capital, both straight from the trading book: store the whole distribution (never a single number), and never store a volatile input — above all duty — as a constant.

All figures flagged [RESEARCHED] / [RESEARCHED-PRIOR] / [ESTIMATE]. Duty regime verified June 6 2026 and MUST be re-confirmed per shipment.