How this works

The honest answer to 'is any of this actually doing computer vision?'

What's real in this app

Every row, every count, every confidence, every chart on every page derives from one of the following sources. The detection store is in your browser's IndexedDB, populated from a build-time inference pass over the bundled photo set.

  • Every asset record originates from running YOLOv9-c on a real asset photo (pumps, valves, pipes, tanks, panels, HVAC).
  • Every anomaly row combines a real detection bbox with a real heuristic pass: thermal pseudo-imaging (luminance + percentile hotspot) or variance / edge analysis.
  • ΔT estimates are computed from actual luminance differences between hotspot and surrounding regions of the photo — not random. The mapping is documented in `scripts/detect/lib/thermal.ts`.
  • Asset health scores degrade based on detected anomaly density.
  • Work orders auto-generate from high-severity anomalies (confidence > 0.72 or severity High/Critical).
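The work-order rule in the last bullet can be read as a simple predicate. The following is a minimal sketch, assuming an `Anomaly` shape with `confidence` and `severity` fields and a four-level severity scale; those names and the "Low"/"Medium" levels are illustrative assumptions, and the "or" in the rule is taken literally.

```typescript
// Hypothetical anomaly shape; field names and the Low/Medium levels are assumptions.
type Severity = "Low" | "Medium" | "High" | "Critical";

interface Anomaly {
  id: string;
  confidence: number; // 0..1
  severity: Severity;
}

// Threshold taken from the text: confidence > 0.72.
const WORK_ORDER_CONFIDENCE = 0.72;

// True when an anomaly should auto-generate a work order:
// confidence strictly above the threshold OR severity High/Critical.
function needsWorkOrder(a: Anomaly): boolean {
  const severe = a.severity === "High" || a.severity === "Critical";
  return a.confidence > WORK_ORDER_CONFIDENCE || severe;
}

const anomalies: Anomaly[] = [
  { id: "a1", confidence: 0.9, severity: "Low" },      // passes on confidence
  { id: "a2", confidence: 0.4, severity: "Critical" }, // passes on severity
  { id: "a3", confidence: 0.72, severity: "Medium" },  // fails: 0.72 is not > 0.72
];

const workOrderIds = anomalies.filter(needsWorkOrder).map((a) => a.id);
console.log(workOrderIds);
```

Note that the comparison is strict, so an anomaly sitting exactly at 0.72 does not trigger a work order unless its severity does.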

Models and algorithms in use

| Name | Source | License | Purpose |
| --- | --- | --- | --- |
| YOLOv9-c (COCO 80 classes) | Xenova/yolov9-c_all on Hugging Face | GPL-3.0 (yolov9 base); verify before commercial deployment. OK for internal demo use. | Person, vehicle, animal, container detection. Used by all 4 apps. |
| Thermal pseudo-imaging | Algorithm (not a model). Implemented in `scripts/detect/lib/thermal.ts`. | Apache 2.0 (project) | Real luminance-based hotspot detection on RGB photos. Generates ΔT estimates for AssetIQ. |
| Variance / Edge anomaly | Algorithm. Implemented in `scripts/detect/lib/thermal.ts` (`analyzeVariance`). | Apache 2.0 (project) | Visual anomaly heuristic for AssetIQ corrosion / vibration and StockFlight damage detection. |
| Tesseract.js v5 | https://github.com/naptha/tesseract.js | Apache 2.0 | OCR on pallet label crops in StockFlight. |
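To make the "luminance + percentile hotspot" idea concrete, here is a rough sketch of how such a heuristic could work. It is not the code in `scripts/detect/lib/thermal.ts`, whose actual mapping may differ. The Rec. 601 luma weights, the 95th-percentile cutoff, and the function names are all assumptions, and the result is left in luminance levels rather than converted to a ΔT value.

```typescript
// Sketch only: NOT the implementation in scripts/detect/lib/thermal.ts.
// Input is RGBA bytes, laid out like canvas ImageData.data.

// Rec. 601 luma for one pixel (assumed weighting).
function luma(r: number, g: number, b: number): number {
  return 0.299 * r + 0.587 * g + 0.114 * b;
}

interface HotspotResult {
  threshold: number;    // luminance at the chosen percentile
  hotspotMean: number;  // mean luminance of pixels at or above the threshold
  surroundMean: number; // mean luminance of the remaining pixels
  delta: number;        // hotspotMean - surroundMean, in luminance levels
}

// percentile is an integer 0..100; 95 is an assumed default.
function hotspotDelta(rgba: Uint8ClampedArray, percentile = 95): HotspotResult {
  const n = rgba.length / 4;
  const lum = new Float64Array(n);
  for (let i = 0; i < n; i++) {
    lum[i] = luma(rgba[4 * i], rgba[4 * i + 1], rgba[4 * i + 2]);
  }
  // Typed-array sort is numeric by default.
  const sorted = Float64Array.from(lum).sort();
  const idx = Math.min(n - 1, Math.floor((percentile * n) / 100));
  const threshold = sorted[idx];

  let hotSum = 0, hotCount = 0, restSum = 0, restCount = 0;
  for (const v of lum) {
    if (v >= threshold) { hotSum += v; hotCount++; }
    else { restSum += v; restCount++; }
  }
  const hotspotMean = hotCount > 0 ? hotSum / hotCount : 0;
  // A perfectly uniform image has no surround; report zero delta in that case.
  const surroundMean = restCount > 0 ? restSum / restCount : hotspotMean;
  return { threshold, hotspotMean, surroundMean, delta: hotspotMean - surroundMean };
}

// Synthetic 100-pixel image: 95 mid-gray pixels and 5 bright ones.
const px = new Uint8ClampedArray(100 * 4);
for (let i = 0; i < 100; i++) {
  const v = i < 95 ? 50 : 250;
  px.set([v, v, v, 255], 4 * i);
}
const hotspot = hotspotDelta(px);
console.log(hotspot.delta.toFixed(1)); // roughly 200 luminance levels for this input
```

A real pass would run this inside each detection bbox rather than over the whole photo, and would apply the documented luminance-to-ΔT mapping afterwards.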

What's mocked (and why)

These pieces don't have a real-CV equivalent in this demo because they require hardware we don't have, or they're configuration data, or they're synthetic for privacy. They're disclosed honestly here so reviewers can pattern-match what to expect in a production deployment.

  • Drone telemetry, GPS, IMU, battery state, fleet status. Mocked. No hardware connected.
  • Mission scheduler / planner UI state. Mocked. There's nothing to dispatch.
  • Worker IDs and names. Synthetic (Faker-generated) for privacy. Real face recognition is deliberately out of scope.
  • Persona avatars (Sarah, Marcus, David, Priya, Tom). Static UX fixture.
  • Plant codes (DTP, REVC, KCAP, CAP, LAP), zone definitions, PPE rules per zone. Static configuration — these are operational metadata, not CV outputs.
  • Geofence polygons. Static SVG.
  • Integration toggles (ServiceNow, Manhattan WMS). Mock state.

Storage and reset

All detection state lives in IndexedDB (one database per app). The first-load seed comes from /seeds/initial.json.gz. The seed file is built from the build-time inference pass and committed alongside the photos. "Reset Demo" clears IndexedDB and re-seeds. Test Detection uploads run the same model in your browser and append new rows to the same IndexedDB. No data ever leaves your machine.

Pipeline

The build-time pipeline lives in scripts/detect/. To regenerate seeds:

pnpm exec tsx scripts/detect/safetyvision.ts
pnpm exec tsx scripts/detect/assetiq.ts
pnpm exec tsx scripts/detect/sentinel.ts
pnpm exec tsx scripts/detect/stockflight.ts

Compliance & Audit metadata (SafetyVision)

Every violation detail page surfaces evidence-grade metadata for OSHA / legal review:

  • Model version string — e.g. yolov9-c_quant_v1.0.onnx, persisted on each Violation record as modelKey.
  • Confidence threshold — default 0.25, configurable per zone via Settings.
  • Photo SHA-256 hash — computed at ingest, persisted as sourcePhotoHash. Hash chain across the recording stream prevents undetected tampering.
  • Bbox coordinates — raw model output (image-space [x, y, w, h]) on each record as bbox.
  • Last-trained date — published in this MODELS_BLOCKS table and on every detail page.
  • Human review flag — boolean; flips when any operator acknowledges or modifies the finding.
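The metadata listed above could be assembled at ingest roughly as follows. This is a sketch under stated assumptions: `modelKey`, `sourcePhotoHash`, and `bbox` are the field names given in the text, while `confidenceThreshold`, `humanReviewed`, and the helper names are hypothetical. It uses Node's `crypto` module, which suits a build-time script; a browser path would use Web Crypto instead.

```typescript
import { createHash } from "node:crypto";

// modelKey, sourcePhotoHash and bbox are named in the text;
// the other field names are assumptions for illustration.
interface ViolationAudit {
  modelKey: string;
  confidenceThreshold: number;
  sourcePhotoHash: string;                // hex SHA-256 of the photo bytes
  bbox: [number, number, number, number]; // image-space [x, y, w, h]
  humanReviewed: boolean;
}

// Hex-encoded SHA-256 of raw bytes.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Build the audit metadata for one finding. 0.25 is the default threshold from the text.
function buildAudit(
  photo: Uint8Array,
  bbox: [number, number, number, number],
  confidenceThreshold = 0.25,
): ViolationAudit {
  return {
    modelKey: "yolov9-c_quant_v1.0.onnx",
    confidenceThreshold,
    sourcePhotoHash: sha256Hex(photo),
    bbox,
    humanReviewed: false, // flips when an operator acknowledges or modifies the finding
  };
}

// Stand-in "photo" bytes so the hash is easy to check by hand.
const audit = buildAudit(new TextEncoder().encode("abc"), [10, 20, 100, 50]);
console.log(audit.sourcePhotoHash);
```

Hashing the bytes before any resize or re-encode is what lets a reviewer later confirm that the stored photo is the one the model saw.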

OSHA 300A scope mapping: severity Critical = days-away case (29 CFR 1904.7(b)(3)); severity High = job-transfer / restriction case (29 CFR 1904.7(b)(4)). Recordable scope subject to site safety lead review before posting.

Chain-of-Custody (Sentinel)

Sentinel's evidence pipeline is designed for subpoena response and chain-of-custody auditability:

  • Recording storage location. PoC: browser IndexedDB local to the operator's workstation. Production: Ford-tenant S3 bucket with server-side encryption (SSE-KMS) and bucket-level versioning.
  • Retention policy. Video recordings — 90 days. Incident metadata — 7 years (matches OSHA & ATF retention windows).
  • Deletion authority. Only the Security Director role (Lt. Park or designee) can delete. Every delete logs immutably to the audit trail.
  • Hash chain. Every saved recording is hashed (SHA-256) at write; subsequent reads verify the hash. Hash log is part of the Evidence Package export.
  • Subpoena workflow. Operator hits "Evidence Package" on incident detail → generates a PDF containing incident metadata, AI classification provenance, recording chain-of-custody, audit trail, and an attestation block for the Security Director's signature.
  • Access log. Every view / export / delete is logged with operator ID, timestamp, IP, and reason code. Lockable per case under the Locks pane (post-PoC).
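The hash-at-write, verify-at-read behavior described above can be illustrated with a small chained log. This is a sketch, not Sentinel's implementation: the chaining scheme (each entry hashes the previous chain hash concatenated with the recording's content hash), the all-zero genesis value, and every identifier are assumptions.

```typescript
import { createHash } from "node:crypto";

const sha256Hex = (data: string | Uint8Array): string =>
  createHash("sha256").update(data).digest("hex");

interface ChainEntry {
  recordingId: string;
  contentHash: string; // SHA-256 of the recording bytes at write time
  chainHash: string;   // SHA-256 of (previous chainHash + contentHash); assumed scheme
}

const GENESIS = "0".repeat(64); // assumed starting value for an empty log

// Append a recording to the log, linking it to the previous entry.
function appendEntry(log: ChainEntry[], recordingId: string, bytes: Uint8Array): ChainEntry[] {
  const prev = log.length > 0 ? log[log.length - 1].chainHash : GENESIS;
  const contentHash = sha256Hex(bytes);
  return [...log, { recordingId, contentHash, chainHash: sha256Hex(prev + contentHash) }];
}

// Re-read every recording and recompute both hashes.
// Returns the index of the first entry that fails, or -1 if the chain is intact.
function verifyChain(log: ChainEntry[], read: (id: string) => Uint8Array): number {
  let prev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const entry = log[i];
    const contentHash = sha256Hex(read(entry.recordingId));
    const chainHash = sha256Hex(prev + contentHash);
    if (contentHash !== entry.contentHash || chainHash !== entry.chainHash) return i;
    prev = entry.chainHash;
  }
  return -1;
}

// In-memory stand-in for recording storage.
const enc = new TextEncoder();
const store = new Map<string, Uint8Array>();
const read = (id: string): Uint8Array => store.get(id) ?? new Uint8Array();

let log: ChainEntry[] = [];
store.set("rec-1", enc.encode("clip one"));
log = appendEntry(log, "rec-1", read("rec-1"));
store.set("rec-2", enc.encode("clip two"));
log = appendEntry(log, "rec-2", read("rec-2"));

const cleanResult = verifyChain(log, read);      // -1: nothing changed
store.set("rec-2", enc.encode("clip two (edited)"));
const tamperedResult = verifyChain(log, read);   // 1: second recording no longer matches
console.log(cleanResult, tamperedResult);
```

Chaining matters because a per-file hash alone cannot reveal that an entry was removed or reordered; linking each entry to its predecessor makes those edits detectable too.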

Competitive Honesty

Where competitors beat Dolunts today, and why we picked our trade-offs:

  • Percepto. Thermal-trained CNN, mature drone fleet, outdoor depth. We're indoor-first with a luminance-heuristic thermal pseudo-image. For petchem / outdoor industrial, Percepto wins.
  • Verkada Command. Mature security VMS, 16-up live wall, subpoena export. We offer four-app breadth; our security depth is one quarter behind.
  • Gather AI. WMS-native integration with one major customer. Our integrations are mocked for the PoC; production wiring is in Phase 2.
  • Visionify. PPE-finetuned models out of the box. We use COCO-class detection with PPE inferred from photo-category metadata. A finetuned PPE ONNX is post-PoC.
  • VelocityEHS. Established OSHA-form pipeline with a Ford contract. Our OSHA 300/300A/301 column layouts conform to the federal template (Phase 17 R4). Display fidelity is now competitive; integration depth is not.

The Dolunts thesis is breadth + transparency. Each competitor wins one domain; a platform that competes on breadth across domains has to be honest about where it is behind.

Model License Commitments

The YOLOv9-c base model used today is GPL-3.0 licensed. Production deployment requires either Ford accepting GPL-3.0 implications or a permissively-licensed swap. The swap is contractually committed to land within 60 days of the contract effective date. Candidate shortlist:

  • RT-DETR (Apache 2.0). Transformer-based; comparable mAP on COCO; decoder format differs (requires a reshape adapter).
  • YOLO-NAS (Apache 2.0 for the SuperGradients code; the pretrained weights carry a separate license that must be verified before commercial use). Deci's NAS-found architecture; competitive accuracy; well-documented ONNX export.
  • Detic (Apache 2.0, from Meta). Open-vocabulary; useful if the PPE finetune is in scope post-PoC.

YOLOv8 / YOLOv9 forks are AGPL-licensed, which is also restrictive, so they are explicitly excluded. The swap procedure is documented in audit/phase17/MODEL_SWAP_PLAN.md.

Detection geometry & semantic correctness (Phase 18)

Phase 18 closes three classes of detection defect that earlier builds exposed. The platform now satisfies, app-wide, the contract that every bounding box on screen was emitted by the model and every PPE label means what it says.

Bounding box geometry (W1)

Bounding boxes are rendered in normalized [0..1] coordinates over the displayed image. The decoder maps model output (640×640 stretched input space) back to that frame by dividing model-emitted pixel coordinates by the model input dimensions. The earlier build divided by the original image dimensions, which squeezed every box into the top-left quadrant whenever the source photo was larger than 640×640. A debug overlay is available at ?debug=1 showing the conversion math.
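The conversion described above can be sketched as follows. The type and function names are hypothetical; the only facts taken from the text are the 640×640 stretched input and the rule of dividing by the model input dimensions rather than the original image dimensions.

```typescript
// Model input size from the text: photos are stretched to 640x640 before inference.
const MODEL_INPUT = { width: 640, height: 640 };

interface Box { x: number; y: number; w: number; h: number }

// Correct mapping: model-space pixels divided by MODEL INPUT dimensions -> normalized [0..1].
// Because the input is stretched (not letterboxed), the normalized box applies
// directly to the displayed image at any size.
function toNormalized(box: Box, dims = MODEL_INPUT): Box {
  return {
    x: box.x / dims.width,
    y: box.y / dims.height,
    w: box.w / dims.width,
    h: box.h / dims.height,
  };
}

// Scale a normalized box to the size at which the image is displayed.
function toDisplay(norm: Box, displayWidth: number, displayHeight: number): Box {
  return {
    x: norm.x * displayWidth,
    y: norm.y * displayHeight,
    w: norm.w * displayWidth,
    h: norm.h * displayHeight,
  };
}

// A box centered horizontally in model space.
const modelBox: Box = { x: 320, y: 160, w: 160, h: 320 };

const correct = toNormalized(modelBox);                             // x = 0.5, y = 0.25
const buggy = toNormalized(modelBox, { width: 1920, height: 1080 }); // earlier behavior
const onScreen = toDisplay(correct, 960, 540);

console.log(correct, buggy, onScreen);
```

With a 1920×1080 source photo, the earlier divisor turns x = 320 into about 0.17 instead of 0.5, which is the shift toward the top-left corner that the fix removes.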

PPE inference policy (W2 — Path A, 5-model ensemble)

SafetyVision's /test-detection route runs a 5-model PPE-finetuned ONNX ensemble locally in your browser:

  • Hexmon/vyra-yolo-ppe-detection (14 classes)
  • ayushgupta7777/safetyvision-yolov8 (13 classes, includes Goggles)
  • keremberke/yolov8m-protective-equipment-detection (10 classes)
  • Tanishjain9/yolov8n-ppe-6classes
  • leeyunjai/yolo11-ppe

Each model runs sequentially against the same frame; the union of their detections is then deduplicated by class label and spatial IoU. Any single model can surface a PPE item the others miss: the ensemble is built specifically to lift recall on PPE classes (Safety Glasses especially) where individual fine-tunes drift on real-world eyewear styles. Confidence floors are tuned per model during calibration; the merged output is what the operator sees. NO-X classes are the explicit "the model looked and did not find this PPE on a visible worker" channel; positive classes are direct detections.

Live Feed routes remain on the honest "Awaiting detection" path until the ensemble is wired in there (Phase 18.5).
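The merge step described above (union, then deduplication by class label and IoU) could look like the sketch below. It assumes that labels have already been mapped to one shared vocabulary across models, that the highest-confidence duplicate is the one kept, and that 0.5 is the IoU cutoff; none of these details are stated in the text, and all names are hypothetical.

```typescript
type Box = [number, number, number, number]; // normalized [x, y, w, h]

interface Detection {
  label: string;      // assumed to be normalized to a shared vocabulary beforehand
  confidence: number;
  box: Box;
  model: string;      // which ensemble member produced it
}

// Intersection-over-union of two [x, y, w, h] boxes.
function iou(a: Box, b: Box): number {
  const [ax, ay, aw, ah] = a;
  const [bx, by, bw, bh] = b;
  const ix = Math.max(0, Math.min(ax + aw, bx + bw) - Math.max(ax, bx));
  const iy = Math.max(0, Math.min(ay + ah, by + bh) - Math.max(ay, by));
  const inter = ix * iy;
  const union = aw * ah + bw * bh - inter;
  return union > 0 ? inter / union : 0;
}

// Greedy dedup: visit detections from most to least confident and keep one
// unless an already-kept detection has the same label and overlaps enough.
function mergeDetections(perModel: Detection[][], iouThreshold = 0.5): Detection[] {
  const all = ([] as Detection[]).concat(...perModel);
  all.sort((p, q) => q.confidence - p.confidence);
  const kept: Detection[] = [];
  for (const d of all) {
    const duplicate = kept.some(
      (k) => k.label === d.label && iou(k.box, d.box) >= iouThreshold,
    );
    if (!duplicate) kept.push(d);
  }
  return kept;
}

// Two models agree on a hardhat; only model B sees the goggles.
const modelA: Detection[] = [
  { label: "Hardhat", confidence: 0.9, box: [0.1, 0.1, 0.2, 0.2], model: "A" },
];
const modelB: Detection[] = [
  { label: "Hardhat", confidence: 0.8, box: [0.11, 0.1, 0.2, 0.2], model: "B" },
  { label: "Goggles", confidence: 0.6, box: [0.12, 0.15, 0.05, 0.03], model: "B" },
];

const merged = mergeDetections([modelA, modelB]);
console.log(merged.map((d) => `${d.label} (${d.model})`));
```

Matching on label as well as overlap is what keeps the goggles box from being swallowed by the hardhat box that surrounds it, which is how a single model's extra find survives into the merged output.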

Detection consistency (W3)

The Live Feed overlay and the Live Detections sidebar read from a single per-frame detection store (shared/src/detection/state-react.tsx). There are no timer-injected synthetic events. When no inference has run for a frame, the overlay omits the box and the sidebar shows an "Awaiting detection" entry referring to the same frame.
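A framework-free sketch of that single-source-of-truth idea is shown below. It is not the code in shared/src/detection/state-react.tsx; the class, function names, and entry formatting are hypothetical stand-ins meant only to show both views reading the same per-frame entry.

```typescript
interface Detection { label: string; confidence: number }

// Hypothetical stand-in for the shared per-frame store.
class FrameDetectionStore {
  private byFrame = new Map<string, Detection[]>();

  // Called once inference has finished for a frame.
  publish(frameId: string, detections: Detection[]): void {
    this.byFrame.set(frameId, detections);
  }

  // undefined means "no inference has run for this frame yet".
  get(frameId: string): Detection[] | undefined {
    return this.byFrame.get(frameId);
  }
}

// Overlay view: draws nothing when no inference has run.
function overlayLabels(store: FrameDetectionStore, frameId: string): string[] {
  return (store.get(frameId) ?? []).map((d) => d.label);
}

// Sidebar view: same data, with an explicit placeholder for the same frame.
function sidebarEntries(store: FrameDetectionStore, frameId: string): string[] {
  const detections = store.get(frameId);
  if (detections === undefined) return [`Awaiting detection (frame ${frameId})`];
  return detections.map((d) => `${d.label} ${(d.confidence * 100).toFixed(0)}%`);
}

const frameStore = new FrameDetectionStore();
const overlayBefore = overlayLabels(frameStore, "f1");
const sidebarBefore = sidebarEntries(frameStore, "f1");

frameStore.publish("f1", [{ label: "Hardhat", confidence: 0.91 }]);
const overlayAfter = overlayLabels(frameStore, "f1");
const sidebarAfter = sidebarEntries(frameStore, "f1");

console.log(sidebarBefore, overlayAfter, sidebarAfter);
```

Because neither view holds its own copy of the detections, the overlay and sidebar cannot disagree about a frame.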

Live Feed throttling

The Live Feed cycles photos on a 4-second interval per tile. PPE inference, when wired (Phase 19 model swap), will run once per cycle. Multi-tile views (Sentinel Live Wall) run independent cycles per tile; each tile publishes its frame to the same store.

Fixture-based geometry verification

Bounding-box geometry is verified against a fixture set of industrial photos with hand-annotated subject regions. Each release runs pnpm phase18:check:bbox-geometry; the gate asserts ≥0.5 IoU between rendered boxes and ground truth.
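A gate of this kind could be written roughly as below. This is an illustrative sketch rather than the script behind pnpm phase18:check:bbox-geometry: the fixture shape, photo names, and helper names are assumptions, and only the 0.5 IoU floor comes from the text.

```typescript
interface Box { x: number; y: number; w: number; h: number } // normalized [0..1]

// Hypothetical fixture shape: one hand-annotated region and the box the app rendered.
interface Fixture { photo: string; truth: Box; rendered: Box }

// Intersection-over-union of two boxes.
function iou(a: Box, b: Box): number {
  const ix = Math.max(0, Math.min(a.x + a.w, b.x + b.w) - Math.max(a.x, b.x));
  const iy = Math.max(0, Math.min(a.y + a.h, b.y + b.h) - Math.max(a.y, b.y));
  const inter = ix * iy;
  const union = a.w * a.h + b.w * b.h - inter;
  return union > 0 ? inter / union : 0;
}

// Names of fixtures whose rendered box falls below the IoU floor.
function failingFixtures(fixtures: Fixture[], minIou = 0.5): string[] {
  return fixtures.filter((f) => iou(f.rendered, f.truth) < minIou).map((f) => f.photo);
}

const fixtures: Fixture[] = [
  {
    photo: "pump-01.jpg", // slightly offset box: IoU is about 0.78, passes
    truth: { x: 0.2, y: 0.2, w: 0.4, h: 0.4 },
    rendered: { x: 0.25, y: 0.2, w: 0.4, h: 0.4 },
  },
  {
    photo: "valve-02.jpg", // box squeezed toward the top-left corner: no overlap, fails
    truth: { x: 0.2, y: 0.2, w: 0.4, h: 0.4 },
    rendered: { x: 0.05, y: 0.05, w: 0.1, h: 0.1 },
  },
];

const failing = failingFixtures(fixtures);
console.log(failing);
// A release gate would exit non-zero here when failing.length > 0.
```

The second fixture mimics the geometry defect described earlier in this section, which is the kind of regression the gate is meant to catch.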

Out-of-scope models (honest)

Predictive failure modeling. Survival modeling (e.g. P(failure within 14 days) per asset) requires a labeled failure history that we have not collected for the PoC. The Phase 16 predictive-failure banner ("Pump P-204 · 67% in next 14 days") has been removed from this build per founder decision. A published model card with feature inputs, calibration curve, and training data window is post-PoC scope. The plan: stand up a 7-day rolling anomaly-density model with a published threshold table in Phase 2 of the engagement.