Status: Proposed
Date: 2026-02-18
Author: System Architect (AgentDB v3)
Supersedes: None
Related: ADR-003 (RVF Format), ADR-006 (Unified Self-Learning RVF), ADR-007 (Full Capability Integration), ADR-008 (Chat UI RVF)
Package: @agentdb/causal-atlas
ADR-008 demonstrated that a single RVF artifact can embed a minimal Linux userspace, an LLM inference engine, and a self-learning pipeline into one portable file. This ADR extends that pattern to scientific computing: a portable RVF runtime that ingests public astronomy and physics datasets, builds a multi-scale interaction graph, maintains a dynamic coherence field, and emits replayable witness logs for every derived claim.
The design draws engineering inspiration from causal sets, loop-gravity-style discretization, and holographic boundary encoding, but it is implemented as a practical data system, not a physics simulator. The holographic principle manifests as a concrete design choice: primarily store and index boundaries, and treat interior state as reconstructable from boundary witnesses and retained archetypes.
| Component | Package | Relevant APIs |
|---|---|---|
| RVF segments | `@ruvector/rvf`, `@ruvector/rvf-node` | `embedKernel`, `extractKernel`, `embedEbpf`, `segments`, `derive` |
| HNSW indexing | `@ruvector/rvf-node` | `ingestBatch`, `query`, `compact`, HNSW with metadata filters |
| Witness chains | `@ruvector/rvf-node`, `RvfSolver` | `verifyWitness`, SHAKE-256 witness chains, signed root hash |
| Graph transactions | `NativeAccelerator` | `graphTransaction`, `graphBatchInsert`, Cypher queries |
| SIMD embeddings | `@ruvector/ruvllm` | 768-dim SIMD embed, cosine/dot/L2, HNSW memory search |
| SONA learning | `SonaLearningBackend` | Micro-LoRA, trajectory recording, EWC++ |
| Federated coordination | `FederatedSessionManager` | Cross-agent trajectories, warm-start patterns |
| Contrastive training | `ContrastiveTrainer` | InfoNCE, hard negative mining, 3-stage curriculum |
| Adaptive index | `AdaptiveIndexTuner` | 5-tier compression, Matryoshka truncation, health monitoring |
| Kernel embedding | `KernelBuilder` (ADR-008) | Minimal Linux boot from KERNEL_SEG + INITRD_SEG |
| Lazy model download | `ChatInference` (ADR-008) | Deferred GGUF load on first inference call |
- Domain adapters for astronomy data (light curves, spectra, galaxy catalogs)
- Compressed causal atlas with partial-order event graph
- Coherence field index with cut pressure and partition entropy
- Multi-scale interaction memory with budget-controlled tiered retention
- Boundary evolution tracker with holographic-style boundary-first storage
- Planet detection pipeline (Kepler/TESS transit search)
- Life candidate scoring pipeline (spectral disequilibrium signatures)
- Progressive data download from public sources on first activation
A single RVF artifact that boots a minimal Linux userspace, progressively downloads and ingests public astronomy and physics datasets on first activation (lazy, like ADR-008's GGUF model download), builds a multi-scale interaction graph, maintains a dynamic coherence field, and emits replayable witness logs for every derived claim.
| # | Output | Description |
|---|---|---|
| 1 | Atlas snapshots | Queryable causal partial order plus embeddings |
| 2 | Coherence field | Partition tree plus cut pressure signals over time |
| 3 | Multi-scale memory | Delta-encoded interaction history at resolutions from seconds to months |
| 4 | Boundary tracker | Boundary changes, drift, and anomaly alerts |
| 5 | Planet candidates | Ranked list with traceable evidence |
| 6 | Life candidates | Ranked list of spectral disequilibrium signatures with traceable evidence |
- Proving quantum gravity
- Replacing astrophysical pipelines end-to-end
- Claiming life detection without conventional follow-up observation
All data is progressively downloaded from public archives on first activation. The RVF artifact ships with download manifests and integrity hashes, not the raw data itself.
| Source | Access | Reference |
|---|---|---|
| Kepler light curves and pixel files | MAST bulk and portal | archive.stsci.edu/kepler |
| TESS light curves and full-frame images | MAST portal | archive.stsci.edu/tess |
| Source | Access | Reference |
|---|---|---|
| JWST exoplanet spectra | exo.MAST and MAST holdings | archive.stsci.edu |
| NASA Exoplanet Archive parameters | Cross-linking to spectra and mission products | exoplanetarchive.ipac.caltech.edu |
| Source | Access | Reference |
|---|---|---|
| SDSS public catalogs (spectra, redshifts) | DR17 | sdss4.org/dr17 |
Following the lazy-download pattern established in ADR-008 for GGUF models:
- Manifest-first: RVF ships with `MANIFEST_SEG` containing download URLs, SHA-256 hashes, expected sizes, and priority tiers
- Tier 0 (boot): Minimal curated dataset (~50 MB) for offline demo: 100 Kepler targets with known confirmed planets, embedded in VEC_SEG
- Tier 1 (first run): Download 1,000 Kepler targets on first pipeline activation. Background download, progress reported via CLI/HTTP
- Tier 2 (expansion): Full Kepler/TESS catalog download on explicit `rvf ingest --expand` command
- Tier 3 (spectra): JWST and archive spectra downloaded when life candidate pipeline is first activated
- Seal-on-complete: After download, data is ingested into VEC_SEG and INDEX_SEG, a new witness root is committed, and the RVF is sealed into a reproducible snapshot
Download state machine:
[boot] ──first-inference──> [downloading-tier-1]
  │                                  │
  │ (offline demo works)             │ (progress: 0-100%)
  │                                  │
  ▼                                  ▼
[tier-0-only]                 [tier-1-ready]
                                     │
                            rvf ingest --expand
                                     │
                                     ▼
                              [tier-2-ready]
                                     │
                          life pipeline activated
                                     │
                                     ▼
                              [tier-3-ready] ──seal──> [sealed-snapshot]
Each tier download:
- Resumes from last byte on interruption (HTTP Range headers)
- Validates SHA-256 after download
- Commits a witness record for the download event
- Can be skipped with the `--offline` flag (uses whatever is already present)
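The tier transitions and the resume rule above can be sketched as a pure transition function plus a helper that builds the HTTP Range header. This is a minimal sketch, not the runtime's implementation: the state names are read off the diagram, while the event names (`offline`, `tier-1-complete`) and the helpers `nextTierState` and `resumeRangeHeader` are hypothetical.

```typescript
// Hypothetical sketch of the download state machine; names are illustrative only.
type TierState =
  | 'boot' | 'tier-0-only' | 'downloading-tier-1' | 'tier-1-ready'
  | 'tier-2-ready' | 'tier-3-ready' | 'sealed-snapshot';

// Event names are assumptions; the diagram only labels some of the arrows.
type TierEvent =
  | 'offline' | 'first-inference' | 'tier-1-complete'
  | 'ingest-expand' | 'life-pipeline-activated' | 'seal';

// Allowed transitions as drawn in the diagram.
const TIER_TRANSITIONS: Record<TierState, Partial<Record<TierEvent, TierState>>> = {
  'boot': { 'offline': 'tier-0-only', 'first-inference': 'downloading-tier-1' },
  'tier-0-only': {},
  'downloading-tier-1': { 'tier-1-complete': 'tier-1-ready' },
  'tier-1-ready': { 'ingest-expand': 'tier-2-ready' },
  'tier-2-ready': { 'life-pipeline-activated': 'tier-3-ready' },
  'tier-3-ready': { 'seal': 'sealed-snapshot' },
  'sealed-snapshot': {},
};

// Events that are not valid in the current state leave the state unchanged.
function nextTierState(state: TierState, event: TierEvent): TierState {
  const target = TIER_TRANSITIONS[state][event];
  return target === undefined ? state : target;
}

// Resume from the last byte already on disk. Returns null when no Range header
// is needed (nothing downloaded yet, or the file is already complete).
function resumeRangeHeader(bytesOnDisk: number, expectedSize: number): string | null {
  if (bytesOnDisk <= 0 || bytesOnDisk >= expectedSize) return null;
  return 'bytes=' + bytesOnDisk + '-';
}
```

A caller would send the returned value as the `Range` request header, then validate the SHA-256 from the manifest once the full file is present.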
Extends the ADR-003 segment model with domain-specific segments.
| # | Segment | Contents |
|---|---|---|
| 1 | `MANIFEST_SEG` | Segment table, hashes, policy, budgets, version gates, download manifests |
| 2 | `KERNEL_SEG` | Minimal Linux kernel image for portable boot (reuse ADR-008) |
| 3 | `INITRD_SEG` | Minimal userspace: busybox, RuVector binaries, data ingest tools, query server |
| 4 | `EBPF_SEG` | Socket allow-list and syscall reduction. Default: local loopback + explicit download ports only |
| 5 | `VEC_SEG` | Embedding vectors: light-curve windows, spectrum windows, graph node descriptors, partition boundary descriptors |
| 6 | `INDEX_SEG` | HNSW unified attention index for vectors and boundary descriptors |
| 7 | `GRAPH_SEG` | Dynamic interaction graph: nodes, edges, timestamps, authority, provenance |
| 8 | `DELTA_SEG` | Append-only change log of graph updates and field updates |
| 9 | `WITNESS_SEG` | Deterministic witness chain: canonical serialization, signed root hash progression |
| 10 | `POLICY_SEG` | Data provenance requirements, candidate publishing thresholds, deny rules, confidence floors |
| 11 | `DASHBOARD_SEG` | Vite-bundled Three.js visualization app; static assets served by runtime HTTP server |
interface Event {
  id: string;
  t_start: number;              // epoch seconds
  t_end: number;
  domain: 'kepler' | 'tess' | 'jwst' | 'sdss' | 'derived';
  payload_hash: string;         // SHA-256 of raw data window
  provenance: Provenance;
}

interface Observation {
  id: string;
  instrument: string;           // 'kepler-lc' | 'tess-ffi' | 'jwst-nirspec' | ...
  target_id: string;            // e.g., KIC or TIC identifier
  data_pointer: string;         // segment offset into VEC_SEG
  calibration_version: string;
  provenance: Provenance;
}

interface InteractionEdge {
  src_event_id: string;
  dst_event_id: string;
  type: 'causal' | 'periodicity' | 'shape_similarity' | 'co_occurrence' | 'spatial';
  weight: number;
  lag: number;                  // temporal lag in seconds
  confidence: number;
  provenance: Provenance;
}

interface Boundary {
  boundary_id: string;
  partition_left_set_hash: string;
  partition_right_set_hash: string;
  cut_weight: number;
  cut_witness: string;          // witness chain reference
  stability_score: number;
}

interface Candidate {
  candidate_id: string;
  category: 'planet' | 'life';
  evidence_pointers: string[];  // event and edge IDs
  score: number;
  uncertainty: number;
  publishable: boolean;         // based on POLICY_SEG rules
  witness_trace: string;        // WITNESS_SEG reference for replay
}

interface Provenance {
  source: string;               // 'mast-kepler' | 'mast-tess' | 'mast-jwst' | ...
  download_witness: string;     // witness chain entry for the download
  transform_chain: string[];    // ordered list of transform IDs applied
  timestamp: string;            // ISO-8601
}

Input: flux time series + cadence metadata (Kepler/TESS FITS)
Output: Event nodes for windows
InteractionEdges for periodicity hints and shape similarity
Candidate nodes for dip detections
Input: wavelength, flux, error arrays (JWST NIRSpec, etc.)
Output: Event nodes for band windows
InteractionEdges for molecule feature co-occurrence
Disequilibrium score components
Input: galaxy positions and redshifts (SDSS)
Output: Graph of spatial adjacency and filament membership
Definition: A partial order of events plus minimal sufficient descriptors to reproduce derived edges.
Construction:
1. Windowing: Light curves into overlapping windows at multiple scales
   - Scales: 2 hours, 12 hours, 3 days, 27 days
2. Feature extraction: Robust features per window
   - Flux derivative statistics
   - Autocorrelation peaks
   - Wavelet energy bands
   - Transit-shaped matched filter response
3. Embedding: RuVector SIMD embed per window, stored in VEC_SEG
4. Causal edges: Add edge when window A precedes window B and improves predictability of B (conditional mutual information proxy or prediction gain, subject to POLICY_SEG constraints)
   - Edge weight: prediction gain magnitude
   - Provenance: exact windows, transform IDs, threshold used
5. Atlas compression
   - Keep only top-k causal parents per node
   - Retain stable boundary witnesses
   - Delta-encode updates into DELTA_SEG
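Two of the construction steps above, multi-scale windowing and top-k parent pruning, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the 50% default overlap is not specified by this ADR, and the names `AtlasWindow`, `CausalEdge`, `makeWindows`, and `topKParents` are hypothetical.

```typescript
// Hypothetical sketch: multi-scale windowing and top-k causal parent pruning.
interface AtlasWindow { tStart: number; tEnd: number; scaleSec: number; }
interface CausalEdge { src: string; dst: string; weight: number; }

// Window scales from the text: 2 hours, 12 hours, 3 days, 27 days (in seconds).
const WINDOW_SCALES_SEC = [2 * 3600, 12 * 3600, 3 * 86400, 27 * 86400];

// Slide a window of length scaleSec across [t0, t1].
// The 50% default overlap is an assumption; the text only says "overlapping".
function makeWindows(t0: number, t1: number, scaleSec: number, overlap = 0.5): AtlasWindow[] {
  const step = scaleSec * (1 - overlap);
  if (!(step > 0)) throw new Error('overlap must be below 1');
  const out: AtlasWindow[] = [];
  for (let t = t0; t + scaleSec <= t1; t += step) {
    out.push({ tStart: t, tEnd: t + scaleSec, scaleSec });
  }
  return out;
}

// Atlas compression: keep only the k highest-weight causal parents of each node.
function topKParents(edges: CausalEdge[], k: number): CausalEdge[] {
  const byDst: Record<string, CausalEdge[]> = {};
  for (const e of edges) {
    (byDst[e.dst] = byDst[e.dst] || []).push(e);
  }
  const kept: CausalEdge[] = [];
  for (const dst of Object.keys(byDst)) {
    const group = byDst[dst] || [];
    const sorted = group.slice().sort((a, b) => b.weight - a.weight);
    for (const e of sorted.slice(0, k)) kept.push(e);
  }
  return kept;
}
```

Calling `makeWindows` once per entry in `WINDOW_SCALES_SEC` yields the full multi-scale window set for a light curve.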
Output API:
| Endpoint | Returns |
|---|---|
| `atlas.query(event_id)` | Parents, children, plus provenance |
| `atlas.trace(candidate_id)` | Minimal causal chain for a candidate |
Definition: A field over the atlas graph that assigns coherence pressure and cut stability over time.
Signals:
| Signal | Description |
|---|---|
| Cut pressure | Minimum cut values over selected subgraphs |
| Partition entropy | Distribution of cluster sizes and churn rate |
| Disagreement | Cross-detector disagreement rate |
| Drift | Embedding distribution shift in sliding window |
Algorithm:
1. Maintain a partition tree. Update with dynamic min-cut on incremental graph changes
2. For each update epoch:
   - Compute cut witnesses for top boundaries
   - Emit boundary events into GRAPH_SEG
   - Append witness record into WITNESS_SEG
3. Index boundaries via descriptor vector:
   - Cut value, partition sizes, local graph curvature proxy, recent churn
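Two of the descriptor inputs above, the cut value and a partition-size entropy, can be computed as follows. This sketch evaluates the weight of a given bipartition; it does not implement the dynamic min-cut itself. The names `WeightedEdge`, `cutWeight`, and `partitionEntropy` are hypothetical, and using Shannon entropy in bits is an assumption about how "partition entropy" is measured.

```typescript
// Hypothetical sketch of two coherence signals; this is not a min-cut solver.
interface WeightedEdge { src: string; dst: string; weight: number; }

// Cut weight of a bipartition: total weight of edges with exactly one endpoint in `left`.
function cutWeight(edges: WeightedEdge[], left: string[]): number {
  const inLeft: Record<string, boolean> = {};
  for (const id of left) inLeft[id] = true;
  let total = 0;
  for (const e of edges) {
    const srcLeft = inLeft[e.src] === true;
    const dstLeft = inLeft[e.dst] === true;
    if (srcLeft !== dstLeft) total += e.weight;
  }
  return total;
}

// Shannon entropy (in bits) of the cluster-size distribution.
// Returns 0 for a single cluster and grows as sizes become more evenly spread.
function partitionEntropy(clusterSizes: number[]): number {
  let n = 0;
  for (const s of clusterSizes) n += s;
  if (n === 0) return 0;
  let h = 0;
  for (const s of clusterSizes) {
    if (s > 0) {
      const p = s / n;
      h -= p * (Math.log(p) / Math.LN2);
    }
  }
  return h;
}
```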
Query API:
| Endpoint | Returns |
|---|---|
| `coherence.get(target_id, epoch)` | Field values for target at epoch |
| `boundary.nearest(descriptor)` | Similar historical boundary states via INDEX_SEG |
Definition: A memory that retains interactions at multiple time resolutions with strict budget control.
Three tiers:
| Tier | Resolution | Content |
|---|---|---|
| S | Seconds to minutes | High-fidelity deltas |
| M | Hours to days | Aggregated deltas |
| L | Weeks to months | Boundary summaries and archetypes |
Retention rules:
- Preserve events that are boundary-critical
- Preserve events that are candidate evidence
- Compress everything else via archetype clustering in INDEX_SEG
Mechanism:
- DELTA_SEG is append-only
- Periodic compaction produces a new RVF root with a witness proof of preservation rules applied
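The tiering and retention rules above reduce to two small decisions per delta. The sketch below is illustrative only: the tier thresholds (one hour, one week) are assumptions chosen to match the resolution ranges in the table, and `InteractionDelta`, `tierForSpan`, `retentionAction`, and `planCompaction` are hypothetical names.

```typescript
// Hypothetical sketch of tier assignment and retention during compaction.
type MemoryTier = 'S' | 'M' | 'L';

interface InteractionDelta {
  id: string;
  spanSec: number;             // time span the delta covers
  boundaryCritical: boolean;   // needed to reproduce a boundary witness
  candidateEvidence: boolean;  // referenced by a candidate's evidence pointers
}

// Thresholds are assumptions: S below one hour, M below one week, L otherwise.
function tierForSpan(spanSec: number): MemoryTier {
  if (spanSec < 3600) return 'S';
  if (spanSec < 7 * 86400) return 'M';
  return 'L';
}

// Retention rules from the text: preserve boundary-critical events and
// candidate evidence, compress everything else via archetype clustering.
function retentionAction(d: InteractionDelta): 'preserve' | 'compress' {
  return d.boundaryCritical || d.candidateEvidence ? 'preserve' : 'compress';
}

// Split a batch of deltas into the two groups a compaction pass would handle.
function planCompaction(deltas: InteractionDelta[]): { preserve: string[]; compress: string[] } {
  const plan = { preserve: [] as string[], compress: [] as string[] };
  for (const d of deltas) plan[retentionAction(d)].push(d.id);
  return plan;
}
```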
Definition: A tracker that treats boundaries as primary objects that evolve over time.
This is where the holographic flavor is implemented. You primarily store and index boundaries, and treat interior state as reconstructable from boundary witnesses and retained archetypes.
Output API:
| Endpoint | Returns |
|---|---|
| `boundary.timeline(target_id)` | Boundary evolution over time |
| `boundary.alerts` | Alerts when: cut pressure spikes, boundary identity flips, disagreement exceeds threshold, drift persists beyond policy |
Input: Kepler or TESS light curves from MAST (progressively downloaded)
- Normalize flux
- Remove obvious systematics (detrending)
- Segment into windows and store as Event nodes
- Matched filter bank for transit-like dips
- Period search on candidate dip times (BLS or similar)
- Create Candidate node per period hypothesis
Candidate must pass all gates:
| Gate | Requirement |
|---|---|
| Multi-scale stability | Stable across multiple window scales |
| Boundary consistency | Consistent boundary signature around transit times |
| Low drift | Drift below threshold across adjacent windows |
Score components:
| Component | Description |
|---|---|
| SNR-like strength | Signal-to-noise of transit dip |
| Shape consistency | Cross-transit shape agreement |
| Period stability | Variance of period estimates |
| Coherence stability | Coherence field stability around candidate |
Emit: Candidate with evidence pointers + witness trace listing exact windows, transforms, and thresholds used.
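As an illustration of the flux normalization step and the "SNR-like strength" score component, the sketch below uses median normalization and a simple box-model estimate: transit depth divided by out-of-transit scatter, scaled by the square root of the number of in-transit points. This simplified formula is an assumption for illustration, not the pipeline's scoring rule, and `medianOf`, `normalizeFlux`, and `transitSnr` are hypothetical names.

```typescript
// Hypothetical sketch: median normalization and a simple box-model transit SNR.
function medianOf(values: number[]): number {
  if (values.length === 0) throw new Error('empty input');
  const s = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Divide by the median so the out-of-transit baseline sits near 1.0.
function normalizeFlux(flux: number[]): number[] {
  const m = medianOf(flux);
  return flux.map((f) => f / m);
}

// SNR estimate: (baseline mean - in-transit mean) / baseline scatter * sqrt(N in transit).
function transitSnr(flux: number[], inTransit: boolean[]): number {
  const inside: number[] = [];
  const outside: number[] = [];
  flux.forEach((f, i) => { (inTransit[i] ? inside : outside).push(f); });
  if (inside.length === 0 || outside.length < 2) return 0;
  const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;
  const mu = mean(outside);
  const sigma = Math.sqrt(mean(outside.map((f) => (f - mu) * (f - mu))));
  if (sigma === 0) return 0; // no noise estimate available, so no usable SNR
  return ((mu - mean(inside)) / sigma) * Math.sqrt(inside.length);
}
```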
Life detection here means pre-screening for non-equilibrium atmospheric chemistry signatures, not proof.
Input: Published or mission spectra tied to targets via MAST and NASA Exoplanet Archive (progressively downloaded on first pipeline activation)
- Normalize and denoise within instrument error model
- Window spectra by wavelength bands
- Create band Event nodes
- Identify absorption features and confidence bands
- Encode presence vectors for key molecule families (H2O, CO2, CH4, O3, NH3, etc.)
- Build InteractionEdges between features that co-occur in physically meaningful patterns
Core concept: Life-like systems maintain chemical ratios that resist thermodynamic relaxation.
Implementation as graph scoring:
- Build a reaction plausibility graph (prior rule set in POLICY_SEG)
- Compute inconsistency score between observed co-occurrences and expected equilibrium patterns
- Track stability of that score across epochs and observation sets
Score components:
| Component | Description |
|---|---|
| Persistent multi-molecule imbalance | Proxy for non-equilibrium chemistry |
| Feature repeatability | Agreement across instruments or visits |
| Contamination risk penalty | Instrument artifact and stellar contamination |
| Stellar activity confound penalty | Host star variability coupling |
Output: Life candidate list with explicit uncertainty + required follow-up observations list generated by POLICY_SEG rules.
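The graph scoring and penalties above could be combined along the following lines. Everything specific here is an assumption for illustration: the rule pairs stand in for the prior rule set in POLICY_SEG, the multiplicative combination of components is one possible choice, and `LifeScoreInput`, `imbalanceFraction`, and `lifeScore` are hypothetical names.

```typescript
// Hypothetical sketch of a disequilibrium score with confound penalties.
interface LifeScoreInput {
  present: string[];                       // detected molecule families
  disequilibriumPairs: [string, string][]; // stand-in for the POLICY_SEG rule set
  repeatability: number;                   // 0..1, agreement across instruments or visits
  contaminationRisk: number;               // 0..1, instrument and stellar contamination
  stellarActivityRisk: number;             // 0..1, coupling to host star variability
}

// Fraction of rule pairs whose two molecules are both observed.
function imbalanceFraction(present: string[], pairs: [string, string][]): number {
  if (pairs.length === 0) return 0;
  const seen: Record<string, boolean> = {};
  for (const m of present) seen[m] = true;
  let hits = 0;
  for (const pair of pairs) {
    if (seen[pair[0]] === true && seen[pair[1]] === true) hits += 1;
  }
  return hits / pairs.length;
}

// Multiplicative combination (an assumption): any strong confound pulls the score toward 0.
function lifeScore(input: LifeScoreInput): number {
  const imbalance = imbalanceFraction(input.present, input.disequilibriumPairs);
  return imbalance
    * input.repeatability
    * (1 - input.contaminationRisk)
    * (1 - input.stellarActivityRisk);
}
```

Tracking this value across epochs and observation sets, as the third scoring step requires, would be layered on top of it.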
- RVF boots minimal Linux from KERNEL_SEG and INITRD_SEG (reuse ADR-008 `KernelBuilder`)
- Starts `rvf-runtime` daemon exposing local HTTP and CLI
- On first inference/query, progressively downloads required data tier
CLI:
rvf run artifact.rvf # boot the runtime
rvf query planet list # ranked planet candidates
rvf query life list # ranked life candidates
rvf trace <candidate_id> # full witness trace for any candidate
rvf ingest --expand # download tier-2 full catalog
rvf status # download progress, segment sizes, witness count
HTTP:
GET / # Three.js dashboard (served from DASHBOARD_SEG)
GET /assets/* # Dashboard static assets
GET /api/atlas/query?event_id=... # causal parents/children
GET /api/atlas/trace?candidate_id=... # minimal causal chain
GET /api/coherence?target_id=...&epoch= # field values
GET /api/boundary/timeline?target_id=...
GET /api/boundary/alerts
GET /api/candidates/planet # ranked planet list
GET /api/candidates/life # ranked life list
GET /api/candidates/:id/trace # witness trace
GET /api/status # system health + download progress
GET /api/memory/tiers # tier S/M/L utilization
WS /ws/live # real-time boundary alerts, pipeline progress, candidate updates
- Fixed seeds for all stochastic operations
- Canonical serialization of every intermediate artifact
- Witness chain commits after each epoch
- Two-machine reproducibility: identical RVF root hash for identical input
- Network off by default
- If enabled, eBPF allow-list: MAST/archive download ports + local loopback only
- No remote writes without explicit policy toggle in POLICY_SEG
- Downloaded data verified against MANIFEST_SEG hashes before ingestion
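The canonical serialization and per-epoch witness commits listed above can be sketched as a key-sorted JSON encoder and an append-only hash chain. The hash function is injected so the sketch does not assume a particular SHAKE-256 binding; `canonicalize`, `commitWitness`, `WitnessEntry`, and the chained input layout are hypothetical, not the WITNESS_SEG wire format.

```typescript
// Hypothetical sketch: canonical JSON plus a hash-chained witness log.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Canonical form: object keys sorted, no whitespace, so equal values
// always serialize to the same string regardless of key insertion order.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) {
    return '[' + value.map((v) => canonicalize(v)).join(',') + ']';
  }
  const obj: { [key: string]: Json } = value;
  const keys = Object.keys(obj).sort();
  return '{' + keys.map((k) => JSON.stringify(k) + ':' + canonicalize(obj[k])).join(',') + '}';
}

// The hash is a parameter; a real runtime would pass its SHAKE-256 implementation.
type HashFn = (input: string) => string;

interface WitnessEntry { epoch: number; prevRoot: string; payload: string; root: string; }

// Append one entry whose root commits to the previous root, the epoch, and the payload.
function commitWitness(chain: WitnessEntry[], epoch: number, record: Json, hash: HashFn): WitnessEntry[] {
  const prevRoot = chain.length > 0 ? chain[chain.length - 1].root : '';
  const payload = canonicalize(record);
  const root = hash(prevRoot + '|' + epoch + '|' + payload);
  return chain.concat([{ epoch, prevRoot, payload, root }]);
}
```

Because each root depends on the previous one, two machines that process identical inputs in the same order arrive at the same final root, which is the two-machine reproducibility check.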
The RVF embeds a Vite-bundled Three.js dashboard in DASHBOARD_SEG. The
runtime HTTP server serves it at / (root). All visualizations are driven
by the same API endpoints the CLI uses, so every rendered frame corresponds
to queryable, witness-backed data.
DASHBOARD_SEG (inside RVF)
dist/
index.html # Vite SPA entry
assets/
main.[hash].js # Three.js + D3 + app logic (tree-shaken)
main.[hash].css # Tailwind/minimal styles
worker.js # Web Worker for graph layout
Runtime serves:
GET / -> DASHBOARD_SEG/dist/index.html
GET /assets/* -> DASHBOARD_SEG/dist/assets/*
GET /api/* -> JSON API (atlas, coherence, candidates, etc.)
WS /ws/live -> Live streaming of boundary alerts and pipeline progress
Build pipeline: Vite builds the dashboard at package time into a single
tree-shaken bundle. The bundle is embedded into DASHBOARD_SEG during RVF
assembly. No Node.js required at runtime — the dashboard is pure static
assets served by the existing HTTP server.
Interactive 3D force-directed graph of the causal atlas.
| Feature | Implementation |
|---|---|
| Node rendering | THREE.InstancedMesh for events — color by domain (Kepler=blue, TESS=cyan, JWST=gold, derived=white) |
| Edge rendering | THREE.LineSegments with opacity mapped to edge weight |
| Causal flow | Animated particles along causal edges showing temporal direction |
| Scale selector | Toggle between window scales (2h, 12h, 3d, 27d) — re-layouts graph |
| Candidate highlight | Click candidate in sidebar to trace its causal chain in 3D, dimming unrelated nodes |
| Witness replay | Step through witness chain entries, animating graph state forward/backward |
| LOD | Level-of-detail: far=boundary nodes only, mid=top-k events, close=full subgraph |
Data source: GET /api/atlas/query, GET /api/atlas/trace
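The level-of-detail rule in the table (far = boundary nodes only, mid = top-k events, close = full subgraph) can be sketched as a distance-to-level mapping and a node filter. The distance thresholds, the default `k`, the choice to keep boundary nodes visible at the mid level, and the names `lodForDistance` and `visibleNodes` are assumptions for illustration.

```typescript
// Hypothetical sketch of level-of-detail selection for the atlas graph.
type LodLevel = 'boundary-only' | 'top-k' | 'full';

interface AtlasNode {
  id: string;
  isBoundary: boolean;
  rank: number; // 0 = most important event
}

// Thresholds are in scene units and are placeholders, not tuned values.
function lodForDistance(distance: number, nearLimit = 50, farLimit = 500): LodLevel {
  if (distance > farLimit) return 'boundary-only';
  if (distance > nearLimit) return 'top-k';
  return 'full';
}

// Filter the node set for the chosen level. Boundary nodes stay visible at
// the mid level so partition cuts do not disappear while zooming (an assumption).
function visibleNodes(nodes: AtlasNode[], level: LodLevel, k = 100): AtlasNode[] {
  if (level === 'full') return nodes.slice();
  if (level === 'top-k') return nodes.filter((n) => n.isBoundary || n.rank < k);
  return nodes.filter((n) => n.isBoundary);
}
```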
Real-time coherence field rendered as a colored surface over the atlas graph.
| Feature | Implementation |
|---|---|
| Field surface | THREE.PlaneGeometry subdivided grid, vertex colors from coherence values |
| Cut pressure | Red hotspots where cut pressure is high, cool blue where stable |
| Partition boundaries | Glowing wireframe lines at partition cuts |
| Time scrubber | Scrub through epochs to see coherence evolution |
| Drift overlay | Toggle to show embedding drift as animated vector arrows |
| Alert markers | Pulsing icons at boundary alert locations |
Data source: GET /api/coherence, GET /api/boundary/timeline, WS /ws/live
Split view combining data panels with 3D orbital visualization.
| Panel | Content |
|---|---|
| Ranked list | Sortable table: candidate ID, score, uncertainty, period, SNR, publishable status |
| Light curve viewer | Interactive D3 chart: raw flux, detrended flux, transit model overlay, per-window score |
| Phase-folded plot | All transits folded at detected period, with confidence band |
| 3D orbit preview | THREE.Line showing inferred orbital path around host star, sized by uncertainty |
| Evidence trace | Expandable tree showing witness chain from raw data to final score |
| Score breakdown | Radar chart: SNR, shape consistency, period stability, coherence stability |
Data source: GET /api/candidates/planet, GET /api/candidates/:id/trace
Split view for spectral disequilibrium analysis.
| Panel | Content |
|---|---|
| Ranked list | Sortable table: candidate ID, disequilibrium score, uncertainty, molecule flags, publishable |
| Spectrum viewer | Interactive D3 chart: wavelength vs flux, molecule absorption bands highlighted |
| Molecule presence matrix | Heatmap of detected molecule families vs confidence |
| 3D molecule overlay | THREE.Sprite labels at absorption wavelengths in a 3D wavelength space |
| Reaction graph | Force-directed graph of molecule co-occurrences vs equilibrium expectations |
| Confound panel | Bar chart: stellar activity penalty, contamination risk, repeatability score |
Data source: GET /api/candidates/life, GET /api/candidates/:id/trace
Operational health and download progress.
| Panel | Content |
|---|---|
| Download progress | Per-tier progress bars with byte counts and ETA |
| Segment sizes | Stacked bar chart of RVF segment utilization |
| Memory tiers | S/M/L tier fill levels and compaction history |
| Witness chain | Scrolling log of recent witness entries with hash preview |
| Pipeline status | P0/P1/P2 and L0/L1/L2 stage indicators with event counts |
| Performance | Query latency histogram, events/second throughput |
Data source: GET /api/status, GET /api/memory/tiers, WS /ws/live
// WS /ws/live — server pushes events as they happen
interface LiveEvent {
type: 'boundary_alert' | 'candidate_new' | 'candidate_update' |
'download_progress' | 'witness_commit' | 'pipeline_stage' |
'coherence_update';
timestamp: string;
data: Record<string, unknown>;
}
The dashboard subscribes on connect and updates all views in real time as pipelines process data and boundaries evolve.
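On the client side, each incoming frame should be validated before it reaches a view. The sketch below parses a raw WebSocket message into the `LiveEvent` shape and rejects anything malformed; `parseLiveEvent` and `LIVE_EVENT_TYPES` are hypothetical names, and the interface is repeated so the block stands alone.

```typescript
// Hypothetical sketch: validate raw /ws/live frames before dispatching to views.
type LiveEventType =
  | 'boundary_alert' | 'candidate_new' | 'candidate_update'
  | 'download_progress' | 'witness_commit' | 'pipeline_stage'
  | 'coherence_update';

interface LiveEvent {
  type: LiveEventType;
  timestamp: string;
  data: Record<string, unknown>;
}

const LIVE_EVENT_TYPES: LiveEventType[] = [
  'boundary_alert', 'candidate_new', 'candidate_update',
  'download_progress', 'witness_commit', 'pipeline_stage',
  'coherence_update',
];

// Returns a typed event, or null when the frame is not valid JSON
// or does not match the LiveEvent shape.
function parseLiveEvent(raw: string): LiveEvent | null {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch (err) {
    return null;
  }
  if (typeof msg !== 'object' || msg === null) return null;
  const fields = msg as Record<string, unknown>;
  const kind = fields['type'];
  const timestamp = fields['timestamp'];
  const data = fields['data'];
  if (typeof kind !== 'string' || LIVE_EVENT_TYPES.indexOf(kind as LiveEventType) < 0) return null;
  if (typeof timestamp !== 'string') return null;
  if (typeof data !== 'object' || data === null || Array.isArray(data)) return null;
  return { type: kind as LiveEventType, timestamp, data: data as Record<string, unknown> };
}
```

A WebSocket `message` handler would call `parseLiveEvent(String(event.data))` and ignore null results.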
// vite.config.ts for dashboard build
import { defineConfig } from 'vite';

export default defineConfig({
  build: {
    outDir: 'dist/dashboard',
    assetsDir: 'assets',
    rollupOptions: {
      output: {
        manualChunks: {
          three: ['three'], // ~150 KB gzipped
          d3: ['d3-scale', 'd3-axis', 'd3-shape', 'd3-selection'],
        },
      },
    },
  },
});

Bundle budget: < 500 KB gzipped total (Three.js ~150 KB, D3 subset ~30 KB, app logic ~50 KB, styles ~10 KB). The dashboard adds minimal overhead to the RVF artifact.
The Three.js dashboard is bundled at build time and embedded in DASHBOARD_SEG
rather than served from an external CDN or requiring a separate install. This
ensures:
- Fully offline: Works without network after boot
- Version-locked: Dashboard always matches the API version it queries
- Single artifact: One RVF file = runtime + data + visualization
- Witness-aligned: Dashboard renders exactly the data the witness chain can verify
packages/agentdb-causal-atlas/
  src/
    index.ts                      # createCausalAtlasServer() factory
    CausalAtlasServer.ts          # HTTP + CLI runtime + dashboard serving + WS
    CausalAtlasEngine.ts          # Core atlas, coherence, memory, boundary
    adapters/
      PlanetTransitAdapter.ts     # Kepler/TESS light curve ingestion
      SpectrumAdapter.ts          # JWST/archive spectral ingestion
      CosmicWebAdapter.ts         # SDSS spatial graph (Phase 2)
    pipelines/
      PlanetDetection.ts          # P0-P2 planet detection pipeline
      LifeCandidate.ts            # L0-L2 life candidate pipeline
    constructs/
      CausalAtlas.ts              # Compressed causal partial order
      CoherenceField.ts           # Partition tree + cut pressure
      MultiScaleMemory.ts         # Tiered S/M/L retention
      BoundaryTracker.ts          # Boundary evolution + alerts
    download/
      ProgressiveDownloader.ts    # Tiered lazy download with resume
      DataManifest.ts             # URL + hash + size manifests
    KernelBuilder.ts              # Reuse/extend from ADR-008
  dashboard/                      # Vite + Three.js visualization app
    vite.config.ts                # Build config, outputs to dist/dashboard/
    index.html                    # SPA entry point
    src/
      main.ts                     # App bootstrap, router, WS connection
      api.ts                      # Typed fetch wrappers for /api/* endpoints
      ws.ts                       # WebSocket client for /ws/live
      views/
        AtlasExplorer.ts          # V1: 3D causal atlas (Three.js force graph)
        CoherenceHeatmap.ts       # V2: Coherence field surface + cut pressure
        PlanetDashboard.ts        # V3: Planet candidates + light curves + 3D orbit
        LifeDashboard.ts          # V4: Life candidates + spectra + molecule graph
        StatusDashboard.ts        # V5: System health, downloads, witness log
      three/
        AtlasGraph.ts             # InstancedMesh nodes, LineSegments edges, particles
        CoherenceSurface.ts       # PlaneGeometry with vertex-colored field
        OrbitPreview.ts           # Orbital path visualization
        CausalFlow.ts             # Animated particles along causal edges
        LODController.ts          # Level-of-detail: boundary → top-k → full
      charts/
        LightCurveChart.ts        # D3 flux time series with transit overlay
        SpectrumChart.ts          # D3 wavelength vs flux with molecule bands
        RadarChart.ts             # Score breakdown radar
        MoleculeMatrix.ts         # Heatmap of molecule presence vs confidence
      components/
        Sidebar.ts                # Candidate list, filters, search
        TimeScrubber.ts           # Epoch scrubber for coherence replay
        WitnessLog.ts             # Scrolling witness chain entries
        DownloadProgress.ts       # Tier progress bars
      styles/
        main.css                  # Minimal Tailwind or hand-rolled styles
  tests/
    causal-atlas.test.ts
    planet-detection.test.ts
    life-candidate.test.ts
    progressive-download.test.ts
    coherence-field.test.ts
    boundary-tracker.test.ts
    dashboard.test.ts             # Dashboard build + API integration tests
Scope: Kepler and TESS only. No spectra. No life scoring.
- Implement `ProgressiveDownloader` with tier-0 curated dataset (100 Kepler targets)
- Implement `PlanetTransitAdapter` for FITS light curve ingestion
- Implement `CausalAtlas` with windowing, feature extraction, SIMD embedding
- Implement `PlanetDetection` pipeline (P0-P2)
- Implement `WITNESS_SEG` with SHAKE-256 chain
- CLI: `rvf run`, `rvf query planet list`, `rvf trace`
- HTTP: `/api/candidates/planet`, `/api/atlas/trace`
- Dashboard: Vite scaffold, V1 Atlas Explorer (Three.js 3D graph), V3 Planet Dashboard (ranked list + light curve chart), V5 Status Dashboard (download progress + witness log). Embedded in `DASHBOARD_SEG`, served at `/`
- WebSocket `/ws/live` for real-time pipeline progress
Acceptance: On 1,000 Kepler targets, the top-100 ranked list includes >= 80 confirmed planets, and every item replays to the same score and witness root on two machines. The dashboard renders the atlas graph and candidate list in a browser.
- Implement `CoherenceField` with dynamic min-cut, partition entropy
- Implement `BoundaryTracker` with timeline and alerts
- Implement `MultiScaleMemory` with S/M/L tiers and budget control
- Add coherence gating to planet pipeline
- HTTP: `/api/coherence`, `/api/boundary/*`, `/api/memory/tiers`
- Dashboard: V2 Coherence Heatmap (Three.js field surface + cut pressure overlay + time scrubber), boundary alert markers via WebSocket
- Implement `SpectrumAdapter` for JWST/archive spectral data
- Implement `LifeCandidate` pipeline (L0-L2)
- Implement disequilibrium scoring with reaction plausibility graph
- Tier-3 progressive download for spectral data
- CLI: `rvf query life list`
- HTTP: `/api/candidates/life`
- Dashboard: V4 Life Dashboard (spectrum viewer + molecule presence matrix + reaction graph + confound panel)
Acceptance: On published spectra with known atmospheric detections versus nulls, AUC > 0.8, and every score includes confound penalties and a provenance trace. The dashboard renders spectrum analysis in a browser.
- `CosmicWebAdapter` for SDSS spatial graph
- Cross-domain coherence (planet candidates enriched by large-scale context)
- Dashboard: 3D cosmic web view, cross-domain candidate linking
- Full offline demo with sealed RVF snapshot
- `rvf ingest --expand` for tier-2 bulk download
- Dashboard polish: LOD optimization, mobile-responsive layout, dark/light theme
| Metric | Requirement |
|---|---|
| Recall@100 | >= 80 confirmed planets in top 100 |
| False positives@100 | Documented with witness traces |
| Median time per star | Measured and reported |
| Reproducibility | Identical root hash on two machines |
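The first metric in the table counts how many confirmed planets appear among the top-ranked targets. A minimal sketch of that check, with `confirmedInTopK` as a hypothetical helper name and a set of confirmed target IDs as an assumed input:

```typescript
// Hypothetical sketch: count confirmed planets among the top-k ranked targets.
function confirmedInTopK(rankedTargetIds: string[], confirmedIds: string[], k: number): number {
  const confirmed: Record<string, boolean> = {};
  for (const id of confirmedIds) confirmed[id] = true;
  let hits = 0;
  for (const id of rankedTargetIds.slice(0, k)) {
    if (confirmed[id] === true) hits += 1;
  }
  return hits;
}

// Acceptance gate from the table: at least 80 confirmed planets in the top 100.
function passesRecallGate(rankedTargetIds: string[], confirmedIds: string[]): boolean {
  return confirmedInTopK(rankedTargetIds, confirmedIds, 100) >= 80;
}
```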
| Metric | Requirement |
|---|---|
| AUC (detected vs null) | > 0.8 |
| Confound penalties | Present on every score |
| Provenance trace | Complete for every score |
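The AUC requirement can be checked directly from two score lists. The sketch below uses the pairwise (Mann-Whitney) formulation: AUC is the fraction of detection/null pairs in which the detection scores higher, with ties counted as half. `rocAuc` is a hypothetical helper name.

```typescript
// Hypothetical sketch: ROC AUC via pairwise comparison of scores.
// positives: scores for spectra with known atmospheric detections.
// negatives: scores for null spectra.
function rocAuc(positives: number[], negatives: number[]): number {
  if (positives.length === 0 || negatives.length === 0) {
    throw new Error('need at least one score in each class');
  }
  let wins = 0;
  for (const p of positives) {
    for (const n of negatives) {
      if (p > n) wins += 1;
      else if (p === n) wins += 0.5;
    }
  }
  return wins / (positives.length * negatives.length);
}
```

The pairwise form is quadratic in the number of scores, which is acceptable for benchmark sets of published spectra; a rank-based version would be preferable for large inputs.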
| Test | Requirement |
|---|---|
| Boot reproducibility | Identical root hash across two machines |
| Query determinism | Identical results for same dataset snapshot |
| Witness verification | verifyWitness passes for all chains |
| Progressive download | Resumes correctly after interruption |
| Failure | Fix |
|---|---|
| Noise dominates coherence field | Strengthen policy priors, add confound penalties, enforce multi-epoch stability |
| Over-compression kills rare signals | Boundary-critical retention rules + candidate evidence pinning |
| Spurious life signals from stellar activity | Model stellar variability as its own interaction graph, penalize coupling |
| Compute blow-up | Strict budgets in POLICY_SEG, tiered memory, boundary-first indexing |
| Download interruption | HTTP Range resume, partial-ingest checkpoint, witness for partial state |
Phase 1 delivers a concrete, testable planet-detection system. Life scoring requires additional instrument-specific adapters and more nuanced policy rules. Separating them de-risks the schedule.
The RVF artifact ships with a curated ~50 MB tier-0 dataset for fully offline demonstration. Full catalog data is downloaded lazily, following the pattern proven in ADR-008 for GGUF model files. This keeps the initial artifact small (< 100 MB without kernel) while supporting the full 1,000+ target benchmark.
Boundaries are stored as first-class indexed objects. Interior state is reconstructed on demand from boundary witnesses and retained archetypes. The expected storage reduction for large graphs is 10-50x, with queryability and reproducibility preserved.
Every candidate, every coherence measurement, and every boundary change is committed to the SHAKE-256 witness chain. This enables two-machine reproducibility verification and provides a complete audit trail from raw data to final score.
- MAST — Kepler
- MAST — TESS
- MAST Home
- NASA Exoplanet Archive
- SDSS DR17
- ADR-003: RVF Native Format Integration
- ADR-006: Unified Self-Learning RVF Integration
- ADR-007: RuVector Full Capability Integration
- ADR-008: Chat UI RVF Kernel Embedding