DATA Compass · The Platform

AI readiness, measured — not asserted.

The data platform that evaluates datasets for AI readiness — and turns the evaluation into recommendations, an executable action plan, and a quantified ROI.

Built for life sciences · Your data never moves · Zero egress

§ 01 · The Problem

AI-ready data isn't the same as AI-ready.

Clean data is not ready data. A dataset can pass every hygiene check and still be the wrong shape for the model, silent on provenance, ambiguous to a machine, or out of compliance. Teams discover the gap in production — the most expensive place to find it.

Blind Deployment Risk

No way to know if agents will succeed

…until they fail in production.

Compliance Gaps

Regulators want audit trails

for autonomous and AI-assisted decisions.

Data Fragmentation

Critical data locked in silos

Databricks, Snowflake, Veeva — never assessed together.

Unquantified ROI

No line from data spend to AI outcome

Which fixes actually move readiness?

“Everyone says their data is AI-ready. Almost no one has measured it.”


§ 02 · What Compass Does

Evaluate readiness. Then close the gap.

DATA Compass evaluates datasets for AI readiness across four dimensions, grounded in FAIR — then turns every finding into a specific, executable recommendation and a quantified ROI, so AI investments succeed instead of stalling in a proof-of-concept.

01

Assess

Score the dataset across four dimensions of readiness, grounded in FAIR.

02

Recommend

Bridge AI turns findings into prioritized, causal recommendations.

03

Act

Every recommendation ships an executable remediation — an action plan, not a report.

04

Prove

Quantify the readiness lift and the ROI of each fix.


§ 03 · The Model

Four dimensions. One foundation.

Everyone sells FAIR and stops. Compass treats FAIR as the foundation — then measures the four dimensions of readiness FAIR enables but does not guarantee, with Bridge AI synthesizing it all into what to do next.

Synthesis
Bridge AI  →  Recommendations · Action Plan · ROI
Four dimensions
Semantic Clarity: ontologies · meaning
Statistical Health: soundness · ML-fit
Contextual Validity: fit for your use
Regulatory Compliance: the integrity bar
Foundation
FAIR — triangulated (Technical · Perceived · Governed) → defensible
Platforms
Databricks · Snowflake · Collibra · Veeva — your data, in place · zero egress

§ 04 · Where FAIR Sits

FAIR is the floor. Not the finish line.

FAIR makes data legible to machines and people — findable, accessible, interoperable, reusable. That is necessary, and it is not sufficient. A perfectly FAIR dataset can still be statistically broken, wrong for your use, non-compliant, or semantically ambiguous. Compass treats FAIR as the substrate and measures what comes next.

“We treat FAIR as table stakes — and measure the readiness it can't.”

FAIR guarantees

The data can move.

Findable · Accessible · Interoperable · Reusable — the plumbing is in place.

FAIR does not guarantee

The data is ready.

Statistically sound · fit for purpose · compliant · semantically unambiguous.

So Compass adds

Four dimensions on the FAIR floor.

Each measured, evidenced, and triangulated for defensibility.

§ 05 · Connect

Your data never moves. We query it in place.

Compass is an API overlay, not a data lake. It connects to your warehouse, runs read-only queries through your own compute, and analyzes the results in memory. No ETL, no copies, no new storage. Source data never leaves your environment — the answer to the first question every pharma security review asks.

Databricks · Snowflake · Collibra · Veeva
Connect

Unity Catalog / Snowflake, PAT or OAuth.

Read-only. Credentials AES-256 in Key Vault / Secrets Manager.

Analyze in place

50+ algorithms run on your warehouse.

Results return as JSON — scores, recommendations, ROI, fix scripts.

Zero egress

Source data never leaves.

Most single-table analyses cost under $0.10 of warehouse compute.
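
The query-in-place pattern can be sketched in a few lines. This is a minimal illustration, not Compass's connector code: Python's built-in `sqlite3` stands in for the warehouse, and the function name `profile_column` and the returned field names are hypothetical. The aggregate runs where the data lives, and only summary numbers come back.

```python
import sqlite3


def profile_column(conn: sqlite3.Connection, table: str, column: str) -> dict:
    """Run one aggregate query in the database and return summary numbers only."""
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    for ident in (table, column):
        if not ident.isidentifier():
            raise ValueError(f"unsafe identifier: {ident!r}")

    # Refuse writes on this connection (SQLite-specific; a real warehouse
    # would rely on a read-only role instead).
    conn.execute("PRAGMA query_only = ON")

    total, non_null, distinct = conn.execute(
        f'SELECT COUNT(*), COUNT("{column}"), COUNT(DISTINCT "{column}") '
        f'FROM "{table}"'
    ).fetchone()

    # Only these aggregates leave the database; no source rows are copied.
    return {
        "table": table,
        "column": column,
        "rows": total,
        "null_rate": (total - non_null) / total if total else 0.0,
        "distinct_values": distinct,
    }
```

The returned dictionary serializes directly with `json.dumps`, which mirrors the JSON result shape described above without moving any row-level data.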
§ 06 · The Method

Three reads on one domain. The disagreement is the finding.

Compass doesn't take the data's word for it. Every FAIR assessment triangulates three independent reads. Where they agree, you have a finding you can defend. Where they diverge, the divergence itself is the insight — and it's named, not hand-waved.

Technical

What the data shows.

SHACL + scoring across 16 FAIR rules.

Perceived

What people experience.

Persona-weighted organizational interviews.

Governed

What the documents say.

Governance documents scored against eight frameworks.

Agreement = Confidence  ·  Divergence = The Finding

Seven named patterns, each one an actionable diagnosis:

Confirmed Maturity: all three legs agree, high
Usability Gap: technically strong, people struggle
Tribal Knowledge: people know it, nothing is written
Unnormalized Practice: documented, not yet in the data
Abandoned Capability: was there, now decayed
Policy Theater: SOPs exist, practice doesn't follow
Greenfield: little signal anywhere — start clean
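
To make the triangulation idea concrete, here is a small sketch that maps three scores to a named pattern. Everything numeric in it is an assumption: the 0 to 1 score scale, the `HIGH` and `LOW` cut-offs, and the rule order are hypothetical, not Compass's actual logic. Abandoned Capability is left out because detecting decay needs a score history, which a single snapshot does not carry.

```python
from dataclasses import dataclass

HIGH = 0.7  # hypothetical cut-off for a "strong" read
LOW = 0.4   # hypothetical cut-off for a "weak" read


@dataclass(frozen=True)
class Reads:
    technical: float  # what the data shows
    perceived: float  # what people experience
    governed: float   # what the documents say


def classify(r: Reads) -> str:
    """Name the agreement or divergence pattern for one set of reads."""
    t, p, g = r.technical, r.perceived, r.governed
    if min(t, p, g) >= HIGH:
        return "Confirmed Maturity"
    if max(t, p, g) < LOW:
        return "Greenfield"
    if t >= HIGH and p < LOW:
        return "Usability Gap"
    if p >= HIGH and g < LOW:
        return "Tribal Knowledge"
    if g >= HIGH and t < LOW:
        return "Unnormalized Practice"
    if g >= HIGH and p < LOW:
        return "Policy Theater"
    # Reads that diverge without matching a named rule in this sketch.
    return "Mixed signal"
```

For example, `classify(Reads(technical=0.9, perceived=0.2, governed=0.8))` returns `"Usability Gap"`: the data and the documents look strong, but the people using them struggle.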
§ 07 · The Four Dimensions

What we measure, dimension by dimension.

Dimension 1 — Semantic Clarity

Does the data mean what it says — to a machine?

Whether a machine can resolve the meaning of your data unambiguously, against the ontology you actually use. Compass reads your private ontologies, infers which columns map to which classes, and scores the alignment.

Ontology inference

Infer column→class alignment.

Credit the meaning that's there even before it's tagged (Tier-2).

Private ontologies

Score against your vocabulary.

Your CDISC / IDMP / internal taxonomy — not a generic dictionary.

I-pillar depth

Interoperability, made first-class.

Declared · Inferred · None, at labelled confidence.

Dimension 2 — Statistical Health

Is the data quantitatively sound?

Whether the numbers hold up before a model ever touches them. Compass runs two engines — data quality and ML readiness — across 50+ algorithms, from distribution and anomaly detection to bias, class separability, and feature sufficiency.

Data quality

Eight algorithms.

Information content · outliers · sparsity · anomalies · cleanliness · more.

ML readiness

Four algorithms.

Data sufficiency · bias detection · feature engineering · class separability.

Evidenced

Named statistical tests.

Benford · Pearson / Chi-square · Cohen's Kappa · Shapley · LOO.
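
As an illustration of one named test, the sketch below checks first digits against Benford's law with a chi-square statistic. It is a textbook version written for this page, not Compass's implementation; the function names are hypothetical, and the 0.05 significance level is an assumption.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_DF8 = 15.507


def first_digit(x: float) -> int:
    """Leading significant digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '3.0e-01'.
    return int(f"{abs(x):.15e}"[0])


def benford_chi2(values) -> float:
    """Chi-square statistic of observed first digits against Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one non-zero value")
    observed = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        # Benford's law: P(first digit = d) = log10(1 + 1/d).
        expected = n * math.log10(1 + 1 / d)
        chi2 += (observed[d] - expected) ** 2 / expected
    return chi2


def deviates_from_benford(values) -> bool:
    """True when the first-digit distribution is rejected at the 5% level."""
    return benford_chi2(values) > CHI2_CRITICAL_DF8
```

A flag from a test like this is a prompt to look closer, not a verdict: many legitimate columns (bounded measurements, identifiers) are not expected to follow Benford's law at all.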
Dimension 3 — Contextual Validity

Is the data right for your specific use?

A dataset that's clean, sound, and FAIR can still be wrong for the job in front of you. Contextual validity scores fit-for-purpose against a declared use — using customer-authoritative Domain Packs that encode your sub-domains, archetypes, and rules.

Domain Packs

Customer-authoritative.

Your sub-domains and record archetypes, versioned and yours.

Fit-for-purpose

Scored against a declared use.

Validity is relative to the question you're asking of the data.

5-step pipeline

Signal → rules → coverage → LLM → judgment.

Deterministic where it can be, synthesized where it must be.

Dimension 4 — Regulatory Compliance

Does the data meet the integrity bar?

In regulated work, readiness includes defensibility to a regulator. Compass scores governance evidence against the frameworks pharma actually answers to — line by line, with citations — so compliance is measured, not asserted.

ALCOA+

Nine data-integrity attributes.

Attributable · legible · contemporaneous · original · accurate, plus complete · consistent · enduring · available.

FDA credibility

Seven steps + Step 8.

Context of use → fit-for-use, plus agent decision provenance.

Privacy

GDPR · HIPAA · CCPA.

Cited line-by-line, mapped to FAIR pillars.

§ 08 · Synthesis

Bridge AI reads it all and tells you what to do.

Four dimensions produce a lot of signal. Bridge AI is the synthesis layer: it runs causal analysis across the findings, separates root causes from symptoms, and writes prioritized recommendations in plain language — each one traced back to the evidence that produced it. No black-box conclusions; every claim cites its source.

Causal analysis

Root cause, not symptom.

Why the score is what it is — and what fixing it unlocks.

Prioritized recommendations

Ranked by impact and effort.

With an effort estimate on every action.

Cited, not conjured

Every recommendation traces to evidence.

From finding to source — auditable end to end.

“From four-dimensional signal to a short list of things worth doing.”


§ 09 · The Output

Every finding ships with the fix attached.

Compass is not a scorecard that rots on SharePoint. Every recommendation routes to a structured, executable action plan — an auto-generated remediation script you can read, edit, and run. The deliverable is momentum, not a verdict.

Executable

Auto-generated remediation, ready to edit.

A Python script per finding — not a to-do line.

Routed

Findings connect to the plan automatically.

No re-keying scores into a separate tracker.

Traceable

Plan → recommendation → finding → source.

The whole chain survives an audit.

§ 10 · The Value

Readiness, in dollars. Not just a grade.

Every gap carries a cost and every fix carries a return. Compass quantifies the business value of closing each gap — grounded in the Pistoia Alliance FAIR business-value model — so a data-readiness program competes for budget on the same terms as everything else: ROI.

Prioritize by return

Which fixes move readiness most, per dollar.

Rank the backlog by impact, not by who shouted loudest.

Quantified benefit

Trust · speed · cost · effectiveness.

Level-0 outcomes decomposed into measurable benefits.

A budget case

Defensible numbers for the steering committee.

The readiness program earns its line item.
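
Ranking by return reduces to simple arithmetic once each fix has an estimated lift and cost. The sketch below uses a plain lift-per-dollar ratio; it is not the Pistoia Alliance model, and the `Fix` fields, the example fixes, and their numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    name: str
    readiness_lift: float  # expected score gain, in points (hypothetical unit)
    cost_usd: float        # estimated cost to implement


def lift_per_dollar(fix: Fix) -> float:
    """Readiness points gained per dollar spent."""
    if fix.cost_usd <= 0:
        raise ValueError(f"cost must be positive: {fix.name}")
    return fix.readiness_lift / fix.cost_usd


def rank_by_return(fixes: list[Fix]) -> list[Fix]:
    """Order fixes so the highest lift per dollar comes first."""
    return sorted(fixes, key=lift_per_dollar, reverse=True)


# Illustrative backlog with made-up estimates.
backlog = [
    Fix("Add provenance metadata", readiness_lift=6.0, cost_usd=12_000),
    Fix("Map columns to ontology", readiness_lift=9.0, cost_usd=30_000),
    Fix("Fill unit-of-measure gaps", readiness_lift=4.0, cost_usd=5_000),
]
ranked = rank_by_return(backlog)
```

Here the smallest fix ranks first: four points for $5,000 beats nine points for $30,000 on a per-dollar basis, even though the larger fix has the bigger absolute lift.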

“The first FAIR conversation a CFO will sit through.”

§ 11 · Know Before You Deploy

How much autonomy can your data actually support?

Readiness isn't a yes/no. Compass tells you which level of AI autonomy your data can support today — and exactly what to fix to climb to the next one.

Level 1 · Advisory: AI suggests; a human decides everything.
Level 2 · Copilot: AI drafts; a human reviews and approves.
Level 3 · Supervised: AI acts within bounds; a human monitors.
Level 4 · Autonomous: AI acts; humans audit the trail.
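
One way to picture the mapping from scores to a level is a weakest-link gate: the dimension with the lowest score caps the autonomy the data can support. The sketch below assumes a 0 to 100 score per dimension and hypothetical thresholds; how Compass actually derives the level is not described here.

```python
from bisect import bisect_right

LEVELS = ["Advisory", "Copilot", "Supervised", "Autonomous"]
# Hypothetical minimum scores needed to reach Levels 2, 3, and 4.
THRESHOLDS = [50.0, 70.0, 85.0]


def supported_level(scores: dict[str, float]) -> tuple[int, str]:
    """Level supported today, gated by the weakest dimension."""
    weakest = min(scores.values())
    idx = bisect_right(THRESHOLDS, weakest)
    return idx + 1, LEVELS[idx]


def blockers_for_next_level(scores: dict[str, float]) -> list[str]:
    """Dimensions that must improve before the next level is reachable."""
    level, _ = supported_level(scores)
    if level == len(LEVELS):
        return []  # already at the top level
    target = THRESHOLDS[level - 1]
    return sorted(name for name, s in scores.items() if s < target)


# Illustrative scores, not real assessment output.
scores = {
    "Semantic Clarity": 82.0,
    "Statistical Health": 91.0,
    "Contextual Validity": 74.0,
    "Regulatory Compliance": 66.0,
}
```

With these example scores, `supported_level(scores)` gives `(2, "Copilot")`, and `blockers_for_next_level(scores)` names Regulatory Compliance as the one dimension holding the dataset below Level 3.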

Integrations

Connect your data where it lives.

Native connectors for enterprise data platforms. Assess data in place — without extraction or movement.

Databricks

Unity Catalog browser, SQL profiling, and MLflow model assessment via PAT or OAuth2.

Snowflake

Database and schema browser with SQL API profiling for warehouses and data shares.

Veeva Vault

QMS, CDMS, and RIM module adapters with OAuth2 SSO and Direct Data extraction.

Stardog

SPARQL querying, ICV constraint validation, and reasoning-powered ontology analysis.

File Upload & API

Drag-and-drop CSV, Excel, Parquet, JSON — or a REST API for CI/CD and pipelines.

Know before you deploy. Prove compliance. Accelerate AI with confidence.

Get a full data-readiness assessment in minutes. No code changes, no data movement, no black boxes.


Or download the self-hosted installer and assess your first dataset in under five minutes. No signup required.