TL;DR — Direct Answer
Self-serve fails in two symmetrical ways: the wild west (everyone queries raw tables, numbers diverge, costs explode) and the locked tower (governance so heavy analysts export to spreadsheets and govern nothing). The design that holds at 200+ analysts: three data tiers with different promises (certified / curated / sandbox), metadata standards enforced in CI not in wikis, lineage as a publishing requirement, and an operating model where domain teams publish products and the platform team builds rails — never reviews every dashboard. Govern the tiers, not the analysts.
The three-tier contract with consumers
| Tier | Promise | Who writes | Who reads |
|---|---|---|---|
| Certified | Contracted schema, SLAs, owned KPIs, lineage published, breaking changes versioned | Domain data teams via PR + review | Everyone — execs make decisions on this tier only |
| Curated | Documented, tested, owned — but evolving; no exec-reporting guarantee | Domain analysts/engineers via lighter PR | All analysts |
| Sandbox | None. Auto-expires in 30–90 days. Cannot feed dashboards. | Any analyst, instantly | Author's team |
The tiers do the governing: analysts get instant freedom in sandbox, a clear promotion path upward, and an unambiguous answer to "which revenue number is real." The single most effective rule in the system: BI tools may only connect scheduled dashboards to certified or curated schemas — enforced by warehouse grants, not policy documents.
Metadata standards as code, not wiki pages
Publishing to curated or certified requires, mechanically in CI: an owner (team handle), a description on every model and column, tags for domain and PII classification, tests appropriate to tier, and — for certified — a contract with SLAs. A model without an owner does not merge. This replaces the data-steward bureaucracy most governance programs drown in: the standard is a linter, the reviewer is the domain peer, the catalog (DataHub, Atlan, OpenMetadata) renders what CI enforced.
Lineage as a publishing requirement
Lineage isn't a tool you buy; it's a property you refuse to lose. dbt gives transformation lineage free; ingestion and BI edges come from your catalog's integrations. The rule: if the platform can't trace a certified dataset source-to-dashboard, it doesn't get certified. Lineage pays off in the two moments that justify the whole platform: impact analysis before a change, and root cause during an incident — both shrink from days of Slack archaeology to minutes.
Access model: roles by tier × PII, not by request ticket
Four warehouse roles cover 95% of cases: analyst (certified + curated, no PII), analyst_pii (adds masked-or-clear PII per policy), domain_publisher (write to own domain's curated), platform. Joining the analyst group is onboarding, not a ticket queue. Row/column policies handle the exceptions. Access reviews become group-membership reviews — quarterly, boring, done.
The operating model that scales
- Platform team builds rails: ingestion patterns, CI checks, catalog, cost guardrails (per-team warehouses/quotas — sandbox compute is where bills go to die).
- Domain teams own products: they publish, they answer breach tickets, their name is on the dataset.
- Certification council (1 hr/fortnight): platform + domain leads approve promotions to certified and retire dead certified sets. The only committee in the whole design.
- Adoption metrics decide success: % of dashboard queries hitting certified/curated (target >80%), sandbox-to-curated promotion rate, time-to-first-query for a new analyst (<1 day). If analysts route around the platform, the platform is wrong — not the analysts.
This is the blueprint behind our Fortune 500 governance engagement — 40% less manual reconciliation precisely because self-service got safer than spreadsheets, not harder.