Organisation & Culture

Commitment · Culture · Leadership · Enablement · Adoption

Strategy & Roadmap

Vision · Prioritization · Trade-offs · Use Cases · Outcomes

Teams & Execution

Execution · Cadence · Knowledge Sharing · Skillsets · Training · Project Management

Activation

Governance & Discovery

Managing data as a discoverable, governed asset.

  • Catalog & Discovery

    The directory of every dataset — searchable, owned, used.

  • Business Glossary

    Shared definitions — "active customer" means one thing.

  • Access Control

    Who can read what — decided once and enforced everywhere.

  • Classification & Tagging

    Sensitivity, ownership, domain — tagged at scale.

  • Privacy Engineering

    PII, consent, retention, deletion — engineered in.

  • Audit & Compliance

    Logs and reports that satisfy regulators and auditors.
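
As one illustration of "decided once and enforced everywhere", access rules can be derived from classification tags rather than written per dataset. The sketch below is a minimal, hypothetical model: the `Dataset` class, the `ROLE_ALLOWED_TAGS` table, and the role and tag names are all assumptions made for the example, not part of any particular catalog or policy engine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    name: str
    owner: str
    tags: frozenset = field(default_factory=frozenset)


# Hypothetical policy table: each role lists the sensitivity tags it may read.
ROLE_ALLOWED_TAGS = {
    "analyst": frozenset({"public", "internal"}),
    "privacy_officer": frozenset({"public", "internal", "pii"}),
}


def can_read(role: str, dataset: Dataset) -> bool:
    """Allow a read only if every tag on the dataset is permitted for the role."""
    if role not in ROLE_ALLOWED_TAGS:
        return False  # unknown roles are denied by default
    if not dataset.tags:
        return False  # unclassified data is denied until it has been tagged
    return dataset.tags <= ROLE_ALLOWED_TAGS[role]


orders = Dataset("orders", "sales-eng", frozenset({"internal"}))
customers = Dataset("customers", "crm-team", frozenset({"internal", "pii"}))

print(can_read("analyst", orders))     # analysts may read internal data
print(can_read("analyst", customers))  # but not data tagged as PII
```

Keeping the decision in one function means every consumer (BI tool, API, notebook) can call the same check, which is the design point the bullet list describes.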

Activation

Analytics & BI

Where data meets decisions.

  • Dashboards & Reports

    The board's view, the team's view, the operator's view.

  • Reporting & Distribution

    Scheduled reports, operational reports, board packs.

  • Visualisation & Statistical Analysis

    Beyond dashboards — exploratory viz, statistical analysis, custom notebooks.

  • Ad-hoc & Exploratory Analysis

    Notebook deep-dives and exploratory queries.

  • Self-Service Analytics

    Business users answering their own questions, governed.

  • Conversational Q&A

    Asking questions in natural language, getting answers from the data.

Activation

AI & Machine Learning

Predictive, generative, evaluated.

  • Predictive Modelling

    Classical ML — train classifiers, regressors, forecasters; serve predictions to production.

  • MLOps & Deployment

    Versioning, deployment, monitoring — the production side of models.

  • LLMs & Generative

    Foundation models, fine-tunes, prompts, applications.

  • Retrieval-Augmented Generation

    Grounding LLM answers in your own knowledge.

  • Agents & Tooling

    LLM-driven workflows with tools, memory, and decisions.

  • Evaluation & Testing

    Testing what a model does — before and after deployment.
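
The retrieval step behind Retrieval-Augmented Generation can be sketched without any model at all. The example below is a toy under stated assumptions: it ranks documents by token overlap (Jaccard similarity) as a stand-in for the embedding search a real system would more likely use, and the function names and sample documents are hypothetical.

```python
import re


def tokens(text):
    """Lower-case the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return up to k documents with the highest token overlap with the question."""
    q = tokens(question)
    scored = []
    for doc in documents:
        d = tokens(doc)
        union = q | d
        score = len(q & d) / len(union) if union else 0.0
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(question, documents, k=2):
    """Ground the question in retrieved context before it is sent to a model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


docs = [
    "Refunds are processed within 14 days of a return.",
    "The warehouse opens at 8am on weekdays.",
    "Returns require the original receipt.",
]
print(build_prompt("How long do refunds take?", docs, k=1))
```

The prompt that comes out is what would be passed to whichever LLM is in use; the model call itself is deliberately left out.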

Activation

Data Products & Apps

Putting data to work.

  • Data Products

    Data as a managed offering — owned, versioned, contracted.

  • Custom Data Apps

    Full-stack apps where the data is the point.

  • Embedded Analytics

    Charts inside your customers' product, not your dashboard.

  • Data APIs

    Serving data to internal and external consumers.

  • Reverse Sync

    Pushing modelled data back into Salesforce, HubSpot, and ops tools: the reverse-ETL outcome.

  • Data Marketplace

    Internal data exchange — find, request, subscribe.

Engineering

Transformation & Modelling

Shaping data for the question being asked.

  • SQL Transformations

    dbt, SQL, in-warehouse compute — declarative, the workhorse.

  • Code Transforms

    Python, Spark (PySpark), Polars: procedural transforms when SQL isn't enough.

  • Modelling Patterns

    Dimensional/star, snowflake, 3NF, OBT, data vault — chosen for fit.

  • Semantic Layer

    Metrics, dimensions, contracts — defined once, used across BI and apps.

  • Marts & Cubes

    Domain-shaped output tables for the way the business asks questions.

  • Materialisations

    Views, tables, incremental builds, snapshots, SCDs — how transforms get persisted.
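
Of the materialisations listed, slowly changing dimensions (SCDs) are the least obvious, so here is a minimal sketch of a Type 2 update in plain Python. The function name, the `valid_from`/`valid_to` column names, and the sample rows are assumptions for illustration; in practice this logic usually lives in a snapshot feature or a SQL merge statement.

```python
from datetime import date


def apply_scd2(history, incoming, key, tracked, as_of):
    """Return a new history with Type 2 changes applied.

    history:  list of dict rows; valid_to is None for the current version.
    incoming: list of dict rows from the latest source snapshot.
    tracked:  columns whose changes should create a new version.
    """
    current = {row[key]: row for row in history if row["valid_to"] is None}
    result = [dict(row) for row in history]  # copy so the input is not mutated
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # nothing changed, keep the current version open
        if old is not None:
            # Close the version that was current until now.
            for row in result:
                if row[key] == new[key] and row["valid_to"] is None:
                    row["valid_to"] = as_of
        version = {key: new[key], "valid_from": as_of, "valid_to": None}
        version.update({c: new[c] for c in tracked})
        result.append(version)
    return result


history = [
    {"customer_id": 1, "city": "Leeds",
     "valid_from": date(2024, 1, 1), "valid_to": None},
]
incoming = [
    {"customer_id": 1, "city": "York"},  # changed attribute
    {"customer_id": 2, "city": "Bath"},  # new entity
]
for row in apply_scd2(history, incoming, "customer_id", ["city"], date(2024, 6, 1)):
    print(row)
```

The old row is closed rather than overwritten, so a query can still ask where the customer lived on any past date.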

Engineering

Master Data Management

One customer, one product, everywhere.

  • Entity Resolution

    Matching records across systems — same customer, different IDs.

  • Golden Records

    The single canonical version of each entity.

  • Reference Data

    Lookups, codes, classifications — managed centrally.

  • Knowledge Graphs & Hierarchies

    Org charts, product trees, taxonomies, semantic networks — entities and how they relate.

  • Stewardship Workflows

    Who approves changes; how exceptions get handled.

  • Cross-system Identity

    ID mapping across CRM, ERP, marketing, support, warehouse.
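
Entity resolution, golden records, and cross-system identity fit together roughly as in the sketch below. It is deliberately naive: records are matched on a normalised email address only, and the survivorship rule (the most recently updated non-empty value wins) is an assumption chosen for the example. The record layout and function names are hypothetical.

```python
from collections import defaultdict

ATTRIBUTES = ("name", "phone")


def match_key(record):
    """Normalise the email so trivially different spellings compare equal."""
    return record["email"].strip().lower()


def resolve(records):
    """Group records from different systems that share a match key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record)].append(record)
    return dict(clusters)


def golden_record(email, cluster):
    """Build one canonical record; later non-empty values overwrite earlier ones."""
    golden = {"email": email}
    for record in sorted(cluster, key=lambda r: r["updated"]):
        for attr in ATTRIBUTES:
            if record.get(attr):
                golden[attr] = record[attr]
    # Keep the ID mapping back to every source system.
    golden["source_ids"] = sorted(f'{r["system"]}:{r["id"]}' for r in cluster)
    return golden


records = [
    {"system": "crm", "id": "C-17", "email": "Ana.Diaz@example.com ",
     "name": "Ana Diaz", "phone": "", "updated": "2024-03-01"},
    {"system": "erp", "id": "90041", "email": "ana.diaz@example.com",
     "name": "A. Diaz", "phone": "555-0100", "updated": "2024-01-15"},
    {"system": "support", "id": "u_8", "email": "bo@example.com",
     "name": "Bo Li", "phone": "", "updated": "2024-02-02"},
]
for email, cluster in resolve(records).items():
    print(golden_record(email, cluster))
```

Real matching usually needs fuzzy comparison and a stewardship queue for uncertain pairs; exact matching on one field is only the simplest possible starting point.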

Engineering

Quality & Observability

Keeping data trustworthy.

  • Quality Tests

    Asserting what should be true — failing loud when it isn't.

  • Data Profiling

    Looking at the data to understand it before modelling it.

  • Quality Rules

    Codified expectations about what good data looks like.

  • Quality Monitoring

    Continuous checks; alerts when something drifts.

  • Cost & FinOps

    Cloud spend, query cost, FinOps practices — what your data work actually costs.

  • Data Contracts

    What producers promise consumers, in writing.
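
A quality test in the sense used above can be as small as a function that inspects rows and raises when an expectation is broken. The sketch below assumes rows are plain dictionaries; `DataQualityError`, `not_null`, `unique`, and `run_checks` are hypothetical names for the example, not the API of any testing framework.

```python
from collections import Counter


class DataQualityError(Exception):
    """Raised when one or more quality checks fail."""


def not_null(column):
    """Build a check that fails if any row has no value in the column."""
    def check(rows):
        bad = sum(1 for row in rows if row.get(column) is None)
        return f"{column}: {bad} null value(s)" if bad else None
    return check


def unique(column):
    """Build a check that fails if any value in the column repeats."""
    def check(rows):
        counts = Counter(row.get(column) for row in rows)
        dupes = [value for value, n in counts.items() if n > 1]
        return f"{column}: duplicate value(s) {dupes}" if dupes else None
    return check


def run_checks(rows, checks):
    """Run every check, then raise once with all failures so none is hidden."""
    failures = [msg for msg in (check(rows) for check in checks) if msg]
    if failures:
        raise DataQualityError("; ".join(failures))
    return True


good = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 5.5}]
bad = [{"order_id": 1, "amount": None}, {"order_id": 1, "amount": 7.0}]
checks = [not_null("amount"), unique("order_id")]

print(run_checks(good, checks))
try:
    run_checks(bad, checks)
except DataQualityError as err:
    print(f"pipeline stopped: {err}")
```

The same list of checks can double as the testable part of a data contract: the producer runs it before publishing, the consumer runs it on arrival.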

Engineering

Orchestration & Automation

Running and shipping the work.

  • Workflow DAGs

    Airflow, Prefect, Dagster — the graph of what runs when.

  • Scheduling & Triggers

    Cron, intervals, file landings, webhooks — when work fires.

  • CI/CD for Data

    dbt CI, automated tests, deploy on green — shipping changes safely.

  • Schema Migrations

    Versioned schema changes, contract evolution, safe rollouts.

  • Environment Promotion

    Dev → Staging → Prod — automated promotion with checks at each gate.

  • Backfills & Reruns

    Rerunning history — for new logic, fixed bugs, missed days.
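
Underneath any orchestrator, a workflow DAG is a dependency graph that gets sorted into a valid run order, and a backfill is that same graph run once per historical partition. The sketch below uses only the Python standard library (`graphlib`, available from Python 3.9); the task names and the daily-partition assumption are hypothetical.

```python
from datetime import date, timedelta
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
DAG = {
    "extract_orders": set(),
    "extract_customers": set(),
    "stage_orders": {"extract_orders"},
    "stage_customers": {"extract_customers"},
    "build_mart": {"stage_orders", "stage_customers"},
}


def run_order(dag):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def backfill_partitions(start, end):
    """Yield one date per daily partition, inclusive of both ends."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


for partition in backfill_partitions(date(2024, 1, 30), date(2024, 2, 1)):
    for task in run_order(DAG):
        print(f"{partition.isoformat()} {task}")
```

`TopologicalSorter` raises `graphlib.CycleError` if the graph contains a cycle, which is a cheap sanity check to run in CI before a pipeline definition is deployed.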

Foundation

Network & Identity

Who and what gets in — the access fabric.

  • Network & Connectivity

    VPCs, subnets, private endpoints, peering, transit gateways — the network fabric.

  • Identity & Access Management

    IAM, SSO, RBAC, federation — the human and role layer.

  • Workload Identity & Auth

    Service accounts, mTLS, workload identity federation, service mesh.

  • Secrets & Credentials

    Vault, AWS Secrets Manager, rotation, just-in-time access.

  • Encryption & Key Management

    KMS, HSM, BYOK — at rest and in transit.

  • DNS & Service Discovery

    How services find each other — the resolver every workload needs.

Foundation

Connectivity & Integration

How data moves — in any direction.

  • Batch Data Movement

    Scheduled bulk data movement — the workhorse pattern.

  • Streaming

    Real-time event flows over message brokers — Kafka, Kinesis, Pulsar.

  • API Integration

    Programmatic pull and push to SaaS and partner systems.

  • Unstructured Intake

    PDFs, documents, images, audio — the raw material for AI.

  • IoT & Edge Sources

    Sensors, devices, telemetry, clickstreams — data captured at the edge.

  • CDC & Replication

    Capturing every change from operational databases — log-based replication for analytics.
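
To make CDC concrete: a log-based feed delivers an ordered stream of insert, update, and delete events, and the analytics side replays them against its own copy. The sketch below assumes a simplified, hypothetical event shape (`op`, `pk`, `row`, and a monotonically increasing `lsn` position); real connectors emit richer payloads.

```python
def apply_changes(replica, events):
    """Replay change events, in log order, onto a dict keyed by primary key."""
    for event in sorted(events, key=lambda e: e["lsn"]):
        op, pk = event["op"], event["pk"]
        if op in ("insert", "update"):
            # Merge so that partial updates keep columns they did not touch.
            replica[pk] = {**replica.get(pk, {}), **event["row"]}
        elif op == "delete":
            replica.pop(pk, None)
        else:
            raise ValueError(f"unknown op: {op}")
    return replica


events = [
    {"lsn": 3, "op": "delete", "pk": 2},
    {"lsn": 1, "op": "insert", "pk": 1, "row": {"status": "new", "total": 40}},
    {"lsn": 2, "op": "insert", "pk": 2, "row": {"status": "new", "total": 15}},
    {"lsn": 4, "op": "update", "pk": 1, "row": {"status": "shipped"}},
]
print(apply_changes({}, events))
```

Sorting by log position before applying is what keeps the replica correct when events arrive out of order, as they are in the sample list.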

Foundation

Storage Architecture

Where data lives.

  • Operational Databases

    Postgres, MySQL — where applications actually live.

  • Cloud Data Warehouse

    Snowflake-class storage built for SQL analytics at scale.

  • Object & Lake Storage

    S3-class storage for raw files, media, archives, and lakes.

  • Lakehouse

    Delta, Iceberg — warehouse speed on lake storage.

  • Specialty Stores

    Time-series, graph, document, search, vector — fit-for-purpose.

  • Backup & Disaster Recovery

    Archival, retention, recovery — when the worst happens.

Foundation

Compute & Runtime

Where data work runs.

  • SQL Engines

    Postgres, MySQL, SQL Server, Oracle — the relational workhorses. Most data work actually runs here.

  • Distributed Query

    Trino, Spark, BigQuery, Snowflake compute: MPP and cluster-scale SQL, with DuckDB for single-node, in-process analytics.

  • Serverless Compute

    Lambdas, Cloud Functions, Cloud Run — pay-per-invocation event compute.

  • Stream Compute

    Flink, Kafka Streams, Spark Streaming — processing continuous data.

  • GPU & ML Compute

    Accelerated and special-purpose hardware for training and serving models.

  • General Compute

    VMs, containers, Kubernetes — the runtime substrate that hosts the rest.

The Data Capability Map

Technology & Standards

Platforms · Tools · Vendors · Best Practices · Templates · Build vs Buy