Narrator Documentation

What is Narrator?

Narrator is a fully managed data system that transforms the data in your warehouse into a single standardized table called the Activity Stream. That table becomes the source for every dependent dataset in your data system (materialized views, views, datasets for analysis, etc.). Because every dataset is built from the same source of truth, numbers always match, dependencies stay easy to manage, and BI table updates become simple.

What is an activity?

Activities represent the important actions your customers take (Completed Order, Viewed Page, etc.). Everything in Narrator is built from activities. Each activity has a customer identifier (typically an email address), a timestamp of when it occurred, and a set of feature columns specific to that activity.
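As a rough illustration of that shape, an activity can be modeled as a small record. This is only a sketch: the class name `Activity` and the field names `activity`, `customer`, `ts`, and `features` are assumptions made for the example, not Narrator's actual column names.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class Activity:
    """One customer action. Field names are illustrative, not Narrator's schema."""

    activity: str   # which action happened, e.g. "completed_order"
    customer: str   # customer identifier, typically an email address
    ts: datetime    # when the action occurred
    features: Dict[str, Any] = field(default_factory=dict)  # activity-specific columns


order = Activity(
    activity="completed_order",
    customer="ada@example.com",
    ts=datetime(2021, 3, 1, 14, 30),
    features={"order_total": 42.50, "items": 3},
)
page_view = Activity(
    activity="viewed_page",
    customer="ada@example.com",
    ts=datetime(2021, 3, 1, 14, 5),
    features={"path": "/pricing"},
)
```

Both records share the same three core fields; only the contents of `features` differ between activity types.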

What is the Activity Stream?

The Activity Stream is an 11-column, time-series table containing all of your customer activities. As customers engage with your company (purchase products, call support, churn, visit the website), more activities are added to the Activity Stream. You can think of the Activity Stream as a timeline of every interaction that every customer has had with your company.

Narrator uses activity transformations (simple SQL queries that you define) to add activities to the Activity Stream. Each activity transformation becomes the single definition of that customer concept (activity) across your company.
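To make the idea concrete, here is a minimal sketch of what an activity transformation might look like, run against an in-memory SQLite database. Everything named here is hypothetical: the `orders` source table, the reduced `activity_stream` table (a handful of columns, not the real 11-column layout), and the column names `activity_id`, `ts`, `customer`, `activity`, and `feature_1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table from a shop system.
    CREATE TABLE orders (id INTEGER, email TEXT, created_at TEXT, total REAL);
    INSERT INTO orders VALUES
        (1, 'Ada@Example.com',  '2021-03-01 14:30:00', 42.5),
        (2, 'alan@example.com', '2021-03-02 09:15:00', 18.0);

    -- Simplified stand-in for the Activity Stream (not the real 11 columns).
    CREATE TABLE activity_stream (
        activity_id TEXT, ts TEXT, customer TEXT, activity TEXT, feature_1 TEXT
    );
""")

# The "activity transformation": a plain SELECT that maps source rows
# onto the standardized columns.
COMPLETED_ORDER_SQL = """
    SELECT
        'order-' || id      AS activity_id,
        created_at          AS ts,
        lower(email)        AS customer,
        'completed_order'   AS activity,
        CAST(total AS TEXT) AS feature_1
    FROM orders
"""

# Append the transformed rows to the stream.
conn.execute(f"INSERT INTO activity_stream {COMPLETED_ORDER_SQL}")

stream_rows = conn.execute(
    "SELECT customer, activity, ts FROM activity_stream ORDER BY ts"
).fetchall()
```

The point of the sketch is that the definition of "completed order" lives in one query; changing that query changes the activity everywhere it is used.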

Building datasets with an Activity Stream

The Activity Stream can be reassembled into any table you need for reporting, data science, or analysis. Narrator is powerful because any dataset can be created from this single data model. To make this possible, Narrator uses a new approach to joins called Relationships.

Because the Activity Stream structure is standardized, the Narrator platform can generate any query automatically. The Dataset tool was built to handle the complex SQL logic required to query the Activity Stream (e.g., self joins, relational joins), so your queries are accurate, optimized for performance, and easy to read.

Check out the dataset question bank for examples of the types of data tables that can be generated using the Activity Stream.
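The sketch below shows the kind of self join involved when a dataset is assembled from the stream by hand: one row per customer's first order, plus the number of page views that preceded it. It is a hand-written approximation under assumed names (a three-column `activity_stream` table and the activity names `viewed_page` and `completed_order`), not the SQL that the Dataset tool generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical stream: timestamp, customer, activity name.
conn.execute("CREATE TABLE activity_stream (ts TEXT, customer TEXT, activity TEXT)")
conn.executemany(
    "INSERT INTO activity_stream VALUES (?, ?, ?)",
    [
        ("2021-03-01 14:00:00", "ada@example.com", "viewed_page"),
        ("2021-03-01 14:05:00", "ada@example.com", "viewed_page"),
        ("2021-03-01 14:30:00", "ada@example.com", "completed_order"),
        ("2021-03-01 15:00:00", "ada@example.com", "viewed_page"),
        ("2021-03-02 09:00:00", "alan@example.com", "viewed_page"),
    ],
)

DATASET_SQL = """
    WITH first_order AS (
        -- Primary activity: each customer's first completed order.
        SELECT customer, MIN(ts) AS first_order_at
        FROM activity_stream
        WHERE activity = 'completed_order'
        GROUP BY customer
    )
    SELECT
        f.customer,
        f.first_order_at,
        COUNT(v.ts) AS page_views_before_order
    FROM first_order AS f
    -- Self join: the stream is joined back to itself on customer and time.
    LEFT JOIN activity_stream AS v
        ON  v.customer = f.customer
        AND v.activity = 'viewed_page'
        AND v.ts < f.first_order_at
    GROUP BY f.customer, f.first_order_at
    ORDER BY f.customer
"""

dataset = conn.execute(DATASET_SQL).fetchall()
```

Customers with no order (here `alan@example.com`) do not appear in the result, and the page view after the order is not counted.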

Why it's so powerful

  • Single source of truth / Universal definitions
    All tables are derived from the same centralized source (the Activity Stream). Once you define a concept (activity), that definition can be used by everyone at your company. No more re-defining what it means to "churn" each time you have a new question related to churn.
    If you make a change to the definition, those changes will automatically cascade to all datasets using it without having to change the data structure.

  • Bridge systems without foreign keys
    Foreign keys don't always exist, so we built a model that joins on customer, time, and occurrence. By relating activities by customer and time, you can always bridge systems without relying on a foreign key (e.g., what was the last webpage someone visited before they submitted a support ticket, but only if it happened within 30 minutes?).

  • It's fast!
    The Activity Stream is a long columnar table. This is what warehouses were built for.

  • One layer of dependency
    One table, one dependency, so one layer of logic to trace the source of any piece of information. All datasets are derived from the same source. No more web of dependencies to manage.

  • Migrations are easy when data sources change
    Because the table structure doesn't change, just the definitions, migrations are easy.

  • A universal data model means analyses can be templatized and shared
    Because your data is in a standardized format, it becomes easy to share and reproduce analytical approaches, so data scientists and analysts can share and iterate on their methods without friction.

  • Queries are standardized and optimized for performance
    The Dataset tool was built to handle complicated queries that you'll need to work with an Activity Stream (self joins, relational joins, window functions, etc). The SQL it generates is easy to read, always accurate, and optimized for performance.
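The "Bridge systems without foreign keys" point can be illustrated with the question it poses: the last page viewed before a support ticket, within 30 minutes. The following is a plain-Python sketch of joining on customer and time only. The tuple layout, the activity names, and the helper `last_before` are all hypothetical, and this is not how Narrator implements Relationships.

```python
from datetime import datetime, timedelta

# (ts, customer, activity, feature) tuples; a simplified stand-in for the stream.
# Page views and tickets come from different systems and share no foreign key.
stream = [
    (datetime(2021, 3, 1, 10, 0), "ada@example.com", "viewed_page", "/pricing"),
    (datetime(2021, 3, 1, 10, 20), "ada@example.com", "viewed_page", "/billing-help"),
    (datetime(2021, 3, 1, 10, 30), "ada@example.com", "submitted_ticket", "T-1"),
    (datetime(2021, 3, 1, 8, 0), "alan@example.com", "viewed_page", "/home"),
    (datetime(2021, 3, 1, 12, 0), "alan@example.com", "submitted_ticket", "T-2"),
]


def last_before(stream, primary, secondary, within):
    """For each `primary` activity, find the same customer's most recent
    `secondary` activity that occurred before it and at most `within` earlier."""
    results = []
    for ts, customer, activity, feature in stream:
        if activity != primary:
            continue
        candidates = [
            (s_ts, s_feature)
            for s_ts, s_customer, s_activity, s_feature in stream
            if s_customer == customer
            and s_activity == secondary
            and ts - within <= s_ts < ts
        ]
        # max() on (timestamp, feature) tuples picks the latest timestamp.
        last_feature = max(candidates)[1] if candidates else None
        results.append((feature, customer, last_feature))
    return results


bridged = last_before(stream, "submitted_ticket", "viewed_page", timedelta(minutes=30))
```

Ticket `T-1` is matched to `/billing-help`, the latest page view inside the 30-minute window; ticket `T-2` gets `None` because the only page view happened four hours earlier.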

What else?

In addition to the fundamentals, Narrator takes care of the data engineering work so your team can focus on exploring the hard questions with data:

  • Identity Resolution - automatically manage anonymous user ID mapping and handle cases when multiple users share a single device
  • Query Editor - query your warehouse directly from the Narrator UI
  • Validation on Transformations - catch errors in your SQL before they're merged to production
  • Automated Alerting - get notified if the core assumptions about your data ever change
  • Narrative Editor - share actionable analyses and keep them up to date

Who is it for?

Data teams! Data engineers, data analysts, data scientists, and BI engineers.

Narrator was built by data folks who have felt the pain caused by traditional data models. We built this platform to help people like us.


Updated 7 months ago

What's Next

Create an example activity and assemble a dataset

Getting Started Tutorial
