Refresh stale datasets by clearing the cached query results

Sometimes an analysis will require quick updates to the underlying data.

When you're modifying activity logic and re-running a dataset, you need to see the most up-to-date data rather than results saved from an earlier run. Narrator now lets you clear the warehouse cache when running a dataset.

How to clear the cached query results

  1. Open the advanced menu by clicking the (...) button at the top left of any dataset
  2. Select Clear Cache to clear the cache and re-run the dataset

FAQ: What are cached query results?

Many warehouses temporarily save the results of any recently run query, so that when you (or anyone else) run the exact same query again, you don't need to wait for the data to be re-processed. The results appear almost instantly.

Each warehouse configures query result caching differently. As one example, Google BigQuery's documentation describes the following caching process:

All query results, including both interactive and batch queries, are cached in temporary tables for approximately 24 hours with some exceptions.

When you run a duplicate query, BigQuery attempts to reuse cached results. To retrieve data from the cache, the duplicate query text must be the same as the original query.
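
To make the behavior above concrete, here is a minimal sketch of a result cache keyed by exact query text with a time-to-live. It is a toy model for illustration only, not how BigQuery or Narrator is actually implemented; the names QueryResultCache, run, clear, and fake_warehouse are hypothetical, and the 24-hour default simply mirrors the figure quoted above.

```python
import time


class QueryResultCache:
    """Toy model of a warehouse result cache (hypothetical, for illustration)."""

    def __init__(self, ttl_seconds=24 * 60 * 60):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # exact query text -> (stored_at, rows)

    def run(self, query_text, execute, now=None):
        """Return (rows, from_cache) for the given query text."""
        now = time.time() if now is None else now
        entry = self._entries.get(query_text)
        # Reuse results only if the text matches exactly and the entry is fresh.
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1], True
        rows = execute(query_text)
        self._entries[query_text] = (now, rows)
        return rows, False

    def clear(self):
        """Drop every cached result so the next run re-executes the query."""
        self._entries.clear()


# Usage: a stand-in "warehouse" that records how often it really runs a query.
calls = []


def fake_warehouse(query_text):
    calls.append(query_text)
    return [("row", len(calls))]


cache = QueryResultCache()
first = cache.run("SELECT 1", fake_warehouse, now=0)       # executed
second = cache.run("SELECT 1", fake_warehouse, now=60)     # served from cache
different = cache.run("select 1", fake_warehouse, now=60)  # text differs, executed
cache.clear()
after_clear = cache.run("SELECT 1", fake_warehouse, now=120)  # executed again
```

In this sketch, the second run returns the saved rows without touching the warehouse, a query whose text differs even in letter case is treated as new, and clearing the cache forces the original query to execute again. That last step is the effect the Clear Cache option is meant to have on a dataset.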

FAQ: When should I clear the cache in Narrator?

  1. You recently modified the activity transformation logic and want to force a refresh of your dataset's results. Note: this should happen automatically for most transformation logic updates.
  2. You suspect that the underlying data has recently changed and those changes are not reflected in your query results.

FAQ: How can I control the caching settings?

Caching is controlled by your warehouse. To change the caching settings, update your warehouse's configuration.

Happy querying 🤓

