Systematically Exclude Users from Activity Stream
Ensure your activity stream only includes "real" customers by removing fraudulent, test, and internal users automatically.
How it works
You can now use the Is a Remove-List Activity checkbox under the Transformation Processing Config to automatically remove certain users from the activity stream.
Used for:
- Excluding fraud users
- Excluding test and internal users
Steps
- Create an activity transformation with the users you wish to exclude from the activity stream.
- Save
- Go to Processing Config > Advanced Configurations
- Check "Is a Remove-List Activity"
- Validate and Merge the transformation
Example SQL
Remove internal company users
This transformation will remove any customer with an email domain that matches the company email domain
SELECT
DISTINCT
email as activity_id
, '2021-01-01'::TIMESTAMP as ts
, NULL as anonymous_customer_id
, 'internal_users' as activity
, email as customer
, NULL as feature_1
, NULL as feature_2
, NULL as feature_3
, NULL as revenue_impact
, NULL as link
FROM prod_db.users
WHERE lower(email) like '%@mycompany.com'
or lower(email) like '%+test%'
Remove bots from web traffic data
This transformation removes any web visitor that has more than 10 page views and the average time between those page views is less than 1 second (very fast behavior browsing behavior signals a bot)
SELECT
v.anonymous_id as activity_id
, '2021-01-01'::timestamp as ts
, 'robot_web_visitors' as activity
, v.anonymous_id as anonymous_customer_id
, NULL as customer
, NULL as feature_1
, NULL as feature_2
, NULL as feature_3
, NULL as revenue_impact
, NULL as link
FROM (
SELECT
p.message_id
, p.anonymous_id
, p.timestamp
, LAG(p.timestamp) over
(PARTITION BY p.anonymous_id ORDER BY p.timestamp ASC)
as last_pageview_ts
FROM segment.pages p
) v
WHERE true
and date_diff('minute', v.last_pageview_ts, v.timestamp) < 30 -- only look at pages within the same session
and v.last_pageview_ts is not null
GROUP BY 1,2,3,4
HAVING
avg(date_diff('second', v.last_pageview_ts, v.timestamp)) <= 1 -- views are more then 1 sec apart
and COUNT(1) > 10 -- more than 10 pages views
FAQ:
-
What should I name the activity?
Choose an activity that is unique from the other activities in the activity stream, but other than that you can name it anything you want. This activity won't show up in the dataset index or in dataset, so other Narrator users won't see it unless they are looking at the transformation. -
Can I remove a customer via their anonymous ID?
Yes, all you need to do is make sure that anonymous_customer_id is available in the transformation and the processing will remove any records from the activity stream that match this value.
We'd love to hear what you think
User feedback helps Narrator get better. ❤️ Email us @ [email protected]