added

Systematically Exclude Users from Activity Stream

Ensure your activity stream only includes "real" customers by removing fraudulent, test, and internal users automatically.

1326



How it works

You can now use the Is a Remove-List Activity checkbox under the Transformation Processing Config to automatically remove certain users from the activity stream.

Used for:

  • Excluding fraud users
  • Excluding test and internal users



Steps

  1. Create an activity transformation with the users you wish to exclude from the activity stream.
  2. Save
  3. Go to Processing Config > Advanced Configurations
  4. Check "Is a Remove-List Activity"
  5. Validate and Merge the transformation





Example SQL

Remove internal company users

This transformation will remove any customer with an email domain that matches the company email domain

SELECT 
	DISTINCT
	email as activity_id
	, '2021-01-01'::TIMESTAMP as ts
	, NULL as anonymous_customer_id
	, 'internal_users' as activity
	, email as customer
	, NULL as feature_1
	, NULL as feature_2
	, NULL as feature_3 
	, NULL as revenue_impact
	, NULL as link
FROM prod_db.users
WHERE lower(email) like '%@mycompany.com'
    or lower(email) like '%+test%'



Remove bots from web traffic data

This transformation removes any web visitor that has more than 10 page views and the average time between those page views is less than 1 second (very fast behavior browsing behavior signals a bot)

SELECT 
	v.anonymous_id as activity_id
	, '2021-01-01'::timestamp as ts 

	, 'robot_web_visitors' as activity 
	, v.anonymous_id as anonymous_customer_id
	, NULL as customer 
	, NULL as feature_1
	, NULL as feature_2
	, NULL as feature_3
	, NULL  as revenue_impact
	, NULL as link
FROM (
	SELECT 
        p.message_id
        , p.anonymous_id
		, p.timestamp
		, LAG(p.timestamp) over
		 (PARTITION BY p.anonymous_id ORDER BY p.timestamp ASC) 
			as last_pageview_ts
	FROM segment.pages p  
	) v


WHERE true 
	and date_diff('minute', v.last_pageview_ts, v.timestamp) < 30 -- only look at pages within the same session
    and v.last_pageview_ts is not null 

GROUP BY 1,2,3,4

HAVING 
	avg(date_diff('second', v.last_pageview_ts, v.timestamp)) <= 1 -- views are more then 1 sec apart
	and COUNT(1) > 10 -- more than 10 pages views



FAQ:

  • What should I name the activity?
    Choose an activity that is unique from the other activities in the activity stream, but other than that you can name it anything you want. This activity won't show up in the dataset index or in dataset, so other Narrator users won't see it unless they are looking at the transformation.

  • Can I remove a customer via their anonymous ID?
    Yes, all you need to do is make sure that anonymous_customer_id is available in the transformation and the processing will remove any records from the activity stream that match this value.



We'd love to hear what you think

User feedback helps Narrator get better. ❤️ Email us @ [email protected]