Snowflake customers who want to activate on prospects in the ZMP, but do not want to share their PII directly with Zeta, need an easy way to match against those prospects without their data leaving their own Snowflake environment.

The Clean Room environment allows Zeta clients to run Zeta’s secure matching UDF in the client’s Snowflake environment. Listed below are the essential elements of this solution:

  • Waterfall matching logic is encapsulated in the secure UDTF “matcher”, which encodes references to the Encrypted Zeta Universe View and the Encrypted Staging View Share

    • The UDTF joins the views using the masked fields

    • The UDTF generates a table of matched users and other metadata (e.g. Zeta ID, match tier) that can be queried on the client side

  • Client PII data never leaves the client’s Snowflake environment

  • Client doesn’t have direct read access to Zeta Global proprietary identity graph data

  • Client doesn’t have read access to the “matcher” UDF but has execute permission on it

Architecture

Encrypted Staging Table Schema

1. Compile all consumer data to be matched upon into one Snowflake table on AWS US East 1 (N. Virginia).

2. The schema of the client’s encrypted staging table/view that the “matcher” UDF runs on must include all of the following columns, but it is not necessary that every column has a value. All of the columns must be normalized.

  • Required schema:

    • cuid – unique client-specific ID for each row

    • email_md5 – MD5 hash of the normalized raw email address

    • postal_firstname – MD5 hash of the normalized raw postal first name

    • postal_lastname – MD5 hash of the normalized raw postal last name

    • postal_addressline1 – MD5 hash of the normalized raw postal address line 1

    • postal_addressline2 – MD5 hash of the normalized raw postal address line 2

    • postal_city – MD5 hash of the normalized raw postal city

    • postal_state – MD5 hash of the normalized raw postal state

    • postal_zip – MD5 hash of the normalized raw postal zip

    • postal_zipplus4 – MD5 hash of the normalized raw postal zipplus4

  • The following normalization steps are performed before MD5-hashing the column data:

    • Remove whitespace and special characters from the beginning and end of the column value

    • Coalesce null column values to an empty string before appending the random salt seed

    • Lower-case the entire input value (including the appended salt seed)

Please make sure the client performs the same normalization and encryption on their end, as these exact normalization steps are performed on Zeta’s Snowflake account.

Sample encrypted and normalized client view:

create or replace secure view STAGING_SALTED copy grants as
select a.cuid as cuid
,md5(lower(coalesce(trim(a.email_email, ' -/\!@#$%^&*()_-=+'), ''))) as email_md5
,md5(lower(coalesce(trim(a.postal_firstname, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_firstname
,md5(lower(coalesce(trim(a.postal_lastname, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_lastname
,md5(lower(coalesce(trim(a.postal_addressline1, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_addressline1
,md5(lower(coalesce(trim(a.postal_addressline2, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_addressline2
,md5(lower(coalesce(trim(a.postal_city, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_city
,md5(lower(coalesce(trim(a.postal_state, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_state
,md5(lower(coalesce(trim(a.postal_zip, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_zip
,md5(lower(coalesce(trim(a.postal_zipplus4, ' -/\!@#$%^&*()_-=+'), '')||c.salt_seed)) as postal_zipplus4
from CUSTOMER_UNIVERSE a join (select '<random_client_specific_seed>' as salt_seed) c;

The client-specific seed to use in the above step will be provided by Zeta.

Reach out to David Konshuh (dkonshuh@zetaglobal.com), the Clean Room Product Manager, and Rameses Alfonzo (RAlfonzo@zetaglobal.com), DevOps, to coordinate on the client-specific seed.

Setup Instructions (Client-Side)

Part A: Creating a Secure SHARE on the Client’s Snowflake Account

The setup example assumes the following:

  • The secure view STAGING_SALTED was created in the schema US on the database called SALES.

  • The view will be shared with the account zetaglobal.

  • The outbound Share CLEANROOM_SHARE will contain shared objects.

  • The role ACCOUNTADMIN is used, or an alternate role that can create shares on the account and grant privileges on the objects to be shared.

1. Create a secure share on the Snowflake account. This creates the container for the shared objects.

create share CLEANROOM_SHARE;

2. Grant access to the database and schema. The share requires the usage privilege on every database and schema containing shared objects.

grant usage on database SALES to share CLEANROOM_SHARE;

grant usage on schema SALES.US to share CLEANROOM_SHARE;

3. Share the encrypted view.

grant select on view SALES.US.STAGING_SALTED to share CLEANROOM_SHARE;

4. Grant Zeta Global’s Snowflake account access to the share.

alter share CLEANROOM_SHARE add accounts = zetaglobal;
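Before handing off to Zeta, the outbound share can be sanity-checked from the client side. A hedged example, using the object names from the setup above:

```sql
-- Confirm the share exists and list the grants it carries
show shares like 'CLEANROOM_SHARE';
show grants to share CLEANROOM_SHARE;
```

The second statement should list the usage grants on SALES and SALES.US and the select grant on SALES.US.STAGING_SALTED.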

Part B: Create a Database from a Share Imported from Zeta Global

The example assumes the following:

  • Zeta Global will provide a Secure Share named ZETA_CLEANROOM to the Client.

  • The Share includes the database ZETA, the schema CLEANROOM, and the function MATCHER.

  • The Client will create an imported database named ZETA_CLEANROOM.

  • The role ACCOUNTADMIN or an alternate role is used to execute the statements below.

1. List the Shares in the account. An inbound Share named ZETAGLOBAL.ZETA_CLEANROOM should be listed once Zeta Global has completed the setup.

show shares;

2. Create a database from the inbound Share.

create database ZETA_CLEANROOM from share ZETAGLOBAL.ZETA_CLEANROOM;

3. Grant access to other roles as needed. By default, the role ACCOUNTADMIN (or the role that created the database) is granted usage/read privileges on all objects in the database.

grant imported privileges on database ZETA_CLEANROOM to role <target_role>;

4. Verify that the function is available in the imported database.

desc function ZETA_CLEANROOM.CLEANROOM.MATCHER();

5. Test the matcher function. Note that it may be named differently per Zeta client.

create table sales.us.cleanroom_matched as select * from table(ZETA_CLEANROOM.CLEANROOM.MATCHER());

6. Share the resulting table back with Zeta Global for further activation or insights.

grant select on sales.us.cleanroom_matched to share CLEANROOM_SHARE;
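After the matcher run completes, a quick spot check confirms that output was produced. A sketch, assuming the output table name used in the steps above:

```sql
-- Spot-check the matcher output
select cuid, zeta_cid, tier_id, dt, rank
from sales.us.cleanroom_matched
limit 10;
```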

Matcher UDF Execution

Sample matcher UDF execution SQL on the client-side:

create or replace table sales.us.cleanroom_matched as select * from table(ZETA_CLEANROOM.CLEANROOM.MATCHER());

Schema of the matched output that the “matcher” UDF is expected to generate:

cuid (VARCHAR) – unique client-specific ID for each row

postal_firstname (VARCHAR) – MD5 hash of the normalized raw postal first name

postal_lastname (VARCHAR) – MD5 hash of the normalized raw postal last name

postal_addressline1 (VARCHAR) – MD5 hash of the normalized raw postal address line 1

postal_addressline2 (VARCHAR) – MD5 hash of the normalized raw postal address line 2

postal_city (VARCHAR) – MD5 hash of the normalized raw postal city

postal_state (VARCHAR) – MD5 hash of the normalized raw postal state

postal_zip (VARCHAR) – MD5 hash of the normalized raw postal zip

postal_zipplus4 (VARCHAR) – MD5 hash of the normalized raw postal zipplus4

email_md5 (VARCHAR) – MD5 hash of the normalized raw email address

zeta_cid (NUMBER) – Zeta customer ID if there is a match in Zeta’s Identity Graph; NULL otherwise

tier_id (VARCHAR) – match tier:

  1 – match found on full name + address

  2 – match found on email_md5

  3 – match found on last name + address

  4 – match found on address only

  NA – no match found

dt (TIMESTAMP_NTZ) – date the connection between the linkages was established on Zeta’s side; NULL if no match was found

rank (NUMBER) – rank of the connection, derived from recency, for cases where an input user identifier has multiple zeta_cid linked to it

tier_id, dt, and rank are metadata fields for the matching run on the Zeta Identity Graph.
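These metadata fields lend themselves to a couple of common follow-up queries. A hedged sketch, assuming the output table name from the earlier steps and assuming a lower rank value indicates the more recent linkage:

```sql
-- Distribution of matches by tier
select tier_id, count(*) as row_count
from sales.us.cleanroom_matched
group by tier_id
order by tier_id;

-- Keep only the top-ranked zeta_cid linkage per input record
select *
from sales.us.cleanroom_matched
qualify row_number() over (partition by cuid order by rank) = 1;
```

The second query uses Snowflake’s QUALIFY clause to filter on the window function without a subquery.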

Match rate calculation:

select COUNT(distinct zeta_cid)*100/COUNT(*) as match_rate from sales.us.cleanroom_matched;

Snowflake Compute Cost Estimates

Clients will incur Snowflake compute costs to use the cleanroom. Cost estimate assumptions include:

  • Snowflake has a list price of $4/credit.

  • A Large (L) Snowflake warehouse consumes 8 credits/hour.

  • A 500K row input table may require up to 15 minutes to resolve in an L warehouse (as per Zeta’s testing).

Therefore, ~1M input records may take ~30 minutes on an L warehouse. At $4/credit * 8 credits/hour = $32/hour, that works out to roughly $16 per million input records.

2021 Zeta Clean Room.pdf