Sync and delete catalog data

Step 1: Create a new catalog

Before creating a new Cloud Data Ingestion (CDI) integration for catalogs, you will need to create a new catalog or identify an existing catalog you want to use for the integration. There are a few ways to create a new catalog and any of these will work for the CDI integration:

Upload a CSV
Create a catalog in the Braze dashboard or during CDI setup.
Create a catalog using the Create catalog endpoint

Any changes to the catalog schema (for example, adding new fields or changing field type) must be made through the catalog dashboard before updated data is synced through CDI. We recommend making these updates when the sync is paused or not scheduled to run to avoid conflicts between your data warehouse data and the schema in Braze.

Step 2: Integrate Cloud Data Ingestion with catalog data

The setup for a catalog sync closely follows the process for user-data CDI integrations.

Set up a source table in Snowflake. You can use the names in the following example or choose your own database, schema, and table names. You may also use a view or a materialized view instead of a table.

 CREATE DATABASE BRAZE_CLOUD_PRODUCTION;
 CREATE SCHEMA BRAZE_CLOUD_PRODUCTION.INGESTION;
 CREATE OR REPLACE TABLE BRAZE_CLOUD_PRODUCTION.INGESTION.CATALOGS_SYNC (
      UPDATED_AT TIMESTAMP_NTZ(9) NOT NULL DEFAULT SYSDATE(),
      --ID of the catalog item to be created or updated
      ID VARCHAR(16777216) NOT NULL,
      --Catalog fields and values that should be added or updated
      PAYLOAD VARCHAR(16777216) NOT NULL,
      --The catalog item associated with this ID should be deleted
      DELETED BOOLEAN
 );

Set up a role, warehouse, and user and grant proper permissions. If you already have credentials from an existing sync, you can reuse them, just make sure to extend access to the catalog source table.

 CREATE ROLE BRAZE_INGESTION_ROLE;

 GRANT USAGE ON DATABASE BRAZE_CLOUD_PRODUCTION TO ROLE BRAZE_INGESTION_ROLE;
 GRANT USAGE ON SCHEMA BRAZE_CLOUD_PRODUCTION.INGESTION TO ROLE BRAZE_INGESTION_ROLE;
 GRANT SELECT ON TABLE BRAZE_CLOUD_PRODUCTION.INGESTION.CATALOGS_SYNC TO ROLE BRAZE_INGESTION_ROLE;

 CREATE WAREHOUSE BRAZE_INGESTION_WAREHOUSE;
 GRANT USAGE ON WAREHOUSE BRAZE_INGESTION_WAREHOUSE TO ROLE BRAZE_INGESTION_ROLE;

 CREATE USER BRAZE_INGESTION_USER;
 GRANT ROLE BRAZE_INGESTION_ROLE TO USER BRAZE_INGESTION_USER;

If your Snowflake account has network policies, allowlist the Braze IPs so the CDI service can connect. For a list of IPs, see the Cloud Data Ingestion.
In the Braze dashboard, navigate to Technology Partners > Snowflake, and create a new sync.
Enter connection details (or reuse existing credentials) and the source table.
Proceed to step 2 of the setup flow, select the “Catalogs” sync type, and input the integration name and schedule. Note that the name of the integration should exactly match the name of the catalog you previously created.
Choose a sync frequency and proceed to the next step.
Add the public key displayed on the dashboard to the user you created for Braze to connect to Snowflake. To complete this step, you will need someone with SECURITYADMIN access or higher in Snowflake.
Click Test Connection to ensure everything works as expected.
Save the sync, and use the synced catalog data for all your personalization use cases.

Set up a source table in Redshift. You can use the names in the following example or choose your own database, schema, and table names. You may also use a view or a materialized view instead of a table.

 CREATE DATABASE BRAZE_CLOUD_PRODUCTION;
 CREATE SCHEMA BRAZE_CLOUD_PRODUCTION.INGESTION;
 CREATE TABLE BRAZE_CLOUD_PRODUCTION.INGESTION.CATALOGS_SYNC (
    updated_at timestamptz default sysdate not null,
    --ID of the catalog item to be created or updated
    id varchar not null,
    --Catalog fields and values that should be added or updated
    payload varchar(max),
    --The catalog item associated with this ID should be deleted
    deleted boolean
 )

Set up a user and grant proper permissions. If you already have credentials from an existing sync, you can reuse them, just make sure to extend access to the catalog source table.

 CREATE USER braze_user PASSWORD '{password}';
 GRANT USAGE ON SCHEMA BRAZE_CLOUD_PRODUCTION.INGESTION to braze_user;
 GRANT SELECT ON TABLE CATALOGS_SYNC TO braze_user;

If you have a firewall or other network policies, you must give Braze network access to your Redshift instance. Allow access from the below IPs corresponding to your Braze dashboard’s region. For a list of IPs, see the Cloud Data Ingestion.

Optionally, set up a new project or dataset to hold your source table.

CREATE SCHEMA BRAZE-CLOUD-PRODUCTION.INGESTION;

Create one or more tables to use for your CDI integration with the following fields:

CREATE TABLE `BRAZE-CLOUD-PRODUCTION.INGESTION.CATALOGS_SYNC`
(
  updated_at TIMESTAMP DEFAULT current_timestamp,
  id STRING,
  payload JSON,
  deleted BOOLEAN
);

FIELD NAME	TYPE	MODE
UPDATED_AT	TIMESTAMP	REQUIRED
PAYLOAD	JSON	REQUIRED
ID	STRING	REQUIRED
DELETED	BOOLEAN	OPTIONAL

Set up a user and grant proper permissions. If you already have credentials from an existing sync, you can reuse those—just make sure to extend access to the catalog source table. The service account should have the below permissions:
- BigQuery Connection User: This will allow Braze to make connections.
- BigQuery User: This will provide Braze access to run queries, read dataset metadata, and list tables.
- BigQuery Data Viewer: This will provide Braze access to view datasets and their contents.
- BigQuery Job User: This will provide Braze access to run jobs
  
  After creating the service account and granting permissions, generate a JSON key. Refer to Keys create and delete for more information. You’ll update this to the Braze dashboard later.

If you have network policies in place, you must give Braze network access to your BigQuery instance. For a list of IPs, see the Cloud Data Ingestion.

Set up a source table in Databricks. You can use the names in the following example or choose your catalog, schema, and table names. You can also use a view or a materialized view instead of a table.

CREATE SCHEMA BRAZE-CLOUD-PRODUCTION.INGESTION;

CREATE TABLE `BRAZE-CLOUD-PRODUCTION.INGESTION.CATALOGS_SYNC`
(
  updated_at TIMESTAMP DEFAULT current_timestamp(),
  id STRING,
  deleted BOOLEAN,
  payload STRING
);

FIELD NAME	TYPE	MODE
UPDATED_AT	TIMESTAMP	REQUIRED
PAYLOAD	JSON	REQUIRED
ID	STRING	REQUIRED
DELETED	BOOLEAN	NULLABLE

Create a personal access token in your Databricks workspace.

a. Select your Databricks username, then select User Settings from the dropdown menu.
b. On the Access tokens tab, select Generate new token.
c. Enter a comment that helps you to identify this token, such as “Braze CDI”.
d. Change the token’s lifetime to no lifetime by leaving the Lifetime (days) box blank. Select Generate.
e. Copy the displayed token, and then select Done.
f. Keep the token in a safe place until you need to enter it during the credential creation step in the Braze dashboard.

If you have network policies in place, you must give Braze network access to your Databricks instance. For a list of IPs, see the Cloud Data Ingestion page.

How the integration works

Each time the sync runs, Braze will pull in all rows where UPDATED_AT is after the last timestamp synced. We recommend creating a view in your data warehouse from your catalog data to set up a source table that will fully refresh each time a sync runs. With views, you won’t need to rewrite the query each time.

For example, if you have a table of product data (product_catalog_1) with product_id and three additional attributes, you could sync the below view:

CREATE VIEW BRAZE_CLOUD_PRODUCTION.INGESTION.CATALOGS_SYNC AS 
SELECT
    CURRENT_TIMESTAMP as UPDATED_AT,
    product_id as id,
    TO_JSON(
        OBJECT_CONSTRUCT (
            'attribute_1',
            attribute_1,
            'attribute_2',
            attribute_2,
            'attribute_3',
            attribute_3)
    )as PAYLOAD FROM "product_catalog_1";

CREATE TABLE BRAZE_CLOUD_PRODUCTION.INGESTION.CATALOGS_SYNC AS
SELECT
    CURRENT_TIMESTAMP as UPDATED_AT,
    Product_id as id,
    JSON_SERIALIZE(
        OBJECT (
            'attribute_1',
            attribute_1,
            'attribute_2',
            attribute_2,
            'attribute_3',
            attribute_3)
    ) as PAYLOAD FROM "product_catalog_1";

CREATE view IF NOT EXISTS BRAZE_CLOUD_PRODUCTION.INGESTION.CATALOGS_SYNC AS (SELECT
    last_updated as UPDATED_AT,
    product_id as ID,
    TO_JSON(
      STRUCT(
      attribute_1,
      attribute_2,
      attribute_3,
      )
    ) as PAYLOAD 
  FROM `BRAZE_CLOUD_PRODUCTION.INGESTION.product_catalog_1`);

CREATE view IF NOT EXISTS BRAZE_CLOUD_PRODUCTION.INGESTION.CATALOGS_SYNC AS (SELECT
    last_updated as UPDATED_AT,
    product_id as ID,
    TO_JSON(
      STRUCT(
      attribute_1,
      attribute_2,
      attribute_3,
      )
    ) as PAYLOAD 
  FROM `BRAZE_CLOUD_PRODUCTION.INGESTION.product_catalog_1`);

The data fetched from the integration will be used to create or update items in the target catalog based on the id provided.
If DELETED is set to true, the corresponding catalog item will be deleted.
The sync won’t consume data points, but all data synced will count toward your total catalog usage; this usage is measured based on the total data stored, so you don’t need to worry about only syncing changed data.

Edit this page on GitHub

HOW HELPFUL WAS THIS PAGE?

Edit this page on GitHub