Skip to content

Braze Cloud Data Ingestion overview

Braze Cloud Data Ingestion allows you to set up a direct connection from your data warehouse to Braze to sync relevant user attributes, events, and purchases. Once synced to Braze, this data can be leveraged for use cases such as personalization or segmentation. Cloud Data Ingestion can connect to Snowflake and Redshift data warehouses.

How it works

With Braze Cloud Data Ingestion, you set up an integration between your data warehouse instance and Braze workspace to sync data on a recurring basis. This sync runs on a schedule you set, and each integration can have a different schedule. Syncs can run as frequently as every 15 minutes or as infrequently as once per month. For customers who need syncs to occur more frequently than 15 minutes, please speak with your customer success manager, or consider using REST API calls for real-time data ingestion.

When a sync runs, Braze will directly connect to your data warehouse instance, retrieve all new data from the specified table, and update the corresponding user profiles on your Braze dashboard. Each time the sync runs, any updated data will be reflected on the user profiles.

Supported data types

Sync user attributes, custom events, and purchases through Cloud Data Ingestion. Data for a user can be updated by external ID, user alias, or Braze ID. CDI can support nested custom attributes, arrays of objects, and can be used to updated subscription statuses.

What gets synced

Each time a sync runs, Braze looks for rows that have not previously been synced. We check this using the UPDATED_AT column in your table or view. Any rows where UPDATED_AT is later than the last synced row will be selected and pulled into Braze.

In your data warehouse, you add the following users and attributes to your table, setting the UPDATED_AT time to the time you add this data:

UPDATED_AT EXTERNAL_ID PAYLOAD
2022-07-19 09:07:23 customer_1234 {
    “attribute_1”:”abcdefg”,
    “attribute_2”: {
        “attribute_a”:”example_value_2”,
        “attribute_b”:”example_value_2”
    },
    “attribute_3”:”2019-07-16T19:20:30+1:00”
}
2022-07-19 09:07:23 customer_3456 {
    “attribute_1”:”abcdefg”,
    “attribute_2”:42,
    “attribute_3”:”2019-07-16T19:20:30+1:00”,
    “attribute_5”:”testing”
}
2022-07-19 09:07:23 customer_5678 {
    “attribute_1”:”abcdefg”,
    “attribute_4”:true,
    “attribute_5”:”testing_123”
}

During the next scheduled sync, all rows with a UPDATED_AT timestamp later than the most recent timestamp will be synced to the Braze user profiles. Fields will be updated or added, so you do not need to sync the full user profile each time. After the sync, users will reflect the new updates:

1
2
3
4
5
6
7
8
9
10
11
12
{
  "external_id":"customer_1234",
  "email":"[email protected]",
  "attribute_1":"abcdefg",
  "attribute_2":{
        "attribute_a":"example_value_1",
        "attribute_b":"example_value_2"
    },
  "attribute_3":"2019-07-16T19:20:30+1:00",
  "attribute_4":false,
  "attribute_5":"testing"
}
1
2
3
4
5
6
7
8
9
{
  "external_id":"customer_3456",
  "email":"[email protected]",
  "attribute_1":"abcdefg",
  "attribute_2":42,
  "attribute_3":"2019-07-16T19:20:30+1:00",
  "attribute_4":true,
  "attribute_5":"testing"
}
1
2
3
4
5
6
7
8
9
{
  "external_id":"customer_5678",
  "email":"[email protected]",
  "attribute_1":"abcdefg",
  "attribute_2":42,
  "attribute_3":"2017-08-10T09:20:30+1:00",
  "attribute_4":true,
  "attribute_5":"testing_123"
}

Data point usage

Each attribute sent for a user will consume one data point. It’s up to you to only send the required data. Data point tracking for Cloud Data Ingestion is equivalent to tracking through the /users/track endpoint. Refer to Data points for more information.

Data setup recommendations

Only write new or updated attributes to minimize consumption

We will sync all attributes in a given row, regardless of whether they are the same as what’s currently on the user profile. Given that, we recommend only syncing attributes you want to add or update.

Use a UTC timestamp for the UPDATED_AT column

The UPDATED_AT column should be in UTC to prevent issues with daylight savings time. Prefer UTC-only functions, such as SYSDATE() instead of CURRENT_DATE() whenever possible.

Separate EXTERNAL_ID from PAYLOAD column

The PAYLOAD object should not include an external id or other id type.

Removing an attribute

You can set it to’ null’ if you want to completely remove an attribute from a user’s profile. If you want an attribute to remain unchanged, don’t send it to Braze until it’s been updated.

Create JSON string from another table

If you prefer to store each attribute in its own column internally, you need to convert those columns to a JSON string to populate the sync with Braze. To do that, you can use a query like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
CREATE TABLE "EXAMPLE_USER_DATA"
    (attribute_1 string,
     attribute_2 string,
     attribute_3 number,
     my_user_id string);

SELECT
    CURRENT_TIMESTAMP as UPDATED_AT,
    my_user_id as EXTERNAL_ID,
    TO_JSON(
        OBJECT_CONSTRUCT (
            'attribute_1',
            attribute_1,
            'attribute_2',
            attribute_2,
            'yet_another_attribute',
            attribute_3)
    )as PAYLOAD FROM "EXAMPLE_USER_DATA";
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
CREATE TABLE "EXAMPLE_USER_DATA"
    (attribute_1 string,
     attribute_2 string,
     attribute_3 number,
     my_user_id string);

SELECT
    CURRENT_TIMESTAMP as UPDATED_AT,
    my_user_id as EXTERNAL_ID,
    JSON_SERIALIZE(
        OBJECT (
            'attribute_1',
            attribute_1,
            'attribute_2',
            attribute_2,
            'yet_another_attribute',
            attribute_3)
    ) as PAYLOAD FROM "EXAMPLE_USER_DATA";
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
CREATE OR REPLACE TABLE BRAZE.EXAMPLE_USER_DATA (attribute_1 string,
     attribute_2 STRING,
     attribute_3 NUMERIC,
     my_user_id STRING);

SELECT
    CURRENT_TIMESTAMP as UPDATED_AT,
    my_user_id as EXTERNAL_ID,
    TO_JSON(
      STRUCT(
        'attribute_1' AS attribute_1,
        'attribute_2'AS attribute_2,
        'yet_another_attribute'AS attribute_3
      )
    ) as PAYLOAD 
  FROM BRAZE.EXAMPLE_USER_DATA;

Using the UPDATED_AT timestamp

We use the UPDATED_AT timestamp to track what data has been synced successfully to Braze. If many rows are written with the same timestamp while a sync is running, this may lead to duplicate data being synced to Braze. Some suggestions to avoid duplicate data:

  • If you are setting up a sync against a VIEW, do not use CURRENT_TIMESTAMP as the default value. This will cause all data to sync every time the sync runs because the UPDATED_AT field will evaluate to the time our queries are run.
  • If you have very long-running pipelines or queries writing data to your source table, avoid running these concurrently with a sync, or avoid using the same timestamp for every row inserted.
  • Use a transaction to write all rows that have the same timestamp.

Example table configuration

We have a public GitHub repository for customers to share best practices or code snippets. To contribute your own snippets, create a pull request!

Sample data formatting

Any operations that are possible through the Braze /users/track endpoint are supported through Cloud Data Ingestion, including updating nested custom attributes, adding subscription status, and syncing custom events or purcahses.

You may include nested custom attributes in the payload column for a custom attributes sync.

1
2
3
4
5
6
7
8
9
10
11
12
{
      "most_played_song": {
        "song_name": "Solea",
        "artist_name": "Miles Davis",
        "album_name": "Sketches of Spain",
        "genre": "Jazz",
        "play_analytics": {
            "count": 1000,
            "top_10_listeners": true
        }
      }
}

To sync events, an event name and timestamp, as a string in ISO 8601 or in yyyy-MM-dd'T'HH:mm:ss:SSSZ format, are required. Other fields including app_id and properties are optional.

1
2
3
4
5
6
7
8
9
{
    "app_id" : "your-app-id",
    "name" : "rented_movie",
    "time" : "2013-07-16T19:20:45+01:00",
    "properties": {
        "movie": "The Sad Egg",
        "director": "Dan Alexander"
    }
} 

To sync purchase events, event name, product_id, currency, price, and timestamp (as a string in ISO 8601 or in yyyy-MM-dd'T'HH:mm:ss:SSSZ format) are required. Other fields, including app_id, quantity and properties are optional.

1
2
3
4
5
6
7
8
9
10
11
12
{
    "app_id" : "11ae5b4b-2445-4440-a04f-bf537764c9ad",
    "product_id" : "Completed Order",
    "currency" : "USD",
    "price" : 219.98,
    "time" : "2013-07-16T19:20:30+01:00",
    "properties" : {
        "products" : [ { "name": "Monitor", "category": "Gaming", "product_amount": 19.99, },
        { "name": "Gaming Keyboard", "category": "Gaming ", "product_amount": 199.99, }
        ]
    }
}

Product limitations

Limitations Description
Number of integrations There is no limit on how many integrations you can set up. However, you will only be able to set up one integration per table or view.
Number of rows There is no limit on the number of rows you can sync. Each row will only be synced once, based on the UPDATED column.
Attributes per row Each row should contain a single user ID and a JSON object with up to 50 attributes. Each key in the JSON object counts as one attribute (i.e., an array counts as one attribute).
Data type You can sync user attributes, events, and purchases through Cloud Data Ingestion.
Braze region This product is available in all Braze regions. Any Braze region can connect to any Snowflake region
Snowflake region You can connect your Snowflake instance in any region or cloud to Braze using this product.



WAS THIS PAGE HELPFUL?
New Stuff!