Event Delivery Semantics
This content applies only to the flat file event data we send to Data Warehouse partners (Google Cloud Storage, Amazon S3, and Microsoft Azure Blob Storage). For content that applies to other partners, see their respective pages.
Currents for Data Storage is a continuous stream of data from our platform to a storage bucket on one of our data warehouse partner connections. Currents writes Avro files to your storage bucket at regular thresholds, allowing you to process and analyze the event data using your own Business Intelligence toolset.
As a high-throughput system, Currents guarantees “at-least-once” delivery of events, meaning that duplicate events can occasionally be written to your storage bucket. This can happen when events are reprocessed from our queue for any reason.
If your use cases require exactly-once delivery, you can use the unique identifier field that is sent with every event (id) to deduplicate events. Since the file leaves our control once it’s written to your storage bucket, we have no way to guarantee deduplication from our end.
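As a minimal sketch of the deduplication described above, assume the events have already been decoded from the Avro files into Python dictionaries. The decoding step, the function name, and the sample records are illustrative assumptions, not part of Currents. Only the id field comes from this page.

```python
from typing import Iterable, Iterator


def dedupe_events(events: Iterable[dict]) -> Iterator[dict]:
    """Yield each event once, keeping the first occurrence of every `id`."""
    seen_ids = set()
    for event in events:
        event_id = event["id"]
        if event_id in seen_ids:
            # Duplicate produced by at-least-once delivery: skip it.
            continue
        seen_ids.add(event_id)
        yield event


# Hypothetical sample records; real events carry many more fields.
sample = [
    {"id": "a", "payload": 1},
    {"id": "b", "payload": 2},
    {"id": "a", "payload": 1},  # redelivered copy of the first event
]
unique = list(dedupe_events(sample))
```

An in-memory set is only practical for modest volumes. For larger datasets, the same idea is usually applied inside the warehouse by keeping one row per id.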
All timestamps exported by Currents are in UTC. For some events, where it is available, a timezone field is also included. It contains the user’s local timezone at the time of the event in IANA format (for example, America/New_York).
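The sketch below shows one way to combine a UTC timestamp with the optional timezone field to recover the user's local time. It assumes the timestamp is stored as Unix epoch seconds, which this page does not state, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def to_local_time(epoch_seconds: int, tz_name: Optional[str]) -> datetime:
    """Convert a UTC epoch timestamp to the user's local time.

    Falls back to UTC when the event has no timezone value.
    """
    utc_time = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    if not tz_name:
        return utc_time
    return utc_time.astimezone(ZoneInfo(tz_name))


# Assumption: 1700000000 is an epoch-seconds timestamp (2023-11-14T22:13:20Z).
local = to_local_time(1700000000, "America/New_York")
print(local.isoformat())  # 2023-11-14T17:13:20-05:00
```

On platforms without a system timezone database, zoneinfo needs the tzdata package to resolve IANA names.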
The Braze Currents data storage integrations output data in the .avro format. We chose Avro because it is a flexible data format that natively supports schema evolution and is supported by a wide variety of data products:
- Avro is supported by nearly every major data warehouse.
- If you choose to leave your data in S3, Avro compresses better than CSV and JSON, so you pay less for storage and may use less CPU to parse the data.
- Avro requires schemas when data is written or read. Schemas can evolve over time to handle the addition of fields without breaking existing readers.
Currents will create a file for each event type using the format below:
||The prefix set for this Currents integration.|
||For internal use by Braze. Will be a string such as “prod-01”, “prod-02”, “prod-03”, or “prod-04”. All files will have the same cluster identifier.|
||The identifier for the type of connection. Options are “S3”, “AzureBlob”, or “GCS”.|
||The unique ID for this Currents integration.|
||The type of the event in the file (see event list below).|
||The hour that events are queued in our system for processing. Formatted YYYY-MM-DD-HH.|
||Used to version
||For internal use by Braze. Single letter.|
||For internal use by Braze. Integer.|
||For internal use by Braze. Integer.|
File naming conventions may change in the future, so Braze recommends searching all keys in your bucket that have a prefix of <your-bucket-prefix>.
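A minimal sketch of that prefix search, assuming the bucket's object keys have already been listed into Python strings. The listing call depends on your storage provider and is not shown, and the function name and sample keys are hypothetical. The filter relies only on the prefix, not on the rest of the file name.

```python
from typing import Iterable, List


def currents_keys(all_keys: Iterable[str], bucket_prefix: str) -> List[str]:
    """Return the object keys stored under the Currents bucket prefix."""
    # Normalize to a trailing slash so "currents" does not match "currents-old/".
    prefix = bucket_prefix.rstrip("/") + "/"
    return sorted(key for key in all_keys if key.startswith(prefix))


# Hypothetical keys for illustration only; they do not reflect real file names.
keys = [
    "currents/example/file-1.avro",
    "currents-old/example/file-2.avro",
    "reports/summary.csv",
]
matched = currents_keys(keys, "currents")
```

Matching on the prefix alone keeps the lookup working if the rest of the key layout changes.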
Data files will be written to your storage bucket at set thresholds:
|Amazon S3||Every 5 minutes, 15,000 events, or on the hour.|
|Microsoft Azure Blob Storage||Every 5 minutes, 5,000 events, or on the hour.|
|Google Cloud Storage||Every 5 minutes, 5,000 events, or on the hour.|
Currents will never write empty files.
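To reason about how often files are likely to appear, the thresholds above can be expressed as a small predicate. This is only an illustration of the stated rule, not how Currents is implemented. The function name, its parameters, and the idea of a batch "opened at" time are assumptions.

```python
from datetime import datetime, timedelta, timezone

MAX_FILE_AGE = timedelta(minutes=5)  # "every 5 minutes" from the table above


def should_write_file(event_count: int, opened_at: datetime,
                      now: datetime, max_events: int) -> bool:
    """Return True when any documented threshold is reached for a batch."""
    if event_count == 0:
        return False  # empty files are never written
    # True when the top of an hour has passed since the batch was opened.
    crossed_hour = now.replace(minute=0, second=0, microsecond=0) > opened_at
    return (
        event_count >= max_events
        or now - opened_at >= MAX_FILE_AGE
        or crossed_hour
    )


# Example: a batch opened at 10:58 UTC with 10 events, checked at 11:00 UTC.
opened = datetime(2024, 1, 1, 10, 58, tzinfo=timezone.utc)
print(should_write_file(10, opened, opened + timedelta(minutes=2), 5000))  # True
```

Pass 15,000 as max_events for Amazon S3 and 5,000 for the other two partners, matching the table.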
Avro Schema Changes
From time to time, Braze may make changes to the Avro schema when fields are added, changed, or removed. For our purposes here, there are two types of changes: breaking and non-breaking. In all cases, the <schema-id> will be advanced to indicate the schema was updated.
When a field is added to the Avro schema, we consider this a non-breaking change. Added fields will always be “optional” Avro fields (i.e. with a default value of null), so they will “match” older schemas according to the Avro schema resolution spec. These additions should have no effect on existing ETL processes as the field will simply be ignored until it is added to your ETL process. We recommend that your ETL setup is explicit about the fields it processes to avoid breaking the flow when new fields are added.
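One way to keep an ETL step explicit about its fields is to project each decoded record onto a fixed list, as in the sketch below. The records are assumed to be Python dictionaries already decoded from Avro. Of the field names used, id and timezone are mentioned on this page, while time and new_field are hypothetical placeholders.

```python
from typing import Any, Dict, Tuple

# Fields this ETL step knows how to handle. `time` is a hypothetical name.
EXPECTED_FIELDS: Tuple[str, ...] = ("id", "time", "timezone")


def project_record(record: Dict[str, Any],
                   fields: Tuple[str, ...] = EXPECTED_FIELDS) -> Dict[str, Any]:
    """Keep only the expected fields; absent ones become None (Avro null)."""
    return {name: record.get(name) for name in fields}


# A record written with a newer schema that added `new_field` (hypothetical).
newer = {"id": "a", "time": 1700000000, "new_field": "ignored for now"}
row = project_record(newer)
```

With this approach a newly added field is dropped until it is listed in EXPECTED_FIELDS, so non-breaking schema additions do not change the shape of downstream rows.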
While we will strive to give advance warning in the case of all changes, we may include non-breaking changes to the schema at any time.
When a field is removed from or changed in the Avro schema, we consider this a breaking change. Breaking changes may require modifications to existing ETL processes as fields that were in use may no longer be recorded as expected.
All breaking changes to the schema will be communicated in advance of the change.