Exploring the Technical Side of Ingesting Data into Braze

By Matt D’Abreu and Maciej Olko Mar 29, 2023

Data is central to marketing today, allowing brands to power relevant, highly personalized experiences for their customers at scale on a global level. But while we’ve seen a strong focus on data collection and data privacy over the past five years, there hasn’t always been as much focus on an equally important component of the data equation—namely, data management, which refers to the ways companies control how data moves in and out of the digital systems that provide the foundations for their customer engagement programs.

At Braze, our customer engagement platform is built on a foundation of streaming data. That architecture allows our customers to gain an in-the-moment understanding of how their users engage (or don’t, as the case may be) with their messages, their website, their mobile app, and more. Having that data—and the ability to dig deeper or act on it, as the situation dictates—makes it possible for brands to build better relationships with their audience and hit their business goals.

That’s why we prioritized making it easy to surface data from the Braze platform by building our high-volume data export feature, Braze Currents, and adding support for Snowflake Secure Data Sharing. It’s also why we offer a suite of different data ingestion options for brands looking to push data into Braze to enhance their customer engagement efforts. To provide you with a richer understanding of how to pass information into the Braze platform, let’s look at the four main ways that Braze ingests data and the nuances of each approach.

1. Cloud Data Ingestion

Cloud Data Ingestion (CDI) allows brands to integrate data directly from their data warehouse into the Braze platform for further segmentation, message targeting, and personalization. With CDI, brands can automatically sync data between the platforms on the schedule of their choice: When a sync runs, Braze will connect to your brand’s data warehouse, pulling in new data and updating relevant user profiles within Braze, supporting more relevant, effective customer experiences.

Because the Braze data model is built to support arrays and nested attributes, brands can easily sync both structured and unstructured data from any Braze-supported source (and avoid the headaches that come with building out cumbersome data pipelines). That puts more information into the hands of marketers more quickly, allowing them to better leverage data in personalization tools to support relevant, value-add experiences for consumers across the full range of devices, platforms, and channels.

This feature is designed to be a data solution for the long term—while we currently support syncs from Snowflake and Amazon Redshift, we plan to expand the Cloud Data Ingestion integration to other partners while also adding more updates to data types and platform usability in the future.

This new integration is a pre-built, direct connection that’s designed to be turnkey in nature. Your organization should be able to easily set it up right within the Braze dashboard in just a few simple steps—no complex coding required. The process is as follows:

1. Set up data table parameters in your data warehouse

2. Navigate to the Snowflake or Redshift pages in the Braze dashboard, under the “Technology Partners” section

3. Begin a new import sync between the two systems

4. Provide user authentication data from your data warehouse provider

5. Set a name, data type, and frequency for your integration

6. Run a quick test to ensure that the data in question is all in order

    Once you’ve got that set up, you’re ready to start taking advantage of your data from your brand’s data warehouse to support nuanced user personalization and to inform rich, responsive campaigns across the full range of digital messaging channels (e.g. email, SMS, and mobile and web channels).
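To make the sync model more concrete, here is a minimal Python sketch of how rows in a warehouse table might be picked up by an incremental sync. It is illustrative only: the column names (`updated_at`, `external_id`, `payload`) and the helper `rows_to_sync` are assumptions made for this example, not details given in this article, so check the Braze documentation for the table layout your integration actually expects.

```python
import json
from datetime import datetime, timezone

# Hypothetical rows in a warehouse sync table. The column names are
# illustrative assumptions, not a schema taken from this article.
rows = [
    {
        "updated_at": datetime(2023, 3, 1, tzinfo=timezone.utc),
        "external_id": "user-1",
        # Nested attributes and arrays are stored as a JSON string.
        "payload": json.dumps({
            "loyalty": {"tier": "gold", "points": 1200},
            "favorite_categories": ["shoes", "outerwear"],
        }),
    },
    {
        "updated_at": datetime(2023, 3, 20, tzinfo=timezone.utc),
        "external_id": "user-2",
        "payload": json.dumps({"loyalty": {"tier": "silver", "points": 300}}),
    },
]


def rows_to_sync(rows, last_sync):
    """Return (external_id, attributes) pairs for rows changed since last_sync."""
    changed = [r for r in rows if r["updated_at"] > last_sync]
    # Apply older changes first so the newest value wins.
    changed.sort(key=lambda r: r["updated_at"])
    return [(r["external_id"], json.loads(r["payload"])) for r in changed]


# Only the row updated after the previous sync is selected.
last_sync = datetime(2023, 3, 10, tzinfo=timezone.utc)
pending = rows_to_sync(rows, last_sync)
```

Filtering on a timestamp column is one common way to sync only new data on each scheduled run; the real integration may track changes differently.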

    2. SDKs

    At Braze, software development kits (SDKs) serve as a foundational element of our customers’ data ingestion efforts, making it possible to collect detailed data about users’ attributes and their engagement within a given brand’s mobile app, website, connected TV app, and more, and to pass that data into the Braze platform. SDKs can collect nuanced session data and user events in real time as consumers engage with your app/website/etc., supporting audience segmentation, the creation and delivery of campaigns and individualized customer journeys, and other elements of a best-in-class customer engagement program. (Check out the full range of Braze SDKs here.)

    To take advantage of Braze SDKs to support data collection, brands need to integrate them into their app or website. Different levels of data collection will require different approaches to integration, but the baseline integration for Braze SDKs is simple and straightforward, requiring limited engineering support and having minimal impact on app/website sizes. When it comes to scaling, our SDKs are built to handle data at a massive scale, with brands using them to collect and act on information for millions of daily active users without having to worry that they’ll overburden their digital systems.

    How does it work? Once the SDK has been integrated into an app or website and you’ve determined what information you want to track, it begins collecting device and session data on any individual engaging with that digital platform, even those whose identities are not known by the brand. Both these so-called “anonymous” users and “known” users (i.e. individuals who have shared their identity with the brand) will automatically have live-updating customer profiles created within Braze, allowing marketers to leverage segmentation, personalization, and triggered messages to reach even anonymous users, supporting a more tailored, meaningful brand experience across the board. Once a user does identify themselves, brands can use the data they’ve been collecting pre-identification to map that user back to any other systems they may be using, allowing them to provide a cohesive, ongoing brand experience even across different channels, platforms, and devices.
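The anonymous-to-known flow described above can be sketched conceptually in Python. This is not Braze SDK code: `Profile`, `ProfileStore`, `track`, and `identify` are hypothetical names for an in-memory model, used only to show how data collected before identification can be carried over to a known user.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Profile:
    """Hypothetical in-memory stand-in for a user profile."""
    device_id: str
    external_id: Optional[str] = None  # None while the user is anonymous
    events: List[str] = field(default_factory=list)


class ProfileStore:
    """Toy model of anonymous and known profiles (user switching is out of scope)."""

    def __init__(self) -> None:
        self._by_device: Dict[str, Profile] = {}
        self._by_external_id: Dict[str, Profile] = {}

    def track(self, device_id: str, event: str) -> Profile:
        # A profile exists from the first event, even for anonymous users.
        profile = self._by_device.setdefault(device_id, Profile(device_id))
        profile.events.append(event)
        return profile

    def identify(self, device_id: str, external_id: str) -> Profile:
        anonymous = self._by_device.setdefault(device_id, Profile(device_id))
        known = self._by_external_id.get(external_id)
        if known is None:
            # First identification: the anonymous profile becomes the known one.
            anonymous.external_id = external_id
            self._by_external_id[external_id] = anonymous
            return anonymous
        if known is not anonymous:
            # Already known from another device: merge the pre-identification data.
            known.events.extend(anonymous.events)
            self._by_device[device_id] = known
        return known


store = ProfileStore()
store.track("device-a", "session_start")
store.track("device-a", "viewed_product")
user = store.identify("device-a", "user-42")

store.track("device-b", "session_start")  # same person, new device, still anonymous
merged = store.identify("device-b", "user-42")
```

After the second `identify` call, both devices point at one profile that holds the events gathered before and after identification.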

    The upshot? If you want to be able to collect, manage, and take action based on user engagement with your app or website, taking the time to integrate the relevant Braze SDK makes that process simple, automatic, and highly scalable, supporting high-volume ingestion of essential customer information.

    3. APIs

    While SDKs are great when it comes to gathering data directly from the front end of your app or website, sometimes you may want to integrate data from other sources, such as loyalty databases or your own back end. To supplement that foundational element of their data ingestion strategy, many brands turn to application programming interfaces (APIs), which are services designed to handle and respond to requests made between different systems. In this case, they make it possible for brands to pass information from internal systems and third-party solutions into Braze in real time, complementing the automatic data collection carried out by their SDKs. Our APIs are able to flexibly accept data from almost anywhere, as long as it’s formatted correctly for transmission to Braze.

    Perhaps the most common use case we see for API-related data ingestion is when a brand is looking to upload historical data into the Braze platform. In that situation, where a company may have information they’re bringing from a previous customer engagement platform or some other relevant system, the easiest way to make that happen is by leveraging an API to do an initial transfer of user data, create user profiles, and upload users’ push tokens via the Braze platform’s API endpoints. Other use cases include:

    • Importing other user information that isn’t being tracked via your SDK into Braze. For instance, point-of-sale (POS) system information, or any other offline data that isn’t directly related to engagement with your app, website, or customer messaging.

    • Changing a given consumer’s external user ID, deleting users, or aliasing users within Braze.
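As a rough illustration of importing offline data over the API, the Python sketch below prepares batched requests for the Braze `/users/track` endpoint. Treat the details as assumptions to verify against the current API reference: `REST_ENDPOINT` and `API_KEY` are placeholders, the `BATCH_SIZE` of 75 is an assumed per-request object limit, and the attribute names and values are made up.

```python
import json
import urllib.request

REST_ENDPOINT = "https://rest.example.com"  # placeholder: use your instance's REST endpoint
API_KEY = "YOUR-REST-API-KEY"               # placeholder
BATCH_SIZE = 75                             # assumed per-request object limit; verify in the docs


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_track_requests(attributes, batch_size=BATCH_SIZE):
    """Prepare one POST request per batch of user attribute objects."""
    prepared = []
    for batch in chunked(attributes, batch_size):
        body = json.dumps({"attributes": batch}).encode("utf-8")
        prepared.append(urllib.request.Request(
            url=f"{REST_ENDPOINT}/users/track",
            data=body,
            method="POST",
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {API_KEY}",
            },
        ))
    return prepared


# Offline point-of-sale data keyed by external user ID (made-up values).
pos_attributes = [
    {"external_id": f"user-{i}", "last_store_visit": "2023-03-01"}
    for i in range(160)
]
prepared = build_track_requests(pos_attributes)
# Each prepared request could then be sent with urllib.request.urlopen(request).
```

Batching keeps the number of API calls low, which matters for the rate limits discussed next.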

    While APIs can be leveraged for a wide variety of data ingestion uses (as well as other needs, such as dynamic content personalization), they are more likely to be impacted by the scale and volume of your data ingestion efforts than SDKs. For that reason, it’s important to be mindful of how many API calls are being carried out and any relevant rate limits and functional traffic limits associated with your API endpoints.
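One common way to stay within rate limits is to retry with exponential backoff when the API signals that a limit has been hit. The sketch below is a generic pattern, not Braze-specific code: `RateLimited`, `send_with_backoff`, and the caller-supplied `send` function are hypothetical names, and the delays are arbitrary defaults.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller-supplied send function on a rate-limit response (e.g. HTTP 429)."""


def send_with_backoff(send, request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(request), retrying with exponential backoff when rate limited."""
    for attempt in range(max_attempts):
        try:
            return send(request)
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Double the wait each time and add up to 10% jitter.
            delay = base_delay * (2 ** attempt) * (1 + random.random() * 0.1)
            sleep(delay)


# Demonstration with a fake sender that is rate limited twice, then succeeds.
attempts = []


def flaky_send(request):
    attempts.append(request)
    if len(attempts) < 3:
        raise RateLimited()
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = send_with_backoff(flaky_send, "payload", sleep=delays.append)
```

Passing `sleep` as a parameter makes the retry logic easy to test without real waiting.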

    4. CSVs

    While SDKs and APIs support the vast majority of potential data ingestion needs and use cases associated with the Braze platform, we do offer another native way to transfer information into Braze: the ability to upload comma-separated values files, known as CSVs. This approach tends to make the most sense when used for quick, one-off use cases or for teams that lack the technical resources to set up SDKs or leverage APIs. What does that look like in practice? One example might be collecting user emails in a Google sheet, downloading that sheet as a CSV, and then uploading it into Braze to update user profiles.

    CSV uploads can be a helpful way to transfer information into Braze in a pinch, but this approach is less scalable than using APIs or SDKs to ingest data. However, there is a new way to make more effective use of the Braze CSV import function and minimize some of the scale disadvantages associated with this feature when working to upload user attributes. By taking advantage of an open source Amazon Web Services (AWS) serverless application created by the Braze Growth team and leveraging AWS Lambda and S3, Braze customers can avoid the manual splitting and uploading of large CSV files containing user attributes.

    With this application, brands can upload CSV files containing user attributes (including files that are larger than 100 megabytes) to an S3 bucket, automatically triggering a communication to Lambda that leads it to run the application’s code, reading the CSV file, processing it, and passing it into the Braze platform. The application avoids the manual file preparation associated with large CSV file upload in Braze and makes the process more seamless and more automated; however, the process is not as frictionless as passing data into Braze via SDK or API, so brands that are able to leverage those tools to transfer user attributes into Braze are strongly encouraged to do so.
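The core idea of that processing step, streaming a large CSV and turning it into small batches of user attributes, can be sketched in a few lines of Python. This is not the code of the open source application itself: `csv_to_attribute_batches`, the `external_id` column name, and the default batch size are assumptions chosen for illustration.

```python
import csv
import io


def csv_to_attribute_batches(csv_file, batch_size=75, id_column="external_id"):
    """Stream a CSV of user attributes and yield lists of attribute dicts."""
    reader = csv.DictReader(csv_file)
    batch = []
    for row in reader:
        if not row.get(id_column):
            continue  # skip rows without a user identifier
        # Keep only named columns that have a non-empty value.
        batch.append({k: v for k, v in row.items() if k and v})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


# Small in-memory file standing in for a large upload (made-up data).
sample = io.StringIO(
    "external_id,first_name,favorite_color\n"
    "user-1,Ana,green\n"
    "user-2,Ben,\n"
    ",NoId,red\n"
    "user-3,Cy,blue\n"
)
batches = list(csv_to_attribute_batches(sample, batch_size=2))
```

Because the function is a generator that reads row by row, it never holds the whole file in memory, which is what makes very large files manageable.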

    5. Braze Technology Partners

    The Braze platform’s four major data ingestion tools don’t exist in a vacuum. For many brands, they’re supplemented by direct integrations between Braze and best-in-class technologies designed to support the seamless movement of data into other systems. These Braze Alloys technology partners come in three distinct flavors:

    Customer Data Platforms

    Customer data platforms (CDPs) are designed to support increased data agility within your marketing technology stack, allowing you to more easily move key information in and out of different solutions (including Braze) as needed. These technologies invest in creating custom integrations with a wide variety of different systems and platforms, reducing the need for ad hoc data ingestion approaches (such as CSV imports) and supporting the seamless flow of data from other parts of your stack into Braze, where that information can be leveraged to power customer messaging and other brand experiences. These solutions include:

    • Amperity, which helps brands deliver a comprehensive, actionable 360-degree view of their customers by supporting seamless data management across a wide variety of platforms and technologies.

    • mParticle, which allows marketers to manage data across the entirety of their brand’s growth stack, supporting an impactful, coherent customer journey.

    • Segment, which helps brands to collect, unify, and manage first-party data in connection with email, web, advertising, POS, and other technologies.

    • Tealium, which provides a turnkey integration ecosystem spanning web, mobile, and offline data sources and technologies.

    Analytics Solutions

    Data analytics platforms are built to allow brands to dive deep into the information at their disposal in order to find hidden insights, better understand high-level trends, and use those discoveries to inform the creation of future experiments, audiences, and segments. By integrating with Braze, these technologies make it possible for brands to leverage the insights and information contained within their analytics provider to support more impactful brand experiences powered by Braze. These solutions include:

    • Amplitude, an analytics platform designed to help brands drive growth through robust product and behavioral analytics, with a focus on supporting a better understanding of the customer and their behaviors and traits.

    • Looker, a business intelligence and big data analytics platform that allows brands to seamlessly explore, analyze, and circulate real-time business analytics to support a better understanding of their customers and the user lifecycle.

    • Mixpanel, which is an analytics solution built to help brands to analyze everything from conversions and retention to product usage, with the goal of upleveling your customer experience.

    Workflow Automation and Reverse ETL Solutions

    By leveraging workflow automation and reverse ETL solutions that have direct integrations with Braze, brands can automate the process of reshaping data to ensure that it works within the Braze platform and passing the resulting information into our customer engagement platform. These solutions include:

    • Census, an integration platform that allows you to bring the customer data that is locked in your other tools or databases into Braze without requiring engineering support.

    • Hightouch, which allows top brands like Auto Trader to create a live sync of data from cloud data warehouses like Snowflake, BigQuery, Redshift, and Databricks into Braze without engineers through a process called “reverse ETL.”

    • Rudderstack enables developers to deploy real-time customer data pipelines quickly and easily between their data warehouses and the business tools that matter to them.

    “Everyone from startups to the Fortune 500 are already building a single source of truth for customer data in their warehouse,” says Tejas Manohar, Cofounder and co-CEO, Hightouch. “With Hightouch and Braze, marketing teams can activate their data warehouse to not only power reporting but also power relevant, personalized customer experiences in real-time without engineers.”

    Final Thoughts

    Effective data management is key if you want to make the most of your customer engagement efforts. With Braze, we’ve made that easier by supporting a range of different data ingestion approaches designed to allow you to easily pass key customer data into the Braze platform to support audience segmentation, personalization, triggered campaigns, nuanced testing, and more.

    To learn more about how the Braze platform can support data management and work within a best-in-class marketing technology ecosystem, check out “Connected Engagement: Understanding Today’s Marketing Stacks and Ecosystems.”

    This article was originally published on October 14, 2021. The last update occurred on March 29, 2023.

