How Braze Effectively Manages Holiday Messaging Capacity
By its nature, customer engagement is cyclical, with brands experiencing quieter and busier periods throughout the year. But for a wide range of companies—from retailers to travel brands and beyond—the winter holiday season stands as the pinnacle of their engagement efforts, the time of year when they’re building out the biggest campaigns and sending the most messages.
This dynamic has been around for long enough that it predates many of today’s most popular customer messaging channels (for instance, push notifications, which only came to market in 2009). But the trend isn’t just holding on, it’s growing—beginning with religious and cultural holidays like Christmas and Hanukkah, then expanding with the advent of Black Friday, Cyber Monday, Singles Day, and other winter shopping holidays. Our research has found that the holiday shopping season continues to expand, giving brands more opportunities to engage customers, but also putting more pressure on the systems and processes they depend on during this key time of year.
That’s where Braze comes in. Meeting the customer engagement demands of the holiday shopping season has been front of mind for Braze since our founding in 2011, but doing that successfully takes careful planning, teamwork, and thoughtful use of technology at scale. And that work has paid off, allowing us to maintain 100% uptime while sending more than 21 billion messages (including 3.7 billion emails, over 3 billion webhooks, over 2 billion Content Card impressions, nearly 370 million in-app and in-browser messages, and nearly 12 billion push notifications across iOS, Android, and web) at peak speeds of over 21 million messages per minute. Read on for a deeper look at the planning and processes that make it possible for us to successfully support some of the world’s biggest brands throughout the holiday shopping season.
Why the Holiday Season Matters at Braze
Braze is built to power contextually relevant, real-time interactions between consumers and brands they love. And that doesn’t go away when the holidays roll around. We want to empower brands to look beyond generic sale notifications and focus instead on providing wanted, useful messages that allow them to engage with their customers in a first-class way.
That focus means that our customers’ engagement programs during the holiday season are not about sending blast campaigns on Cyber Monday and calling it a day. Instead, they might send a message earlier in the season, see who clicks and who opens that message, and then adjust their strategy based on the findings in an iterative way, allowing them to optimize each customer’s journey over the course of the weeks leading up to Thanksgiving and on into Black Friday and the shopping holidays.
This tailored, customer-centric approach is the gold standard of customer engagement, but it also has technical implications. For one thing, it means that Braze can’t just scale up our sending capabilities for Black Friday and assume that we’re covered. We have to be very thoughtful about capacity planning and incident management when it comes to the holiday season. To ensure that we’re able to stay ahead of those demands, we’ve worked hard to implement comprehensive processes that allow us to predict, plan for, and meet our customers’ holiday engagement needs.
How Braze Safeguards Holiday Customer Engagement at Scale
Braze has a great track record when it comes to supporting our customers during the winter holiday shopping season. A big part of that success comes from understanding that it’s not about a particular day, or even a set of days; it’s about having a holistic understanding of the peaks and valleys of the season, both for high message-volume brands and for our customer base as a whole. That understanding has allowed us to scale our preparations and planning for the holiday seasons as our business scales and to be proactive about making the changes needed to support the massive send volumes associated with this time of year.
At Braze, our holiday season capacity management program has two key pillars:
1. Capacity Planning
It’s been clear since our founding that the winter holiday shopping season is the highest-volume time of year for the Braze customer base. But how high those volumes get has evolved swiftly and significantly over time, with especially strong growth during the past three years. To ensure that the Braze platform and the teams that support it are ready for those increases, Braze carries out extensive capacity planning at both the customer and system levels.
This effort is overseen by our Capacity Planning Change Advisory Board (CAB), which is made up of site reliability engineers, DevOps engineers, software developers, database administrators, and engineering leadership here at Braze. The board meets monthly throughout the year (and more frequently in the run-up to the holiday shopping season) to ensure alignment around required preparations and customer needs. Because of work done over the years to identify potential bottlenecks in our systems, we’ve been able to create a heavily federated, distributed approach where our processing servers automatically scale up to handle higher volume and automatically scale back down when volumes are lower. The CAB looks beyond message volume and throughput to ensure that we’re covering every aspect of our process, from data ingestion and storage to our sending services and the APIs that inform real-time personalization.
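To make the scale-up and scale-down idea concrete, here is a minimal sketch of a throughput-based sizing rule. It is an illustration only, not a description of how Braze's processing servers are actually sized: the function name `desired_workers`, the per-worker rate, and the minimum and maximum bounds are all hypothetical. Only the 21 million messages per minute peak comes from the figures above.

```python
import math


def desired_workers(messages_per_minute: float,
                    per_worker_per_minute: float,
                    min_workers: int = 2,
                    max_workers: int = 500) -> int:
    """Return how many workers a given send rate would call for.

    All parameters are hypothetical; a real autoscaler would also look at
    queue depth, latency, and cool-down periods.
    """
    if per_worker_per_minute <= 0:
        raise ValueError("per_worker_per_minute must be positive")
    needed = math.ceil(messages_per_minute / per_worker_per_minute)
    # Clamp so the pool never drops below a safe floor or exceeds a budget cap.
    return max(min_workers, min(max_workers, needed))


# Peak rate from the article, with an invented per-worker rate of 100,000/min.
peak = desired_workers(21_000_000, 100_000)
# A quiet period falls back to the floor instead of scaling to zero.
quiet = desired_workers(50_000, 100_000)
```

The same rule handles both directions: as the send rate falls, the computed worker count falls with it until it reaches the floor.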
To ensure that we have the information we need, award-winning Braze Customer Success Managers (CSMs) proactively partner with select customers to understand their specific marketing plans and anticipated needs. All that information is collected by the end of September and shared with the Capacity Planning Change Advisory Board, as well as with our database partner at ObjectRocket. This data, along with analysis of historic usage patterns, allows us to identify specific time periods and days within the holiday season where additional capacity is required, above and beyond the known challenges associated with Black Friday and Cyber Monday. A few years back, this process played a key role in helping us anticipate that, due to our growing presence in APAC, our Singles Day (11/11) volume was going to exceed the size suggested by historic models.
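The planning step described above, combining historic usage with customer-reported plans to find days that need extra capacity, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the model Braze uses: every name (`DayForecast`, `forecast_days`, `days_needing_capacity`), the growth factor, the headroom margin, and all volume figures are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DayForecast:
    day: str              # calendar day as "MM-DD"
    baseline: float       # forecast derived from historic usage
    planned_extra: float  # extra volume reported through customer plans

    @property
    def total(self) -> float:
        return self.baseline + self.planned_extra


def forecast_days(last_year: dict, growth: float, planned: dict) -> list:
    """Scale last year's daily volume by a growth factor and add planned sends."""
    return [DayForecast(day, volume * growth, planned.get(day, 0.0))
            for day, volume in last_year.items()]


def days_needing_capacity(forecasts: list, capacity: float,
                          headroom: float = 0.2) -> list:
    """Flag days whose forecast eats into the reserved headroom."""
    limit = capacity * (1 - headroom)
    return [f.day for f in forecasts if f.total > limit]


# Invented figures, in millions of messages per day.
last_year = {"11-11": 400, "11-29": 900, "12-02": 800, "12-10": 300}
customer_plans = {"11-11": 400}  # e.g. new Singles Day campaigns

with_plans = days_needing_capacity(
    forecast_days(last_year, 1.5, customer_plans), capacity=1200)
history_only = days_needing_capacity(
    forecast_days(last_year, 1.5, {}), capacity=1200)
```

In this toy data, the history-only forecast flags only the two late-November and early-December peaks, while adding the customer-reported plans also flags 11/11. That is the kind of gap between historic models and actual plans that the Singles Day example describes.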
2. Risk Mitigation
While ensuring sufficient capacity and flexibility to support our customers’ holiday messaging needs is a key priority, our holiday season management efforts don’t stop there. We’re also focused on identifying areas where technology-related risks associated with the season can be minimized. As we all know, issues crop up when things change. Sometimes that might be one of our customers doing new or innovative things with the Braze platform, or sending at volumes we’ve never seen from them before; other times, those changes could result from a code release or infrastructure update that has the potential to change how the Braze platform and related systems function. So to minimize the amount of change in the system, we’ve implemented a holiday code freeze at Braze. We believe this will help us reduce risk and better support our customers during this critical period.
As part of the code freeze plan, the last major code release of the year occurs at the beginning of December; after that, we don’t make any significant net-new changes or introduce any new features until the new year. The next two weeks are a soft code freeze, where our engineers are empowered to do bug fixes or minor tweaks, followed by two weeks of a hard code freeze where no non-emergency changes are made. With Braze generally making 200 to 300 code changes per week during the rest of the year, pausing updates for the full two-month time period is a non-starter; thankfully, this layered approach allows us to ensure that the Braze platform is in strong working order heading into the heart of the holiday season while minimizing change-related risks.
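The layered freeze can be expressed as a small policy check. The sketch below is illustrative only: the phase names, the change kinds, the example release date, and the function names `freeze_phase` and `change_allowed` are hypothetical. It also assumes that any days between the end of the hard freeze and the new year follow the soft-freeze rules, which the description above does not spell out.

```python
from datetime import date, timedelta

# Change kinds permitted in each phase (labels are illustrative).
ALLOWED = {
    "open": {"feature", "bugfix", "emergency"},
    "soft": {"bugfix", "emergency"},  # bug fixes and minor tweaks only
    "hard": {"emergency"},            # no non-emergency changes
}


def freeze_phase(day: date, last_release: date) -> str:
    """Classify a day relative to the last major release of the year."""
    soft_end = last_release + timedelta(weeks=2)
    hard_end = soft_end + timedelta(weeks=2)
    year_end = date(last_release.year, 12, 31)
    if day <= last_release or day > year_end:
        return "open"
    if soft_end < day <= hard_end:
        return "hard"
    # First two weeks after the release, plus (by assumption) any days
    # left between the hard freeze and the new year.
    return "soft"


def change_allowed(day: date, kind: str, last_release: date) -> bool:
    return kind in ALLOWED[freeze_phase(day, last_release)]


# Hypothetical release date, chosen only to make the example concrete.
release = date(2024, 12, 2)
bugfix_in_soft = change_allowed(date(2024, 12, 10), "bugfix", release)
bugfix_in_hard = change_allowed(date(2024, 12, 20), "bugfix", release)
```

A check like this could sit in a deploy pipeline so that the freeze is enforced by tooling rather than by memory, with emergency changes remaining possible throughout.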
In addition to the code freeze, Braze has implemented a strategic staffing plan and expanded escalation paths for the holiday season, so that if any issues do arise, we’re well-equipped to respond to them. This approach adds on-call staffing while ensuring that team-specific support (e.g. Email Team on Call, DevOps Team on Call) is always available. Each of those teams has an on-call representative, a backup on-call representative, and an escalation point on call. Every escalation chain leads to the incident commander on call (usually me or our head of DevOps) and then, if needed, to our SVP of Engineering or CTO. That structure ensures that customers have all the support they need to respond to any technical issues or challenges that might crop up.
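As a rough sketch of that escalation order, the chain for any team can be modeled as three team-specific levels followed by two shared ones. The function names `escalation_chain` and `next_contact`, and the idea of counting unacknowledged pages, are assumptions made for illustration; the roles themselves come from the description above.

```python
# Levels staffed separately by each team (e.g. Email, DevOps).
TEAM_LEVELS = ("on-call", "backup on-call", "escalation point")
# Levels shared by every team's chain.
SHARED_LEVELS = ("incident commander", "SVP of Engineering or CTO")


def escalation_chain(team: str) -> list:
    """Return the ordered list of roles to page for a given team."""
    return [f"{team} {level}" for level in TEAM_LEVELS] + list(SHARED_LEVELS)


def next_contact(team: str, unacknowledged_pages: int) -> str:
    """Pick who to page next, given how many pages went unacknowledged.

    Stays on the last role once the top of the chain is reached.
    """
    chain = escalation_chain(team)
    return chain[min(unacknowledged_pages, len(chain) - 1)]


email_chain = escalation_chain("Email")
after_three_misses = next_contact("DevOps", 3)
```

Keeping the shared levels in one place means every team's chain ends the same way, so no incident can stall at a team boundary.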
The Braze platform is built to support impactful, responsive customer engagement for brands big and small all over the world. The power of Braze doesn’t come from the technology alone; the people who work here play an essential role in making these exceptional outcomes possible for our customers. Supporting some of the world’s largest brands during the year’s biggest shopping season is a major collaborative effort that touches a range of teams across our company and requires them to stay aligned and work closely together to anticipate and avoid potential issues.
Interested in taking part in supporting next year’s holiday shopping season? Visit our careers page to learn more about Braze and to check out our open roles.
Jamie Doheny is the Senior Director, Engineering Operations & Chief of Staff, Engineering, Braze.