Transforming real-time data at scale

Wunderkind background
Wunderkind is the leading AI-driven performance marketing solution that collects consent-based, first-party data and identifies anonymous traffic for brands in order to scale hyper-personalized one-to-one messages. Brands lean on the Wunderkind Identity Network, a proprietary database that recognizes 9 billion devices and 1 billion consumers and observes 2 trillion digital transactions every year, to trigger the most impactful offers to their target audience at the right moment and in the right channel. This proprietary data is accessed by Wunderkind’s Autonomous Marketing Platform, an AI engine that integrates seamlessly into a brand’s existing ESP to boost performance across email, text, and advertising channels.
Wunderkind is the only performance solution that guarantees a lift in revenue for its clients and delivers over $5 billion in directly attributable revenue annually for brands across a number of industries, often ranking as a top 3 revenue channel in clients’ own analytics platforms. Brands such as Harley-Davidson, Perry Ellis and Shoe Carnival partner with Wunderkind to drive top-line revenue through its guaranteed results.
Unique challenges
Wunderkind operates across thousands of websites and applications, collecting information from end users on behalf of our clients. The resulting data runs into the trillions of rows and spans pebibytes of storage.
Data is collected at rates reaching hundreds of thousands of events per second, and all of it must be analyzed and acted upon in real time.
This analysis powers the engines behind our decision-making, which in turn activates targeted marketing campaigns for our clients. This allows for highly granular targeting and personalization of content and promotions.
The problem
We are a highly talented but small team that needs to quickly and easily manage our data and workflows. We also require tools that can scale to meet the demand without requiring huge time and resource investments.
The existing tool landscape left us with a choice between bespoke custom implementations for our use cases and one of the incumbent ETL solutions, such as Spark or Dataflow.
The underlying problem we faced spanned a few different dimensions:
- Existing technologies require specialists to operate at scale, of which we had very few on the team.
- Due to our somewhat unique requirements, the existing ETL solutions didn’t work well out of the box.
The requirements
Our requirements were simple:
- Support for analyzing/aggregating hundreds of thousands to millions of messages a second.
- Sub-10-second end-to-end latency from the moment of collection to the data being actionable in our algorithms.
- An efficient and simple development process that any backend engineer can slot into.
- A flexible toolbox of functionality we can use to build out our pipelines.
- Out-of-the-box composability of pipelines.
- A rich community for support and ideating.
The last two requirements were key to our success. We knew we needed a setup that would allow us to break out of the existing ETL pipeline and create custom implementations when needed. Additionally, we sought an active community with strong support that we could lean on as needed.
The solution
While researching alternatives to technologies like Spark, Flink, and Dataflow, our engineers spotted Deephaven and immediately saw that it could be a great fit for Wunderkind. Specifically, the live table architecture built into Deephaven caught our attention.
We dove in and found that it met every one of our requirements. Initial performance testing on local workstations showed Deephaven handling hundreds of thousands of messages a second without breaking a sweat.
Given that the performance metrics aligned, we dove into the code and found a massive toolbox of functionality that we could directly use, and then ultimately modify to our specific needs as our use cases with Deephaven grew.
The community was the last piece of the puzzle. We have to say that the Deephaven team is one of the best we have ever had the pleasure of working with. Ultimately, the choice became clear: Deephaven would handle our aggregations.
Before & after architecture
We leverage Redpanda as our event bus, handling 200–250k messages per second at peak and totaling billions of messages a day.
Deephaven handles the entire event-processing and data-analysis load via Kafka, deployed as StatefulSets in a GKE cluster and split across the partitions of the various Kafka topics. Heap sizes range from 10 to 32 GiB, with 1–3 CPUs per pod, and each job runs roughly 25 instances on average.
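As a rough illustration of that deployment shape, a StatefulSet fragment might look like the following. The names, labels, and memory requests here are hypothetical; only the replica count, heap range, and CPU range come from the figures above, and the image tag is the publicly available Deephaven server image.

```
# Hypothetical StatefulSet fragment; resource numbers mirror the text above.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dh-aggregation-worker        # illustrative name
spec:
  replicas: 25                       # ~25 instances on average per job
  serviceName: dh-aggregation-worker
  selector:
    matchLabels:
      app: dh-aggregation-worker
  template:
    metadata:
      labels:
        app: dh-aggregation-worker
    spec:
      containers:
        - name: deephaven
          image: ghcr.io/deephaven/server:latest
          env:
            - name: JAVA_OPTS
              value: "-Xmx32g"       # heap ranges from 10 to 32 GiB per job
          resources:
            requests:
              cpu: "1"               # 1-3 CPUs per pod
              memory: 12Gi
            limits:
              cpu: "3"
              memory: 34Gi
```

Each pod would then be pinned to a subset of Kafka partitions so the topic load is spread evenly across the roughly 25 instances.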
The result
We didn’t know we needed it, but we now have a full-featured IDE environment for debugging analysis problems and tracing data-quality issues. It allows anyone with Python experience to add components on the fly.
The new solution also enabled a shift from a cumbersome, bespoke, batch-oriented ETL system to incrementally pre-computing analyses in real time as data is collected from real users.
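To make the batch-versus-incremental distinction concrete, here is a toy sketch in plain Python (deliberately not Deephaven's API): a batch job recomputes an aggregate over all collected data on each run, while an incremental engine folds each new event into a pre-computed running result the moment it arrives.

```python
from collections import defaultdict

def batch_counts(events):
    """Batch style: rescan ALL collected events to rebuild the aggregate."""
    counts = defaultdict(int)
    for user, _payload in events:
        counts[user] += 1
    return dict(counts)

class IncrementalCounts:
    """Incremental style: fold each event into a running aggregate on arrival,
    so the result is always current without reprocessing history."""

    def __init__(self):
        self.counts = defaultdict(int)

    def on_event(self, user, _payload=None):
        self.counts[user] += 1

events = [("alice", {}), ("bob", {}), ("alice", {})]

inc = IncrementalCounts()
for user, payload in events:
    inc.on_event(user, payload)   # one cheap update per event

# Both approaches agree; the incremental one never rescans history.
assert dict(inc.counts) == batch_counts(events) == {"alice": 2, "bob": 1}
```

The batch version's cost grows with the full history on every run, while the incremental version pays a small constant cost per event, which is what keeps end-to-end latency in the seconds at hundreds of thousands of events per second.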
After implementing Deephaven, we saw a ~20% overall reduction in costs compared to the solutions in place pre-migration.