
How Tremor Video Processes Data from 20B Devices and Makes Predictions in Real Time Using FeatureBase

Processing Data from 20B Devices in Milliseconds to Dynamically Personalize Ad Content

“Tremor has successfully implemented FeatureBase as its Audience store database, dramatically shortening the time between data ingestion to data availability (from hours to minutes). As an additional benefit, the hardware footprint decreased substantially (~90%), ensuring a cost-effective solution.” Tal Mor, CTO, Tremor International

Serving personalized ads in real time requires a mutable database that can handle updates, inserts, and deletes without degrading performance, while sustaining high-throughput ingest and low-latency queries. A database built on bitmaps achieves this far more efficiently than a traditional columnar database.
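To make the bitmap idea concrete, here is a minimal sketch of a bitmap index. It is an illustration only and not FeatureBase's implementation: the class name `BitmapIndex`, its methods, and the attribute strings are all hypothetical, and plain Python integers stand in for the bitmaps a production engine would use. Each attribute value owns one bitmap in which bit N means "device N has this attribute", so an update or delete flips a single bit in place and a segment query is a bitwise AND.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index (hypothetical): one bitmap, stored as an int, per attribute value."""

    def __init__(self):
        # Maps an attribute value such as "device_type=ctv" to an integer bitset.
        self.rows = defaultdict(int)

    def set(self, attribute, device_id):
        # Insert or update: turn one bit on, in place.
        self.rows[attribute] |= 1 << device_id

    def clear(self, attribute, device_id):
        # Delete: turn one bit off, in place.
        self.rows[attribute] &= ~(1 << device_id)

    def segment(self, *attributes):
        # Devices that have every listed attribute: bitwise AND of the bitmaps.
        if not attributes:
            return []
        result = self.rows.get(attributes[0], 0)
        for attribute in attributes[1:]:
            result &= self.rows.get(attribute, 0)
        return [i for i in range(result.bit_length()) if (result >> i) & 1]


index = BitmapIndex()
index.set("device_type=ctv", 3)
index.set("device_type=ctv", 7)
index.set("watched=sports", 7)
index.set("watched=sports", 9)

before = index.segment("device_type=ctv", "watched=sports")
print(before)  # [7]

# The change is visible to the very next query, with no batch rebuild.
index.clear("watched=sports", 7)
after = index.segment("device_type=ctv", "watched=sports")
print(after)  # []
```

Because each write touches one bit and each query combines whole bitmaps, ingest and querying can proceed against the same structure without a preaggregation step.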

TL;DR

  • Tracking 20 billion devices and 35,000 attributes of those devices
  • Seeing 120 billion updates/day across the above attributes (1.4 million updates/second)
  • FeatureBase enables 1000x faster query speeds than the previous Hadoop cluster
  • 90% reduction in infrastructure footprint with FeatureBase implemented
  • 70% reduction in cost (saving ~$5 million/year)
  • Time to refresh predictions dropped from 48 hours to less than a second with FeatureBase
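As a quick sanity check on the figures above, the short calculation below converts the daily update volume to a per-second rate and shows what a 70% cost reduction leaves behind. It uses only the numbers quoted in this list; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# 120 billion updates/day, expressed per second.
updates_per_day = 120_000_000_000
updates_per_second = updates_per_day / SECONDS_PER_DAY
print(f"{updates_per_second / 1e6:.2f} million updates/second")  # 1.39 million updates/second

# A 70% cost reduction leaves 30% of the original cost, which is under one third.
remaining_cost_fraction = 1 - 0.70
print(f"{remaining_cost_fraction:.0%} of the original cost")  # 30% of the original cost
print(remaining_cost_fraction < 1 / 3)  # True
```

The per-second rate rounds to the 1.4 million updates/second quoted above.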

About Tremor Video

Tremor International is a global leader in video and CTV advertising. The company’s demand-side platform, Tremor Video, collects data from over 20 billion devices and needs to process and act on that data in real time to create accurate predictive customer segments for highly effective advertising. Tremor Video owns rich data for over 44 million US households, compiled from smart TVs and almost 20 other data providers. The company combines multiple datasets including (but not limited to) set-top box data, automatic content recognition (ACR), streaming viewership behavior, customer profile information, and cross-device panel data to offer customized audience segmentation campaigns. With this rich dataset, Tremor Video can create targeted campaigns that help clients meet their priority KPIs by matching the right content with the right audience segment at an extremely granular level, in real time. But the computing power needed to run these intricate processes has not been easy to manage, and it becomes extremely expensive and complex at scale when relying on traditional columnar databases (e.g. Druid and Pinot).

Tremor Video is “processing hundreds of billions of advertising events each day, coping with scalability and performance challenges. These challenges manifest as ever-growing hardware and maintenance costs while simultaneously struggling to provide a real-time feedback loop to Ad serving machine learning logic.” Tal Mor, CTO, Tremor International

So, Why FeatureBase?

Tremor Video chose to build its architecture with FeatureBase at its core for a couple of key reasons.

Accuracy

When Tremor Video approached us, they were having to approximate campaign effectiveness using sampling and other estimation techniques in combination with offline processing and preaggregation. While this is industry standard, Tremor knew they could gain a competitive edge by moving away from estimations, approximations, and preaggregations. FeatureBase maintains high ingest rates while allowing immediate, flexible querying of data as soon as it hits the database, which enables a higher grade of accuracy when dynamically serving ads.

Efficiency

Prior to FeatureBase, Tremor Video was forced to devote huge amounts of resources (both money and people) to managing hundreds, even thousands, of Hadoop servers just to batch process and preaggregate their data for segmentation and targeting. Even with these extremely large Hadoop clusters, the process still took 24-72 hours before a person could be targeted with an ad.

Tremor Video tested the performance of FeatureBase against a field of well-known candidate technologies, including Cloudera (Hadoop), Druid, Vertica, ArangoDB, Redis, and Aerospike. None of these tools could meet Tremor Video’s dual demand for throughput and latency: ingesting data at the required rate (peaking at more than 1 million records/second) while simultaneously providing millisecond query responses. FeatureBase was able to meet (and exceed!) both requirements on a fraction of the hardware, due largely to its unique use of bitmaps. FeatureBase is ingesting over 100 billion updates and inserts per day from Kafka topics while serving query responses 1000x faster than the legacy Hadoop solution and 5-10x faster than the alternative solutions Tremor Video evaluated. All of this is done without any preaggregation of the data and on a fraction of the nodes.

Mutability (Update and Delete Functionality)

Tremor Video typically sees 120 billion events/day (about 1.4 million events/second) across 35,000 different attributes, so they needed a database that could handle that load in near real time. As Tremor Video put FeatureBase into production, they realized that their data volume would (and could!) grow so rapidly that they would need to allow outdated data (such as constantly changing device IDs) to expire. In addition to FeatureBase’s unique ability to update existing data without doing a copy-on-write (upsert), the FeatureBase team worked with Tremor Video to build and offer robust delete and Time-to-Live (TTL) functionality, preventing unbounded storage and memory growth. At this ingest volume, accumulating data indefinitely is neither feasible nor efficient. Once data is no longer relevant to their use case, TTL allows Tremor to automatically expire a record’s key translation and reuse that record’s ‘slot’ after it has been deleted. This is key to efficiently accessing data in real time at this scale without wasting resources on irrelevant or unnecessary data.
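The expiry-and-reuse behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not FeatureBase's actual TTL mechanism: the `KeyTranslator` class, its `touch` and `expire` methods, and the device key names are hypothetical. String record keys are translated to integer slots; a key that has not been updated within the TTL is dropped, and its slot goes back into a free list so a new key can take it.

```python
import heapq


class KeyTranslator:
    """Toy key translation with TTL (hypothetical): string record keys -> integer slots."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.slot_by_key = {}  # record key -> slot
        self.last_seen = {}    # record key -> timestamp of its latest update
        self.free_slots = []   # min-heap of slots released by expiry
        self.next_slot = 0     # next slot that has never been used

    def touch(self, key, now):
        """Record an update for `key` at time `now` and return its slot."""
        if key not in self.slot_by_key:
            if self.free_slots:
                # Reuse a slot released by an expired key before growing.
                slot = heapq.heappop(self.free_slots)
            else:
                slot = self.next_slot
                self.next_slot += 1
            self.slot_by_key[key] = slot
        self.last_seen[key] = now
        return self.slot_by_key[key]

    def expire(self, now):
        """Drop keys not updated within the TTL and release their slots."""
        stale = [k for k, seen in self.last_seen.items()
                 if now - seen > self.ttl_seconds]
        for key in stale:
            # A real engine would also clear the slot's stored data here.
            heapq.heappush(self.free_slots, self.slot_by_key.pop(key))
            del self.last_seen[key]
        return stale


translator = KeyTranslator(ttl_seconds=60)
slot_a = translator.touch("device-a", now=0)
slot_b = translator.touch("device-b", now=50)

expired = translator.expire(now=100)  # device-a was last seen 100 seconds ago
slot_c = translator.touch("device-c", now=100)

print(slot_a, slot_b, expired, slot_c)  # 0 1 ['device-a'] 0
```

Because `device-c` takes over the slot released by `device-a`, the slot space stays bounded by the number of live keys rather than by every key ever seen.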

In Summary: 1000x faster queries at less than ⅓ the cost

Following a successful testing period, Tremor Video selected FeatureBase as the new real-time storage engine for their audience database. Since implementation, Tremor Video has reduced their production costs by 70%. Using FeatureBase, predictive segments are created within minutes as data streams in to update existing data and insert new data. Tremor Video can act on streaming data within minutes rather than waiting 24-72 hours for offline processes to prepare the data (and that timeframe is expected to drop to seconds, and even sub-second, with further optimizations).

* FeatureBase brings dramatic improvements in speed and efficiency, but also simplicity by removing the preprocessing steps and infrastructure.

Architecture Before and After FeatureBase Implementation: 60% reduction in vCPU

What’s Next?

Tremor Video is already looking for new ways to use FeatureBase and increase their return on investment. For example, they’re planning a suite of new analytical tools that let Tremor Video customers analyze enormous amounts of data in real time, without the costly offline processing employed today. Current solutions require duplicate data, complex and brittle pipelines, and ever-changing preaggregation code to serve new fields or new use cases, which in the end increases the overall data footprint, computing costs, and maintenance burden.

While the FeatureBase technology is critical to Tremor Video’s ability to process data from over 20 billion devices in real time and serve dynamic ad content, we’re most proud of the experience Tremor Video has had working with our team:

“The FeatureBase [formerly Molecula] team has demonstrated utmost dedication towards Tremor’s success, and as such served as a trustworthy partner, which we intend to rely on in the years to come.” - Tal Mor, CTO, Tremor International.

Experience how FeatureBase can bring real-time capabilities to your data stack by installing FeatureBase (open source) or starting your free FeatureBase Cloud trial today - no credit card required! 
