
System Overview

The Blue Compass RV data pipeline is designed to centralize data ownership, ensure comprehensive data collection, and enable rapid analysis and personalization.

The pipeline follows a linear flow from data generation to activation:

  1. Event Tracking: User interactions are captured via Cloudflare, Tag Manager, and Rudderstack.
  2. Extract & Load: Data is ingested from various sources (Ads, CRM, etc.) using Airbyte and stored in the data lake.
  3. Transform & Store: Raw data is processed and modeled in Databricks, creating a clean and reliable source of truth.
  4. Visualize & Activate: Modeled data is consumed by Power BI for reporting and Braze for customer engagement.
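As a rough illustration, the four stages above can be written down as an ordered structure, which makes the linear dependency explicit. This is only a sketch: the `Stage` class, the `output` labels, and the `downstream_of` helper are hypothetical names introduced for this example, not part of the actual pipeline code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    tools: tuple[str, ...]
    output: str  # illustrative label only, not taken from the pipeline itself


# Stage names and tools come from the list above; order matters.
PIPELINE = (
    Stage("Event Tracking", ("Cloudflare", "Tag Manager", "Rudderstack"), "raw events"),
    Stage("Extract & Load", ("Airbyte",), "data lake files"),
    Stage("Transform & Store", ("Databricks",), "modeled tables"),
    Stage("Visualize & Activate", ("Power BI", "Braze"), "reports and campaigns"),
)


def downstream_of(stage_name: str) -> list[str]:
    """Return the names of every stage that runs after the given stage."""
    names = [stage.name for stage in PIPELINE]
    idx = names.index(stage_name)  # raises ValueError for an unknown stage
    return names[idx + 1:]
```

Because the flow is linear, a problem in one stage affects every stage after it; `downstream_of("Extract & Load")` lists the two stages that would be impacted by an ingestion failure.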

In more detail, the pipeline consists of the following components and interactions:

  • Sources: DigitalOcean (API Proxy), Airbyte (Ad Networks), and Cloudflare/Rudderstack (Web Events).
  • Storage: Data is staged in ADLS/S3 buckets before being processed.
  • Processing: Databricks performs the transformations, promoting data through the Raw, Cleaned, and Production catalogs.
  • Identity: Rudderstack resolves user identities across devices and sessions.
  • Destinations: The final output powers business intelligence dashboards and marketing automation platforms.
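The Raw, Cleaned, and Production progression can be pictured with a small example. The sketch below uses plain Python rather than Databricks or Spark APIs, and the cleaning rules (drop events without a user ID, deduplicate by event ID, normalize event names) and the field names `event_id`, `user_id`, and `event` are assumptions chosen for illustration; the actual rules applied in the pipeline are not described here.

```python
from collections import Counter


def to_cleaned(raw_events):
    """Raw -> Cleaned: drop anonymous events, deduplicate, normalize names.

    These rules are assumed for illustration only.
    """
    seen_ids = set()
    cleaned = []
    for event in raw_events:
        user_id = event.get("user_id")
        event_id = event.get("event_id")
        if not user_id or event_id in seen_ids:
            continue  # skip events with no user id and repeated event ids
        seen_ids.add(event_id)
        cleaned.append({
            "event_id": event_id,
            "user_id": user_id,
            "event": event.get("event", "").strip().lower(),
        })
    return cleaned


def to_production(cleaned_events):
    """Cleaned -> Production: count events per (user_id, event) pair."""
    return dict(Counter((e["user_id"], e["event"]) for e in cleaned_events))


# Hypothetical sample data standing in for the Raw catalog.
raw = [
    {"event_id": "1", "user_id": "u1", "event": " Page_View "},
    {"event_id": "1", "user_id": "u1", "event": " Page_View "},  # duplicate
    {"event_id": "2", "user_id": None, "event": "click"},        # no user id
    {"event_id": "3", "user_id": "u1", "event": "page_view"},
]

cleaned = to_cleaned(raw)
production = to_production(cleaned)
```

Keeping each layer as a separate step means the Raw data stays untouched, so the Cleaned and Production outputs can be rebuilt if the rules change.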