Join as a Senior BI & Data Architect for a fixed-term engagement to reshape the foundation of a growing data ecosystem. This role drives the technical direction of a unified data platform, centered on Databricks Lakehouse, with a strong emphasis on governance, performance, and long-term maintainability.
Key Responsibilities
- Define and implement a structured Unity Catalog environment, organizing data into clear, secure, and well-documented catalogs, schemas, and volumes to establish organization-wide data trust.
- Lead the modernization of legacy data workflows by migrating complex logic into modular, reusable components within the Lakehouse, improving clarity and reducing technical debt.
- Design and maintain a transformation framework using Delta Live Tables or custom Spark pipelines in Python and SQL, ensuring scalability without dependence on additional proprietary SaaS platforms.
- Optimize query performance by analyzing Spark execution plans and tuning workloads: addressing data skew, refining join patterns, and improving efficiency across large datasets.
- Enhance compute efficiency through intelligent use of Z-Ordering, Liquid Clustering, partitioning strategies, and Serverless SQL Warehouse configurations to deliver high performance at lower cost.
- Develop automated CI/CD pipelines using GitHub Actions or similar tools to streamline testing, validation, and deployment of data models and pipelines.
- Design the semantic layer in Omni to enable intuitive self-service reporting and low-latency interactive dashboards.
- Occasionally build executive-facing visualizations, translating complex data into clear, insight-driven narratives.
- Collaborate with teams across Finance, Sales, Product, Marketing, and Trust & Safety to turn business challenges into scalable, future-proof data solutions.
- Communicate technical findings and cost-performance trade-offs to leadership, aligning data strategy with business outcomes.
Technology Environment
Work within a modern data stack including Databricks Lakehouse, Unity Catalog, Delta Live Tables, Python, SQL, Spark, Omni, GitHub Actions, Z-Ordering, Liquid Clustering, and Serverless SQL Warehouses.
