Remote Hybrid $158,000 - $168,000 a year

Plume Clinic is hiring a Senior Data Engineer (Data + Applied AI)

About the Role

What You'll Do

Design and manage end-to-end data pipelines in cloud data warehouses, ensuring reliability, scalability, and compliance with healthcare regulations. You'll implement data transformation workflows using dbt across multiple layers, enforce data quality through automated testing, and maintain clear documentation and lineage tracking.

Develop and optimize Airflow DAGs to orchestrate complex data workflows, including scheduling, error recovery, and alerting. Build dimensional models and data marts that support both business intelligence and machine learning use cases, following established modeling standards.

Integrate data from diverse sources—including electronic health records, payment systems, and third-party APIs—into a unified platform. Apply strict data handling protocols for PHI and PII, implementing masking, tokenization, and access controls across all systems.

Architect and deploy retrieval-augmented generation (RAG) pipelines using frameworks like LangChain or LangGraph, covering document processing, embedding generation, and semantic retrieval. Support MLOps practices by maintaining model training pipelines, monitoring performance, and enabling retraining workflows.

Collaborate with product managers, analysts, and clinical stakeholders to deliver actionable dashboards in Looker. Review peer code, contribute to engineering standards, and troubleshoot pipeline failures. Document technical designs and evaluate emerging tools through prototyping and hands-on testing.

What We're Looking For

  • 5+ years of experience in data engineering or analytics engineering roles
  • 2+ years working with healthcare data, including familiarity with clinical workflows and regulatory environments
  • Proven work with HIPAA-compliant systems, including data classification and access governance
  • Hands-on expertise with cloud data warehouses (BigQuery, Snowflake, or Redshift) and advanced SQL optimization
  • Production experience with dbt, including model layering, testing, and documentation
  • Deep knowledge of Apache Airflow for workflow orchestration and monitoring
  • Experience building star or snowflake schemas and managing slowly changing dimensions
  • Skill in delivering reports and dashboards using enterprise BI tools such as Looker or Power BI
  • Python proficiency for pipeline development and API integrations (Pandas, PySpark)
  • Practical experience with RAG pipelines and LLM integration frameworks
  • Understanding of MLOps lifecycle components, including deployment and monitoring
  • Experience with CI/CD systems for data workflows (e.g., GitHub Actions, dbt Cloud CI)
  • Familiarity with data governance tools like OpenMetadata and principles such as data contracts and lineage
  • Strong communication skills and ability to work independently while aligning with team goals

Nice to Have

  • Experience with streaming data platforms like Kafka, Kinesis, or Pub/Sub, especially for clinical event data
  • Knowledge of vector databases including Pinecone, Weaviate, FAISS, or Chroma
  • Understanding of responsible AI practices in healthcare, such as bias evaluation and explainability
  • Exposure to data observability platforms like Monte Carlo, Bigeye, or Soda
  • Familiarity with data lakehouse architectures (Delta Lake, Iceberg, Hudi)
  • Experience supporting SOC2 or HITRUST compliance efforts
  • Working knowledge of semantic modeling tools such as Looker’s LookML or dbt Semantic Layer
  • Background with population health, revenue cycle, or clinical quality metrics
  • Experience deploying ML workloads using Kubernetes or containerized environments

Technology Environment

Our stack includes Google BigQuery, dbt (Core or Cloud), Airflow, Looker, Python, Pandas, PySpark, LangChain, LangGraph, LlamaIndex, GitHub Actions, OpenMetadata, Kafka, Kinesis, Pub/Sub, Pinecone, Weaviate, FAISS, Chroma, Monte Carlo, Bigeye, Soda, Delta Lake, Iceberg, Apache Hudi, and Kubernetes.

Required Skills
Google BigQuery, Snowflake, Redshift, dbt, Apache Airflow, Looker, Power BI, Tableau, Qlik, Python, Pandas, PySpark, LangChain, LangGraph, LlamaIndex, SQL, query optimization, healthcare data standards, HIPAA, PHI/PII, data masking, access control
About the Company
Plume Clinic
Plume is a passion-fueled, mission-driven company that is trans-founded with a vision to transform healthcare for every trans life. We hope to make gender-affirming hormone therapy easily accessible at the touch of a button in every state of the US. This work is deeply personal and heart-driven, and we want teammates who care about the mission and the people we serve. For the right candidates, we present a rare opportunity to do well by doing good. Plume offers an affirming, trans-centered, culturally inclusive, and fun work environment filled with purpose.
Job Details
Department: Product & Technology
Category: Data
Posted: 9 days ago