Senior Data Engineer (Data + Applied AI)
What You'll Do
Design and manage end-to-end data pipelines in cloud data warehouses, ensuring reliability, scalability, and compliance with healthcare regulations. You'll implement data transformation workflows using dbt across multiple layers, enforce data quality through automated testing, and maintain clear documentation and lineage tracking.
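dbt expresses quality checks declaratively in YAML, but the underlying logic is simple. A minimal sketch of `not_null`- and `unique`-style tests in plain Python (the table and column names are illustrative, not from our schema):

```python
# Data-quality checks analogous to dbt's built-in `not_null` and `unique`
# generic tests. Record and column names here are illustrative only.

def check_not_null(rows, column):
    """Return rows where the given column is missing (empty list = pass)."""
    return [r for r in rows if r.get(column) is None]

def check_unique(rows, column):
    """Return values that appear more than once in the column."""
    seen, dupes = set(), set()
    for r in rows:
        v = r.get(column)
        if v in seen:
            dupes.add(v)
        seen.add(v)
    return sorted(dupes)

# A failing example: one null MRN and one duplicated patient_id.
patients = [
    {"patient_id": 1, "mrn": "A100"},
    {"patient_id": 2, "mrn": "A101"},
    {"patient_id": 2, "mrn": None},
]

null_failures = check_not_null(patients, "mrn")
dupe_failures = check_unique(patients, "patient_id")
```

In dbt these same assertions live next to the model definition and run on every build, which is what keeps the layered models trustworthy.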
Develop and optimize Airflow DAGs to orchestrate complex data workflows, including scheduling, error recovery, and alerting. Build dimensional models and data marts that support both business intelligence and machine learning use cases, following established modeling standards.
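Airflow configures this behavior declaratively on a task (`retries`, `retry_delay`, and an `on_failure_callback` for alerting). A framework-agnostic sketch of the same error-recovery-and-alert pattern, with a hypothetical `alert` hook standing in for a pager or Slack notification:

```python
import time

def run_with_retries(task, retries=3, delay=0.01, alert=print):
    """Run `task`, retrying on failure; fire `alert` if every attempt fails.
    Mirrors the retry/alerting behavior Airflow sets up declaratively."""
    for attempt in range(1, retries + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == retries:
                alert(f"task failed after {retries} attempts: {exc}")
                raise
            time.sleep(delay)  # simple fixed backoff between attempts

calls = {"n": 0}

def flaky_extract():
    # Fails twice, then succeeds -- stands in for a transient source outage.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("source unavailable")
    return "rows loaded"

result = run_with_retries(flaky_extract)
```

The point of the sketch is the separation of concerns: the task only knows how to do its work, while scheduling, recovery, and alerting live in the orchestration layer.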
Integrate data from diverse sources—including electronic health records, payment systems, and third-party APIs—into a unified platform. Apply strict data handling protocols for PHI and PII, implementing masking, tokenization, and access controls across all systems.
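In practice tokenization is handled by dedicated tooling and managed keys, but the core idea fits in a few lines: deterministically replace an identifier with a keyed hash (HMAC) so the raw value never flows downstream. A minimal sketch; the key and field names are illustrative only:

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me"  # illustrative; use a managed secret in practice

def tokenize(value, key=SECRET_KEY):
    """Deterministically replace an identifier with a keyed hash token."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

def mask_record(record, phi_fields=("mrn", "ssn")):
    """Return a copy of the record with PHI fields tokenized."""
    return {k: tokenize(v) if k in phi_fields and v else v
            for k, v in record.items()}

raw = {"mrn": "A100", "ssn": "123-45-6789", "visit_count": 4}
safe = mask_record(raw)
```

Because the same input always yields the same token, joins across systems still work on the tokenized values, while anyone without the key cannot recover the original identifier.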
Architect and deploy retrieval-augmented generation (RAG) pipelines using frameworks like LangChain or LangGraph, covering document processing, embedding generation, and semantic retrieval. Support MLOps practices by maintaining model training pipelines, monitoring performance, and enabling retraining workflows.
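Frameworks like LangChain wrap the retrieval step, but stripped of the framework it reduces to nearest-neighbor search over embeddings. A toy sketch with hand-written vectors (a real pipeline would use a learned embedding model and a vector store, not three-dimensional literals):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings": in practice these come from an embedding model.
docs = {
    "billing policy": [0.9, 0.1, 0.0],
    "clinical note": [0.1, 0.9, 0.2],
    "appointment faq": [0.2, 0.2, 0.9],
}

def retrieve(query_vec, store, k=1):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, store[d]),
                    reverse=True)
    return ranked[:k]

# A query embedded near the "billing policy" document.
top = retrieve([0.85, 0.15, 0.05], docs, k=1)
```

Everything else in a RAG pipeline (chunking, embedding generation, prompt assembly) is scaffolding around this retrieval core, which is why retrieval quality is usually the first thing to measure.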
Collaborate with product managers, analysts, and clinical stakeholders to deliver actionable dashboards in Looker. Participate in peer code review, contribute to engineering standards, and troubleshoot pipeline failures. Document technical designs and evaluate emerging tools through prototyping and hands-on testing.
What We're Looking For
- 5+ years of experience in data engineering or analytics engineering roles
- 2+ years working with healthcare data, including familiarity with clinical workflows and regulatory environments
- Proven work with HIPAA-compliant systems, including data classification and access governance
- Hands-on expertise with cloud data warehouses (BigQuery, Snowflake, or Redshift) and advanced SQL optimization
- Production experience with dbt, including model layering, testing, and documentation
- Deep knowledge of Apache Airflow for workflow orchestration and monitoring
- Experience building star or snowflake schemas and managing slowly changing dimensions
- Experience delivering reports and dashboards using enterprise BI tools such as Looker or Power BI
- Python proficiency for pipeline development and API integrations (Pandas, PySpark)
- Practical experience with RAG pipelines and LLM integration frameworks
- Understanding of MLOps lifecycle components, including deployment and monitoring
- Experience with CI/CD systems for data workflows (e.g., GitHub Actions, dbt Cloud CI)
- Familiarity with data governance tools such as OpenMetadata, and with principles like data contracts and lineage tracking
- Strong communication skills and ability to work independently while aligning with team goals
Nice to Have
- Experience with streaming data platforms like Kafka, Kinesis, or Pub/Sub, especially for clinical event data
- Knowledge of vector databases including Pinecone, Weaviate, FAISS, or Chroma
- Understanding of responsible AI practices in healthcare, such as bias evaluation and explainability
- Exposure to data observability platforms like Monte Carlo, Bigeye, or Soda
- Familiarity with data lakehouse architectures (Delta Lake, Iceberg, Hudi)
- Experience supporting SOC2 or HITRUST compliance efforts
- Working knowledge of semantic modeling tools such as Looker’s LookML or dbt Semantic Layer
- Background with population health, revenue cycle, or clinical quality metrics
- Experience deploying ML workloads using Kubernetes or containerized environments
Technology Environment
Our stack includes Google BigQuery, dbt (Core or Cloud), Airflow, Looker, Python, Pandas, PySpark, LangChain, LangGraph, LlamaIndex, GitHub Actions, OpenMetadata, Kafka, Kinesis, Pub/Sub, Pinecone, Weaviate, FAISS, Chroma, Monte Carlo, Bigeye, Soda, Delta Lake, Iceberg, Apache Hudi, and Kubernetes.
