Data Engineer

Abtrace

Job Description

Abtrace is solving one of the most complex, impactful problems in healthcare for a generation. The company is at an inflection point. Intelligent analysis underpins everything Abtrace does and is key to driving improvements for patients and healthcare workers.

Abtrace

The NHS is under immense pressure. Primary care teams deliver more care for a larger, longer-living population with limited time and workforce capacity. Improving health outcomes at scale requires healthcare to be more proactive, preventative, and operationally efficient. We must automate wherever possible, and thoughtfully digitise the rest.

Abtrace supports over 500 primary care practices across the UK, serving 6 million people. We automate the delivery of measurements, vaccinations, blood tests, reviews, and other routine care. We improve healthcare outcomes, reduce operational burden, and create better experiences for both staff and patients.

Healthcare professionals deserve software that is reliable, safe, modern, thoughtful, and well designed. Deep analytics and robust data infrastructure are the bedrock of that work – they are how we understand what's working, where the opportunities to improve care lie, and how we get better.

Role Overview

We're hiring our first dedicated Data Engineer to build the foundation our analytics and product teams depend on. Today, our data flows through a mix of warehouse tables and external sources that have evolved alongside the business. As we scale to support more practices and more sophisticated analysis, we need someone to shape this into a clean, centralised, well-governed platform – reliable, easy to build on, and trusted across the company.

This is a senior individual-contributor role. You'll be hands-on day to day – writing pipelines, models, and infrastructure – while owning the architectural direction as we build the foundations of our data platform. Ideally you've watched data infrastructure outgrow its early foundations before, and you know what good looks like on the other side.

Key Responsibilities

Design, build, and maintain data pipelines that ingest from a variety of sources – third-party APIs, operational databases, and file-based exports – primarily in Python on AWS.
Own and evolve our data warehouse architecture and shape where it goes next – assessing and moving toward a cleaner, centralised warehouse or lakehouse that's well-structured, reliable, and managed as code.
Build a fast, safe path from "new data needed" to "available to analysts and the business." Our current release flow is reliable but slow; you'll streamline testing, releases, and the overall experience of adding and changing models.
Implement transformation tooling so analytics logic is version-controlled, tested, and reviewable. We use dbt and intend to keep building around it.
Make it easy and safe for engineers, analysts, and product teams to access the data they need, with appropriate controls and auditability in place.
Establish monitoring, alerting, and data quality checks across critical pipelines.
Partner with analytics, engineering, and product teams to make their work faster, safer, and more reliable – including code review, mentorship on engineering practices, and improving developer experience.
Contribute to our data security and compliance posture in line with the standards and regulations we operate under in healthcare (ISO 27001, GDPR).
Help define our longer-term data platform strategy as the team grows.

What we're looking for

Solid experience as a data engineer or backend engineer working on production data systems.
Strong SQL and strong Python for data work, including with large or distributed datasets.
Experience designing and operating data pipelines in production – ingestion, transformation, orchestration.
Experience with cloud data platforms, ideally AWS. Hands-on experience choosing and standing up a warehouse or lakehouse (e.g., Redshift, Snowflake, Databricks, BigQuery, or comparable) is highly valued.
Familiarity with modern transformation and orchestration tooling – dbt, plus orchestration such as Airflow, Dagster, Step Functions, or equivalent.
Infrastructure-as-code experience (e.g., Terraform/CloudFormation/CDK) and a habit of managing data infrastructure the same way.
You've worked somewhere that grew quickly and felt first-hand how data systems built for an earlier stage start to creak – and you know how to rebuild them without bringing the business to a halt.
Comfort working in a small team where you'll make architectural decisions, not just execute them.
Clear communication – you'll work with engineers, analysts, and clinical/operational stakeholders.

Nice to Have

Experience in healthcare or other regulated industries, working with standards and regulations such as ISO 27001, GDPR, or HIPAA.
Experience with data governance at scale: classification, masking, fine-grained (row/column-level) access control, and audited access patterns.
Experience being an early data engineer at a startup.
Background in data quality, observability, or platform engineering.

Benefits

Competitive compensation
Opportunity to make a meaningful impact on healthcare outcomes
Collaborative, inclusive culture focused on learning and innovation
Ongoing professional development in emerging data technologies
Flexible working with a commitment to work-life balance