This is a rare opportunity to apply serious data engineering in a domain where latency, correctness, and reliability carry direct commercial weight.
Requirements
- 6+ years of data engineering in production environments; Python expertise - idiomatic, well-tested, production-grade code, not notebook scripts
- ETL/ELT pipeline design and implementation at scale; orchestration with Airflow, Prefect, or equivalent; a reliability-first mindset including backfill, retry, and exactly-once semantics
- Azure data platform - Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Azure Data Lake Storage; infrastructure as code for data workloads (Terraform or Bicep)
- Databricks - Delta Lake, Unity Catalog, job vs interactive cluster trade-offs, cost-aware compute management, Spark job optimisation
- Relational databases: PostgreSQL at production scale - query optimisation, indexing strategies, table partitioning, replication, schema design for both OLTP and analytical workloads
- MongoDB - document modelling, aggregation pipelines, indexing strategy, replica sets; clear judgment on when document vs relational storage is the right architectural call
- Containerisation: Docker and Kubernetes-based deployment of data workloads; reproducible, environment-agnostic data infrastructure
- Data modelling for analytical workloads - dimensional modelling, data vault, or equivalent; schema evolution, slowly changing dimensions, and downstream impact analysis
- Stream and batch processing patterns; late-data handling, watermarking, and backfill strategies; throughput vs latency trade-offs in pipeline design
- Production data observability - data lineage, quality checks, SLA monitoring, alerting on freshness and completeness; treating data correctness as a first-class concern
- CI/CD for data infrastructure - version-controlled pipelines, automated data quality testing, reproducible and auditable deploys
- Ability to work directly with quant researchers, risk managers, and traders, translating business requirements into reliable, well-documented data products
Nice to Have
- Financial markets data - market data feeds (Bloomberg, Refinitiv), tick data, trade history, reference data, or instrument master management
- Apache Spark or Flink for large-scale stream and batch processing beyond the Databricks ecosystem
- dbt or equivalent SQL transformation layer; experience building and maintaining dbt projects in a production data warehouse
- Event streaming with Kafka or Confluent Platform - topic design, consumer group management, exactly-once delivery guarantees
- OLAP-optimised stores - ClickHouse, DuckDB, or equivalent; understanding of columnar storage and vectorised query execution
- Energy, commodities, or broader financial markets domain knowledge
What We're Looking For
You treat data as a product, not a side effect. You know what it takes to make a pipeline trustworthy - not just running, but observable, tested, and recoverable when something upstream changes at 3am. You think in systems: schema evolution, lineage, freshness SLAs, and the downstream impact of every modelling decision. At ETrading, that data is the foundation of billion-dollar trading decisions. You are the reason it is right.