Site Reliability Engineer

Huxley
15/06/2026

Full time, Information Technology, Telecommunications

Job Description

Site Reliability Engineer (Cloud & Automation) - London - 2 days on site per week

A leading global financial services organisation is seeking a Site Reliability Engineer (SRE) to drive reliability, automation, and performance across its cloud-hosted platforms.

The Opportunity

This role sits within a high-performing Platform Operations function, acting as a central point of expertise for SRE methodologies and automation. You will play a key role in improving system resilience, scalability, and operational excellence across a complex, regulated environment.

Key Responsibilities

Lead the implementation of SRE best practices across cloud infrastructure
Drive improvements in observability, alerting, and capacity planning (SLA / SLO / SLI)
Identify and reduce operational toil through automation and remediation frameworks
Build and enhance GitOps and Infrastructure-as-Code capabilities (e.g. Terraform, Ansible)
Develop and review production-grade code to support automation initiatives
Support incident management and on-call processes, ensuring production stability
Contribute to post-incident reviews, embedding SRE principles to reduce risk

Requirements

Demonstrable experience in SRE or infrastructure operations within cloud environments (AWS / GCP)
Strong scripting skills (Python, Ansible, or PowerShell)
Experience with Infrastructure as Code and GitOps methodologies
Hands-on knowledge of observability / APM tools (e.g. Grafana, Datadog, Dynatrace)
Proven experience in incident management, root cause analysis, and on-call support
Understanding of SLA/SLO/SLI frameworks and reliability engineering principles

Desirable

Background in software development
Experience working within regulated financial services environments
Familiarity with ITIL and enterprise service management frameworks
Relevant certifications (e.g. AWS, Terraform)

Why Apply

Opportunity to shape cloud reliability strategy in a large-scale environment
Work with modern tooling across automation, DevOps, and SRE practices
Strong emphasis on engineering excellence and continuous improvement
Competitive compensation and long-term career progression
