Site Reliability Engineer

  • Huxley
  • 15/06/2026
Full-time, Information Technology, Telecommunications

Job Description

Site Reliability Engineer (Cloud & Automation) - London - 2 days on site per week

A leading global financial services organisation is seeking a Site Reliability Engineer (SRE) to drive reliability, automation, and performance across its cloud-hosted platforms.

The Opportunity

This role sits within a high-performing Platform Operations function, acting as a central point of expertise for SRE methodologies and automation. You will play a key role in improving system resilience, scalability, and operational excellence across a complex, regulated environment.

Key Responsibilities
  • Lead the implementation of SRE best practices across cloud infrastructure
  • Drive improvements in observability, alerting, and capacity planning (SLA / SLO / SLI)
  • Identify and reduce operational toil through automation and remediation frameworks
  • Build and enhance GitOps and Infrastructure-as-Code capabilities (e.g. Terraform, Ansible)
  • Develop and review production-grade code to support automation initiatives
  • Support incident management and on-call processes, ensuring production stability
  • Contribute to post-incident reviews, embedding SRE principles to reduce risk
Requirements
  • Demonstrable experience in SRE or infrastructure operations within cloud environments (AWS / GCP)
  • Strong scripting skills (Python, Ansible, or PowerShell)
  • Experience with Infrastructure as Code and GitOps methodologies
  • Hands-on knowledge of observability / APM tools (e.g. Grafana, Datadog, Dynatrace)
  • Proven experience managing incidents, root cause analysis, and on-call support
  • Understanding of SLA/SLO/SLI frameworks and reliability engineering principles
Desirable
  • Background in software development
  • Experience working within regulated financial services environments
  • Familiarity with ITIL and enterprise service management frameworks
  • Relevant certifications (e.g. AWS, Terraform)
Why Apply
  • Opportunity to shape cloud reliability strategy in a large-scale environment
  • Work with modern tooling across automation, DevOps, and SRE practices
  • Strong emphasis on engineering excellence and continuous improvement
  • Competitive compensation and long-term career progression