Salary: £70,000 - £90,000 per year
Requirements
- Deep expertise in Kubernetes and/or OpenShift
- Experience working in multi-cloud or hybrid cloud environments
- Strong understanding of SRE principles, including SLOs, SLAs, error budgets, and reliability engineering
- Hands-on experience with observability tooling such as Prometheus, Grafana, OpenTelemetry, Loki, and Tempo
- Strong knowledge of Infrastructure as Code and GitOps tools such as Helm, Kustomize, ArgoCD, and Tekton
- Experience with CI/CD pipelines and automation
- Proven ability to operate as a technical leader in complex, multi-team environments
- Must be eligible for Security Clearance
Responsibilities
- Act as the technical authority for Site Reliability Engineering across complex, large-scale platforms
- Drive reliability, availability, and operational excellence across multi-team and multi-vendor environments
- Define and implement SRE strategy, standards, and best practices, including SLAs, SLOs, and error budgets
- Embed reliability principles into platform and service design from the outset
- Lead reliability reviews, operational readiness activities, and toil reduction initiatives
- Drive automation across monitoring, incident response, and remediation
- Act as the technical escalation point for major incidents and high-risk releases
- Lead blameless post-incident reviews and ensure continuous improvement
- Establish observability and capacity management practices using modern tooling
- Identify and eliminate systemic reliability risks and operational inefficiencies
- Collaborate with engineering, platform, security, and operations teams across multiple vendors
- Provide coaching and mentorship to engineers to raise SRE capability across the organisation
Technologies
- ArgoCD
- CI/CD
- Cloud
- GitOps
- Grafana
- Helm
- Kubernetes
- OpenTelemetry
- OpenShift
- Prometheus
- Security
- DevOps
- Embedded
More
We are hiring an experienced SRE Technical Lead for a senior, client-facing leadership role based in Reading with a hybrid working model across home, office, and client sites in the UK. We offer the opportunity to shape Site Reliability Engineering across complex, large-scale platforms, combining hands-on technical delivery with strategic leadership. In this role, you will help embed SRE practices across the full service lifecycle, improve reliability and operational excellence, and work closely with multiple teams and vendors.
Last updated: week 25 of 2026