Site Reliability Engineer (SRE)

  • Charles Simon Associates Ltd
  • 17/10/2025
Full time Information Technology Telecommunications

Job Description

Site Reliability Engineer - (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) - Permanent - Remote

Location: Remote (occasional travel to Nottinghamshire HQ)

Salary: Up to £70,000 per annum + benefits

Start Date: ASAP

Charles Simon Associates are working with a global organisation who are looking to recruit a Site Reliability Engineer (SRE) on a permanent basis. This is an exciting opportunity to join a forward-thinking business where reliability, scalability, and automation are at the heart of technology delivery.

Responsibilities include:

  • Designing and enforcing SLOs, SLIs, and SLAs to ensure high reliability and performance.
  • Building and maintaining monitoring/observability solutions (Datadog, Grafana, Azure Application Insights, Log Analytics).
  • Managing Infrastructure as Code (Terraform, Pulumi, CloudFormation) for scalable, repeatable deployments.
  • Automating with PowerShell, Python, or Bash to drive efficiency.
  • Supporting Kubernetes and AKS environments in production.
  • Leading incident response, postmortems, and continuous improvement processes.
  • Driving cost optimisation, capacity planning, and load testing.
  • Championing best practices in cloud security and resilience.

Key Skills & Experience Required:

  • Proven Site Reliability Engineering background.
  • Strong Terraform skills with live environment deployment.
  • Kubernetes / AKS expertise.
  • Scripting in PowerShell, Python or Bash.
  • Monitoring experience (Datadog preferred, Azure or Grafana considered).
  • Background in web applications and distributed systems.

Desirable Skills:

  • Knowledge of Microservices Architecture.
  • Familiarity with Kanban.
  • Experience with Puppet or Chef

If you're passionate about Site Reliability Engineering and want to work in an environment where "that will do" is never good enough, this role is for you.

Site Reliability Engineer - (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) - Permanent - Remote