Senior Site Reliability Engineer

Civica
03/05/2026

Full time Information Technology Telecommunications

Job Description

As a Senior Site Reliability Engineer at Civica, you will be at the heart of our SaaS transformation, owning the reliability, performance, and security of our cloud platform.

What you will be doing:

Designing and implementing for scale & resilience: Architect, implement and continuously improve our existing Data Center and Cloud environments on AWS, Azure, and VMware, ensuring they meet our SLAs and adapt dynamically to demand, working alongside the Platform teams that provide PaaS/IaaS.
Driving automation: Build and evolve infrastructure as code (Terraform, etc.) and CI/CD pipelines (GitHub Actions, etc.) to ship new features safely and at speed.
Defining and measuring reliability: Partner with teams to set up meaningful SLIs/SLOs, implement real-time observability (Datadog, Prometheus, Grafana) and proactively identify risks before they impact our users.
Leading incident response: Own the on-call rota, coach teams through blameless post-mortems, and embed a culture of continuous improvement so outages become learning opportunities.
Mentoring & evangelism: Share your deep expertise by pairing with engineers, running brown bag sessions on reliability best practices, and helping raise the bar across our global engineering organisation.
Securing our stack: Collaborate with our Security team to embed security controls in CI/CD, runtime environments and disaster recovery plans, so our customers and citizens are always protected.

What you will need to be successful in this role

Demonstrable experience in a production SRE, DevOps or infrastructure role, ideally within a SaaS or large-scale web environment.
Expert in at least one public cloud (AWS, Azure, or GCP) and comfortable designing hybrid migrations from on-prem to cloud.
Strong coding/scripting and troubleshooting skills (in any of Go, .NET, Java, Python, etc.) and a passion for building reusable, tested libraries and tooling.
Proven track record with IaC tools (Terraform, CloudFormation, or similar) and container orchestration (Kubernetes, ECS, AKS, OpenShift).
Proven track record with virtual machine orchestration / provisioning and resiliency strategies (KubeVirt, Packer, Ansible).
Deep understanding of monitoring, logging, and tracing frameworks (Prometheus/Grafana, ELK/OpenSearch, Jaeger, etc.).
Excellent communicator who thrives in cross-functional teams, with a passion for translating complex technical issues into clear, actionable plans.

Benefits

Time Off & Work-Life Balance

25 Days Annual Leave + bank holidays - plus the option to buy up to 10 extra days!
Days of Difference - Up to 3 extra days off for volunteering.

Financial Well-being & Security

Pension Contributions - 5% employer match to support your future.
Income Protection - Up to 75% salary cover for long-term illness.
Life Assurance - 4x salary tax-free lump sum.
Critical Illness Cover - £25,000 lump sum (extendable to dependents).

Health & Perks

Private Medical Insurance - Fast access to private healthcare.
Health Cash Plan - Claim back physio, therapies & more.
Dental Insurance - Cover for routine & emergency care.
Electric Vehicle (EV) Scheme - A wide range of electric & hybrid vehicles.
Affinity Groups - Join employee-led communities.
Bounty Bonus - Refer a friend & get rewarded.

We are an equal opportunity employer. We do not discriminate based on race, ethnicity, religion, gender, sexual orientation, disability, age, or any other legally protected characteristic. Our recruitment process is designed to ensure fairness and transparency, so every candidate has an equal chance to contribute to our mission.

If you need any adjustments or accommodations to participate in our recruitment process, please let us know. We are here to support you.
