The Site Reliability Engineer plays a critical role in ensuring that our AI-driven, cloud-native platform is reliable, observable, secure, and able to scale with the organisation's growth. As we adopt intelligent agents, autonomous workflows, and increasingly complex distributed systems, the SRE ensures that resilience, performance, and operational excellence are built into everything we deliver. By partnering closely with Engineers, Architects, and the Engineering Manager, the SRE defines the patterns, tooling, and automation that enable fast, safe, and repeatable deployments. This role safeguards our production environment, drives continuous improvement across CI/CD and observability, and establishes the reliability practices that empower autonomous squads to move quickly without compromising stability. The SRE is essential to maintaining customer trust, supporting AI-first innovation, and ensuring our platform remains robust, secure, and highly available at scale.

In this position you will ensure the reliability, scalability, and security of our engineering systems. Working closely with the Engineering Manager and Head of Engineering, the SRE will identify priorities to remove friction from engineering teams, streamline processes, and enhance operational excellence. This role combines software engineering principles with systems administration to deliver robust, automated, cost-effective, and secure-by-design solutions.

Key Responsibilities

Reliability, Performance & Security: Design and implement strategies to improve system reliability, availability, and security. Ensure all solutions follow secure-by-design principles, incorporating cybersecurity best practices from inception through deployment. Conduct regular security reviews and collaborate with security teams to address vulnerabilities.

CI/CD Management: Own and optimise Continuous Integration and Continuous Deployment pipelines. Embed security checks (e.g., static analysis, dependency scanning) into CI/CD workflows. Ensure secure, efficient, and automated deployment processes across environments.

Monitoring & Observability: Implement and maintain monitoring solutions for infrastructure and applications. Develop dashboards and alerting systems to ensure proactive incident and security event management. Evaluate and integrate new observability tools as needed.

Automation & Tooling: Automate repetitive tasks to improve efficiency and reduce human error. Build and maintain internal tools that support engineering productivity and security compliance. Champion Infrastructure as Code (IaC) practices using tools like Terraform or ARM templates.

Cloud Infrastructure Management: Manage and optimise services across AWS and Azure environments. Ensure scalability, resilience, and security of service-based architectures. Implement cost management strategies to optimise cloud spend without compromising performance or security.

Incident Response & Root Cause Analysis: Lead incident response efforts, including security incidents, and conduct post-mortem reviews. Drive continuous improvement through lessons learned and preventive measures.

Skills & Experience

- Proven experience in AWS and Azure cloud environments.
- Strong background in CI/CD tools (e.g., Azure DevOps, Pipelines, GitHub Actions, Jenkins).
- Expertise in monitoring and observability platforms (e.g., Prometheus, Grafana, Datadog).
- Proficiency in scripting and automation (Python, Bash, PowerShell).
- Familiarity with containerisation and orchestration (Docker, Kubernetes).
- Solid understanding of networking, security, and cost optimisation in cloud environments.
- Knowledge of cybersecurity principles, secure coding practices, and compliance frameworks.
- A problem-solver with a proactive mindset.
- Comfortable working in fast-paced, evolving environments.
- Strong communicator who can bridge gaps between operations, development, and security teams.
- Passionate about automation, scalability, cost efficiency, and security.
01/04/2026
Full time
Senior Site Reliability Engineer (Observability)
Location: London/UK (Remote)
Contract: 12 Months Initial
Rate: £55 - £62 per hour, Inside IR35

Job Overview

We are looking for a Senior Site Reliability Engineer with strong experience in Observability, Monitoring and Distributed Systems to support large-scale cloud infrastructure serving millions of devices globally. The role focuses on building and scaling monitoring, logging and alerting platforms to ensure high availability and performance of cloud services.

Responsibilities

- Design, deploy and scale observability platforms
- Manage and scale Prometheus monitoring systems
- Deploy and maintain large Elasticsearch clusters
- Build and maintain data pipelines using Kafka
- Develop alerting and monitoring frameworks
- Automate infrastructure using Terraform and Ansible
- Develop tools and scripts using Python, Go, Ruby or Bash
- Work with Linux systems (Debian/Ubuntu)
- Participate in on-call rotation
- Improve system reliability, performance and scalability

Required Skills

- 5+ years' experience in Site Reliability Engineering / DevOps
- Strong Linux systems experience
- Observability and Monitoring tools experience
- Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana)
- Kafka
- Terraform / Infrastructure as Code
- Ansible / Configuration Management
- Programming experience (Python, Go, Ruby or Bash)
- Distributed systems and cloud infrastructure experience

This is an urgent vacancy and the hiring manager is shortlisting for interview immediately. Please apply with a copy of your CV or send it to khushboo. Co. uk

Randstad Technologies is acting as an Employment Business in relation to this vacancy.
01/04/2026
Contractor
Site Reliability Engineer
Central London (3 days a week in the office)
Up to £70,000 per annum + Bonus + Generous Benefits Package

We are working with an exciting technology company that is looking to bring in a Site Reliability Engineer to help scale its cloud infrastructure and DevOps capability. They've built a high-performing engineering team and are now investing further in the platform side of things as demand grows. Think modern, cloud-native architecture, and a real emphasis on automation, scalability, and developer enablement. You'll have the autonomy to make technical decisions and help shape how platform engineering is done as the team continues to scale.

Tech stack

- AWS (Core services - EC2, RDS, S3, IAM, etc.)
- Monitoring and Observability: Grafana, Prometheus, Datadog
- Kubernetes (building and managing production clusters)
- Terraform (IaC provisioning)
- Python, Bash or Go (scripting, automation)
- GitHub Actions (CI/CD pipelines)

What They're Looking For

- Experience in AWS cloud infrastructure (ideally in a regulated or high-traffic environment)
- Previous experience working with Monitoring and Observability tools
- Hands-on Kubernetes know-how, specifically with EKS
- Solid IaC experience with Terraform
- Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar)
- Solid scripting/automation experience with Python, Bash or Go
- A good communicator who enjoys working collaboratively across product and engineering

Desirable

- Certifications - CKA, CKAD, AWS Solutions Architect etc.

The client is willing to consider candidates without all the required skills and provide an environment to learn and grow on the job. Training and development are at the forefront of the business, and you will get plenty of opportunities to progress your career along whatever path you want.

Click APPLY NOW to be considered for this position!

AWS, SRE, Cloud, Kubernetes, EKS, Terraform, CI/CD, Automation etc.
01/04/2026
Full time
Site Reliability Engineer
Bristol, Hybrid (3 days onsite, 2 from home)
Up to £95K & Great Benefits

Ready to take on high-impact engineering challenges that actually matter? Want to work on mission-critical systems used across the UK's most high-profile government organisations? This is your chance to join TwinStream, a team of elite engineers who built their careers cracking complex cross-domain problems, and then built a company to do it even better.

We're growing fast. Demand for our services is skyrocketing. And now we're looking for a Site Reliability Engineer who's ready to step into a role with real ownership, real influence, and real opportunities to innovate.

Why You'll Love This Role

As our new SRE, you'll be right at the heart of our evolving cloud and on-prem platforms. This isn't a "keep the lights on" job; it's a role where you'll shape infrastructure strategy, partner closely with software and systems teams, and push performance, reliability, and automation to the next level. You'll help us evolve observability, enhance delivery pipelines, eliminate toil, drive reliability metrics, and make smart technical decisions that keep our systems robust as we scale. If you love solving gnarly problems, improving how things work, and innovating at speed, this is the role for you.

Key Responsibilities of the Site Reliability Engineer:

- Collaborating with Software Engineers to improve subsystem reliability and performance
- Partnering with System Administrators to automate toil and cut down alert noise
- Taking observability to the next level: finding issues before they hit the business
- Supporting development environments to boost speed and quality
- Researching & evaluating tools to guide key buy-vs-build decisions
- Deepening your expertise across multiple technical and business domains
- Expanding your knowledge of diverse tech stacks and platforms

What You Bring

- Modern configuration management tools (Ansible, Chef or similar)
- Terraform
- Docker containers & orchestration (Kubernetes, OpenShift, Docker Swarm)
- CI/CD tooling (Jenkins or similar)
- Monitoring/metrics stack (InfluxDB, Prometheus, Grafana)
- MQ messaging (RabbitMQ or other AMQP solutions)
- SQL & relational databases
- Linux administration & shell scripting
- Network security fundamentals
- Cloud hosting (ideally AWS: EC2, RDS, S3, Lambda)

Bonus points for:

- Experience with Java, Go, Python or similar
- Knowledge of cross-domain principles & tech
- Service management experience
- Hands-on observability implementation
- Proven ability to reduce downtime with smart reliability metrics

Why You'll Love Working at TwinStream

- Competitive salary, £65k - £95k DOE
- 8% employer pension contribution
- Private medical healthcare (including dental & optical for the whole family)
- Flexible working culture
- Learning & development owned by YOU
- Electric vehicle salary-sacrifice scheme
- 28 days holiday + bank holidays
- Regular team events, plus Christmas & summer parties
- Life assurance & cycle-to-work scheme

Security Clearance

You'll need to be eligible for SC and/or DV clearance. Any offer will be subject to successful security screening.

Ready to engineer impact? Apply now and shape the future of secure, high-performance cross-domain systems.
31/03/2026
Full time
eDV DevOps Engineer / Site Reliability Engineer (SRE) - AWS, Kubernetes - Contract Outside IR35

We are supporting a specialist engineering consultancy delivering secure technology platforms to high-profile UK government organisations. They are seeking an eDV Cleared DevOps Engineer / Site Reliability Engineer (SRE) with strong experience across AWS, Kubernetes, Terraform, CI/CD and Linux environments to support the continued growth of critical cross-domain systems.

This contract role will focus on improving platform reliability, automation, infrastructure as code, observability and DevOps practices across both cloud and on-premise environments. You will work closely with software engineers, platform engineers and operations teams to ensure highly secure, scalable and resilient systems supporting sensitive government programmes.

Location: Cheltenham (Hybrid - 3 days onsite)
Rate: £500 - £650 per day, Outside IR35
Security Clearance: Active eDV Clearance required
Start Date: ASAP

As a DevOps / Site Reliability Engineer, you will be responsible for ensuring the availability, performance, and reliability of services supporting sensitive government programmes. You will collaborate with multiple feature development teams and BAU/support teams to evolve both cloud and on-premise infrastructure, delivery pipelines, and observability tooling. The role will focus on improving system reliability, monitoring, automation, and performance, while proactively identifying and mitigating operational risks. This position may also involve participation in an on-call rota, which could include occasional 24/7 call-out support.

Key Responsibilities:

- Collaborate with software engineering teams to improve subsystem reliability and performance.
- Work with system administrators to automate operational processes and reduce manual effort.
- Enhance monitoring and observability capabilities to proactively detect and resolve issues.
- Support development environments to improve delivery speed and quality.
- Contribute to the evolution of infrastructure, DevOps practices, and CI/CD pipelines.
- Research and evaluate new technologies and tools to support engineering decisions.
- Develop expertise across multiple technical and business domains.

Required Skills & Experience

- Active eDV clearance is essential
- Configuration management tools such as Ansible, Chef, or similar
- Strong Terraform
- Docker containers and container orchestration platforms (Kubernetes, OpenShift, Docker Swarm)
- Maintaining and using CI/CD tooling such as Jenkins
- Monitoring and observability experience with Prometheus, Grafana, or InfluxDB
- Event-driven integration and messaging systems such as RabbitMQ or other AMQP solutions
- Strong Linux command line, administration, and shell scripting experience
- Solid understanding of relational databases and SQL
- Network security protocols
- Working with cloud platforms, ideally AWS (EC2, RDS, S3, Lambda)
- Azure a plus

Please send your CV to Laura at (url removed) to progress matters.

Services advertised are those of an Employment Business.
31/03/2026
Contractor
Senior Site Reliability Engineer (SRE)
Remote
12-month contract (high chance of extension)

Job Description

Join a global pioneer in the video game industry and own the reliability of high-traffic, revenue-critical platforms used by millions worldwide. As a Senior SRE, you'll shape the architecture, improve platform-wide resiliency, and ensure services stay performant, scalable, and secure. This isn't just about maintaining a single system; you'll influence reliability across multiple services, driving improvements that touch the entire ecosystem.

Key Responsibilities

- Lead incident response and troubleshooting for production systems, resolving high-severity issues and driving post-incident improvements.
- Influence architecture to improve platform-wide reliability, resiliency, and operational efficiency, ensuring services remain available under heavy load.
- Drive containerisation best practices and manage Kubernetes-based workloads at scale.
- Build and maintain event-driven architectures that scale globally while ensuring fault-tolerance and high availability.
- Automate infrastructure provisioning, deployment, and monitoring using Infrastructure as Code (Terraform, CloudFormation, Ansible, CDK).
- Collaborate with engineering, product, and security teams to define SLOs, SLIs, and error budgets across services.
- Provide mentorship, advocate SRE best practices, and ensure teams are empowered to deliver resilient, reliable systems.

Experience / Must-Have Skills

- Extensive experience in AWS and AWS-managed services (EC2, Lambda, S3, VPC, CloudWatch, CloudTrail, IAM, EKS, Service Catalog, multi-account environments).
- Strong Kubernetes / container orchestration experience, including EKS, OpenShift, Docker, and service mesh.
- Deep understanding of networking fundamentals: DNS, VPCs, routing, load balancing, TCP/IP, firewall policies.
- Proven track record in incident response and troubleshooting at scale.
- Hands-on experience with infrastructure automation and CI/CD pipelines.
- Experience designing event-driven architectures and resilient systems.
- High level of autonomy, able to influence platform-wide decisions and architect for reliability across services.
- Ability and desire to mentor junior staff.
- Bonus: experience in gaming, interactive entertainment, or other high-traffic, global-scale platforms.

If you are interested in this role, please feel free to submit your CV.
31/03/2026
Contractor
Job Title: SRE Transformation Lead / Senior SRE Engineer (Global Banking & Payments)
Contract Length: 12 months
Location: Bromley / London (3 days a week)
Working Pattern: Full Time
Are you ready to lead a transformative journey in the world of Global Banking and Payments? Our client is seeking a passionate and experienced SRE Transformation Lead to help shape and scale Site Reliability Engineering (SRE) practices across a highly regulated banking environment. This is your chance to drive innovation, foster collaboration, and make a real impact on service reliability!
Role Overview:
As the SRE Transformation Lead / Senior SRE Engineer, you will lead and accelerate the transformation from traditional L2 production support toward an SRE operating model. Your hands-on experience will be crucial in defining and implementing SRE practices across critical banking and payment services, ensuring measurable reliability outcomes and streamlined operations.
Required Skills:
Significant experience in Site Reliability Engineering and implementing SRE practices across large-scale, complex services in essential
Demonstrated experience leading an SRE transformation in a corporate banking environment (or similarly regulated financial services enterprise).
Proven ability to implement and scale SLO/SLI and error budget approaches, and to operationalize them across multiple teams and services.
Strong engineering background with the ability to drive automation and reduce manual toil through code, tooling, and process redesign.
Deep knowledge of incident response, problem management, root cause analysis, and operational resilience practices in mission-critical environments.
Strong stakeholder management skills, able to influence technology and business partners and communicate effectively at senior levels.
Key Responsibilities:
SRE Operating Model & Transformation: Lead the design and execution of the SRE adoption strategy, transitioning teams to a reliability engineering mindset.
Reliability Measurement: Drive the implementation of Critical User Journeys, SLIs, SLOs, and error budgets to align metrics with user experience and business objectives.
Toil Reduction & Automation: Identify and eliminate operational toil through automation, enhancing engineering practices and operational tooling.
Incident & Problem Management: Strengthen incident response frameworks and improve production outcomes through effective root cause analysis and preventive engineering.
Observability & Tooling: Establish observability standards to enhance service monitoring, partnering with teams to align SRE needs with enterprise tooling.
Stakeholder Management: Influence leaders across operations and engineering, driving the adoption of SRE principles and fostering a culture of reliability.
Preferred Qualifications:
Experience with high-availability banking platforms and 24x7 operational expectations.
Familiarity with observability tools and building SRE communities of practice.
Why Join Us?
Be a Pioneer: Lead the charge in transforming how reliability engineering is approached in the banking sector.
Collaborative Environment: Work with a diverse team that values innovation, teamwork, and excellence.
Professional Growth: Take on a pivotal role that will challenge and expand your skills in a dynamic and fast-paced industry.
Are you ready to take the next step in your career and make a lasting impact? If you have the expertise and enthusiasm for driving SRE transformation, we want to hear from you! Apply Now! Join our client in revolutionising the Global Banking & Payments landscape. Your journey toward making a difference starts here!
Pontoon is an employment consultancy. We put expertise, energy, and enthusiasm into improving everyone's chance of being part of the workplace. We respect and appreciate people of all ethnicities, generations, religious beliefs, sexual orientations, gender identities, and more. We do this by showcasing their talents, skills, and unique experience in an inclusive environment that helps them thrive. If you require reasonable adjustments at any stage, please let us know and we will be happy to support you.
31/03/2026
Contractor
Senior Site Reliability Engineer (Observability)
Location: London/UK (Remote)
Contract: 12 Months Initial
Rate: 55 Per Hour - 62 Per Hour, Inside IR35
Job Overview
We are looking for a Senior Site Reliability Engineer with strong experience in Observability, Monitoring and Distributed Systems to support large-scale cloud infrastructure supporting millions of devices globally. The role focuses on building and scaling monitoring, logging and alerting platforms to ensure high availability and performance of cloud services.
Responsibilities
Design, deploy and scale observability platforms
Manage and scale Prometheus monitoring systems
Deploy and maintain large Elasticsearch clusters
Build and maintain data pipelines using Kafka
Develop alerting and monitoring frameworks
Automate infrastructure using Terraform and Ansible
Develop tools and scripts using Python, Go, Ruby or Bash
Work with Linux systems (Debian/Ubuntu)
Participate in on-call rotation
Improve system reliability, performance and scalability
Required Skills
5+ years' experience in Site Reliability Engineering / DevOps
Strong Linux systems experience
Observability and Monitoring tools experience
Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana)
Kafka
Terraform / Infrastructure as Code
Ansible / Configuration Management
Programming experience (Python, Go, Ruby or Bash)
Distributed systems and cloud infrastructure experience
This is an urgent vacancy where the hiring manager is shortlisting for an interview immediately. Please apply with a copy of your CV or send it khushboo. Co. uk
Randstad Technologies is acting as an Employment Business in relation to this vacancy.
31/03/2026
Contractor
Principal Developer Team Lead
Salary: £51,400 - £68,800
Location: Cambridge/Hybrid
Contract: Permanent
This Principal Developer Team Lead position offers a pivotal opportunity to shape the technical future of a world-renowned academic organisation. You'll spearhead the migration of enterprise systems to cutting-edge cloud-native AWS architectures, while balancing hands-on technical leadership with people management responsibilities.
We are Cambridge University Press & Assessment, a world-leading academic publisher and assessment organisation and a proud part of the University of Cambridge.
About the role
We're seeking a hands-on Principal Developer Team Lead to drive the technical transformation of our Exam Technology Organisation as we migrate legacy enterprise applications to modern, cloud-native architectures on AWS.
You'll balance technical leadership with people management, leading a team of 4-8 developers while establishing the foundations for our future technology stack. Your initial focus will be on two strategic priorities:
Evolving our SRE function - Building the DevOps infrastructure, automation, and tooling that enables Site Reliability Engineering practices across development and operations teams
Advancing our AI development practice - Establishing standards, frameworks, and best practices for responsibly integrating AI capabilities into our education platforms.
What You'll Do
Technical Leadership
Lead migration of legacy applications to cloud-native AWS architectures
Build DevOps automation to support SRE practices
Establish AI/ML development standards and frameworks
Set observability, monitoring, and incident response standards
Promote best practices in web, event-driven, and cloud-native technologies
Provide technical expertise and oversee code reviews
People Leadership
Manage and mentor a team of 4–8 developers, providing coaching and development plans and identifying training needs in AI/ML and SRE.
Support recruitment and foster a culture of continual improvement and wellbeing.
Delivery & Collaboration
Deliver software in agile squads
Collaborate with architects, SREs, product owners, and infrastructure teams
Liaise with stakeholders to identify education sector needs
Plan and estimate migrations and feature delivery
Coordinate with service management, security, and AWS experts
About you
Essential experience
Degree or equivalent
Proven technical team leadership
Skilled in two or more modern programming languages
Experience with AWS cloud and infrastructure
DevOps skills: automation, CI/CD, infrastructure-as-code
Understanding of SRE and observability
Experience in web-apps and modern frameworks
Strong communicator with technical and non-technical audiences
Technical Expertise
CI/CD pipelines, automation frameworks, and developer tooling
Observability tools, monitoring, logging, and alerting systems
Responsible AI practices and governance
Event-driven architecture and microservices patterns
Software design patterns and scalability best practices
Security principles in cloud environments
Leadership Qualities
Ability to set technical standards and provide thought leadership
Experience balancing people management with hands-on contribution
Strong mentoring and coaching skills
Collaborative approach that builds trust across teams
Passion for continuous learning in AI/ML and DevOps
Promotes inclusion and continuous improvement
You'll be instrumental in our digital transformation, establishing the foundations for reliable, innovative systems that serve millions of learners, teachers, and researchers worldwide. By evolving our SRE function and advancing our AI practice, you'll empower teams to deliver high-performance solutions while responsibly harnessing cutting-edge technologies.
If you would like to know more about this opportunity and what will make you successful, please see the full job description attached to the bottom of this vacancy on our careers site.
Rewards and benefits
We will support you to be at your best in work and to live well outside of it. In addition to competitive salaries, we offer a world-class, flexible rewards package, featuring family-friendly and planet-friendly benefits including:
28 days annual leave plus bank holidays
Private medical and Permanent Health Insurance
Discretionary annual bonus
Group personal pension scheme
Life assurance up to 4 x annual salary
Green travel schemes
We are a hybrid working organisation, and we offer a range of flexible working options from day one. We expect most hybrid-working colleagues to spend 40-60% of their time at their dedicated office or location. We will also consider other work arrangements if you wish to work more flexibly or require adjustments due to a disability.
Ready to pursue your potential? Apply now.
We review applications on an ongoing basis, with a closing date for all applications being 18 February 2026.
If you are shortlisted and progress through the stages, you can expect:
A 40-minute screening call with the Hiring Manager.
First-stage interview via MS Teams or in person. You will be given a brief for a role-related task, which must be completed and returned by email before your interview.
Please note that successful applicants will be subject to satisfactory background checks including DBS due to working in a regulated industry.
Cambridge University Press & Assessment is an approved UK employer for the sponsorship of eligible roles and applicants under the Skilled Worker visa route. Please refer to the gov.uk website for guidance to understand your own eligibility based on the role you are applying for.
Why join us
Joining us is your opportunity to pursue potential. You'll belong to a collaborative team that's exploring new and better ways to serve students, teachers and researchers across the globe – for the benefit of individuals, society and the world. Sharing our mission will inspire your own growth, development and progress, in an environment which embraces difference, change and aspiration.
Cambridge University Press & Assessment is committed to being a place where anyone can enjoy a successful career, where it's safe to speak up, and where we learn continuously to improve together. We welcome applications from all candidates, regardless of demographic characteristics (age, disability, educational attainment, ethnicity, gender, marital status, neurodiversity, religion, sex, gender identity and sexual identity), cultural, or social class/background.
We believe better outcomes come through diversity of thought, background and approach. We welcome applications from people from all backgrounds and communities, actively seeking to employ people from a wide range of different communities.
04/02/2026
Full time
Cambridge University Press & Assessment
Cambridge/Hybrid (with 2-3 days per week in office)
Job Title: English Technology Platform SRE Team Lead
Salary: £68,600 - £91,700
Location: Cambridge/Hybrid (with 2-3 days per week in office)
Contract: Permanent
Hours: Full time
Are you ready to shape the future of technology platforms at the heart of Cambridge's academic excellence? Join us as our English Technology Platform SRE Team Lead and help drive innovation, reliability, and intelligent automation in a world-class environment.
We are Cambridge University Press & Assessment, a world-leading academic publisher and assessment organisation and a proud part of the University of Cambridge.
About the role
The SRE Team Lead will lead a mature Site Reliability Engineering function within the Platform Operations Team, working closely with Platform Support and Engineering teams. This role demands strong thought leadership, technical depth, and strategic direction for the discipline, with a particular emphasis on leveraging AI-driven operations (AIOps) and FinOps practices to optimise reliability, performance, and cloud spend.
Although this is a hands-on technical role, the SRE Team Lead will also manage a small team of SREs, providing clear direction and ensuring consistent, data-driven, AI-enhanced service delivery across the platforms while working collaboratively with existing support and engineering groups.
Apply core SRE and DevOps principles—culture, automation, testing, measurement, and continuous improvement—to build and optimise pipelines focused on rapid, reliable software delivery. Integrate AIOps capabilities, such as automated anomaly detection and intelligent alerting, to further enhance operational excellence.
Work with Solutions Architecture, Development, and QA teams to automate processes wherever possible, creating and improving stable CI/CD pipelines for both software and infrastructure. Develop tools that enable rapid provisioning of environments and resources across all teams, incorporating AI-assisted automation where beneficial.
Use automation, observability, and monitoring tools to improve site reliability and proactively identify issues. Support development teams with troubleshooting, particularly in infrastructure, networking, and multi-tier application design. Serve as a subject matter expert for cloud services—especially AWS PaaS—while applying FinOps practices to ensure cloud cost transparency, optimisation, and efficient resource usage.
Create and maintain robust technical documentation for the infrastructure of the English platforms, including operational runbooks enhanced with predictive and AI-supported insights.
Stay engaged with developments in the SRE, DevOps, AIOps, and FinOps communities, continually introducing new practices and technologies to improve reliability, performance, automation, and cloud cost efficiency.
This position has been classified as a hybrid role, requiring the selected candidate to typically spend 40-60% of their time collaborating and connecting face-to-face at their dedicated location. Aside from our hybrid principles, other flexible working requests will be considered from the first day of employment, including other work arrangements should you require adjustments due to a disability or long-term health condition.
About you
A passion for Site Reliability Engineering and a drive to understand, anticipate, and counter platform-related issues before they become problems, while staying up to date with the latest technological trends and developments.
Great communication skills, allowing effective collaboration across technical leadership and various business stakeholders, with the ability to present ideas and strategies clearly and persuasively.
Demonstrable soft skills in motivating, inspiring and leading a team (direct line management is not part of the role's remit).
Educated to degree level or equivalent, with a minimum of 5 years' proven experience in a systems administration or blended DevOps role.
Experience implementing technologies such as Terraform, GitHub Actions, and containerisation/orchestration tools, e.g. Kubernetes and Docker.
Expertise in monitoring tools such as New Relic, Grafana, Alertmanager and Site24x7.
Extensive knowledge of cloud computing infrastructure, especially Amazon Web Services (EKS, ECS, RDS, Route 53 etc.).
Excellent troubleshooting, debugging, communication and documentation skills
Experience of working within an Agile product development environment.
For a detailed job description, please refer to the link at the bottom of the advert on our careers site.
We are a Disability Confident (DC) employer that is committed to equality and inclusion ensuring our recruitment process is accessible to all. The DC scheme's Offer of an Interview commitment applies to applicants who opt in, and disclose a disability or a long-term health condition, and best meet the minimum criteria for the role. In instances where interviewing all qualifying candidates is not practicable, we prioritise those who best meet the minimum criteria, as we would for applicants who do not have a disability or long-term health condition.
Cambridge University Press & Assessment is an approved UK employer for the sponsorship of eligible roles and applicants under the Skilled Worker visa route. Please refer to the gov.uk website for guidance to understand your own eligibility based on the role you are applying for.
Rewards and benefits
We will support you to be at your best in work and to live well outside of it. In addition to competitive salaries, we offer a world-class, flexible rewards package, featuring family-friendly and planet-friendly benefits including:
28 days annual leave plus bank holidays
Private medical and Permanent Health Insurance
Discretionary annual bonus
Group personal pension scheme
Life assurance up to 4 x annual salary
Green travel schemes
Ready to pursue your potential? Apply now.
We aim to support candidates by making our interview process clear and transparent. The closing date for all applications will be 4th February. We will review applications on an ongoing basis, and shortlisted candidates can expect interviews to take place shortly after it closes.
If you are shortlisted and progress through the stages, you can expect:
A 15-minute screening call with the Hiring Manager.
Final stage virtual interview via MS Teams.
If you require any reasonable adjustments during the recruitment process due to a disability or a long-term health condition, there will be an opportunity for you to inform us via the online application form. We will do our best to accommodate your needs.
Please note that successful applicants will be subject to satisfactory background checks including DBS due to working in a regulated industry.
We are committed to an equitable recruitment process. As such, applications must be submitted via our official online application procedure. Please refrain from sending your CV directly to our recruiters. If you experience technical difficulties or require additional support with submitting your online application, contact the Recruiter.
Why join us
Joining us is your opportunity to pursue potential. You will belong to a collaborative team that is exploring new and better ways to serve students, teachers and researchers across the globe – for the benefit of individuals, society and the world. Sharing our mission will inspire your own growth, development and progress, in an environment which embraces difference, change and aspiration.
Cambridge University Press & Assessment is committed to being a place where anyone can enjoy a successful career, where it is safe to speak up, and where we learn continuously to improve together. We welcome applications from all candidates, regardless of demographic characteristics (age, disability, educational attainment, ethnicity, gender, marital status, neurodiversity, religion, sex, gender identity and sexual identity) or cultural or social class/background.
We believe better outcomes come through diversity of thought, background and approach. We welcome applications from people from all backgrounds and communities, actively seeking to employ people from a wide range of different communities.
If you are ready to take the next step in your Cambridge journey, we welcome your application. Together, we continue to shape a culture where everyone feels empowered to succeed and motivated to make a difference, for ourselves, for each other, and for learners worldwide.
21/01/2026
Full time
Job Title: English Technology Platform SRE Team Lead
Salary: £68,600 - £91,700
Location: Cambridge/Hybrid (with 2-3 days per week in office)
Contract: Permanent
Hours: Full time
Are you ready to shape the future of technology platforms at the heart of Cambridge's academic excellence? Join us as our English Technology Platform SRE Team Lead and help drive innovation, reliability, and intelligent automation in a world-class environment.
We are Cambridge University Press & Assessment, a world-leading academic publisher and assessment organisation and a proud part of the University of Cambridge.
About the role
The SRE Team Lead will lead a mature Site Reliability Engineering function within the Platform Operations Team, working closely with Platform Support and Engineering teams. This role demands strong thought leadership, technical depth, and strategic direction for the discipline, with a particular emphasis on leveraging AI-driven operations (AIOps) and FinOps practices to optimise reliability, performance, and cloud spend.
Although this is a hands-on technical role, the SRE Team Lead will also manage a small team of SREs, providing clear direction and ensuring consistent, data-driven, AI-enhanced service delivery across the platforms while working collaboratively with existing support and engineering groups.
Apply core SRE and DevOps principles—culture, automation, testing, measurement, and continuous improvement—to build and optimise pipelines focused on rapid, reliable software delivery. Integrate AIOps capabilities, such as automated anomaly detection and intelligent alerting, to further enhance operational excellence.
Work with Solutions Architecture, Development, and QA teams to automate processes wherever possible, creating and improving stable CI/CD pipelines for both software and infrastructure. Develop tools that enable rapid provisioning of environments and resources across all teams, incorporating AI-assisted automation where beneficial.
Use automation, observability, and monitoring tools to improve site reliability and proactively identify issues. Support development teams with troubleshooting, particularly in infrastructure, networking, and multi-tier application design. Serve as a subject matter expert for cloud services—especially AWS PaaS—while applying FinOps practices to ensure cloud cost transparency, optimisation, and efficient resource usage.
Create and maintain robust technical documentation for the infrastructure of the English platforms, including operational runbooks enhanced with predictive and AI-supported insights.
Stay engaged with developments in the SRE, DevOps, AIOps, and FinOps communities, continually introducing new practices and technologies to improve reliability, performance, automation, and cloud cost efficiency.
This position has been classified as a hybrid role, requiring the selected candidate to typically spend 40-60% of their time collaborating and connecting face-to-face at their dedicated location. Aside from our hybrid principles, other flexible working requests will be considered from the first day of employment, including other work arrangements should you require adjustments due to a disability or long-term health condition.
About you
A passion for site reliability engineering, with the drive to understand, anticipate, and counter platform-related issues before they become problems, and to stay up to date with the latest technological trends and developments.
Great communication skills, allowing effective collaboration across technical leadership and various business stakeholders, with the ability to present ideas and strategies clearly and persuasively.
Demonstrable soft skills in motivating, inspiring and leading a team (direct line management is not part of the role's remit).
Educated to degree level or equivalent, with a minimum of 5 years' proven experience in a systems administration or DevOps blended role.
Experience implementing technologies such as Terraform, GitHub Actions, and containerisation/orchestration, e.g. Kubernetes and Docker.
Expertise in monitoring tools like New Relic, Grafana, Alertmanager and Site24x7.
Extensive knowledge of cloud computing infrastructure, especially using Amazon Web Services (EKS, ECS, RDS, Route 53 etc.).
Excellent troubleshooting, debugging, communication and documentation skills.
Experience of working within an Agile product development environment.
For a detailed job description, please refer to the link at the bottom of the advert on our careers site.
AVP Infrastructure Cloud Support - AWS, Terraform, Python, DevOps, SRE - Permanent
Job purpose
This role supports the AWS public cloud infrastructure and the implementation of Infrastructure as Code using Terraform. The role will work closely with the SRE and Engineering teams to ensure that the Cloud environment has sufficient observability and is appropriately managed.
What you will be doing:
Responsible for ensuring the Production service is prioritized, with all service incidents, problems and requests for cloud hosted services responded to and actioned.
Responsible for maintaining the reliability and security of the Cloud Hosted environments.
Improve Observability and Telemetry in the Cloud Hosted environments, utilizing SRE methodology to give SLAs, SLOs and SLIs.
Ensure risks within the Cloud hosted environment are documented and regularly reviewed. Identified operational risk issues are captured with appropriate actions tracked to agreed timelines.
Define and implement standards and procedures to adhere to current best practice and drive continual service improvement.
Responsible for ensuring Security standards are implemented and maintained in the Cloud hosted environment, including delivery of upgrades and security updates to minimise risk and ensure stability for all cloud hosted services.
Responsible for maintaining service resilience for all cloud hosted services, including backup and disaster recovery processes. Where necessary, plan and conduct quarterly DR tests for all cloud hosted services, ensuring any findings are captured and addressed promptly.
What we're looking for:
Must have strong technical operational skills in supporting AWS Cloud Hosted environments and at least 3 years in an Infrastructure support role.
Strong understanding of Infrastructure as Code technologies, ideally including Terraform and Ansible.
Operational risk and control management processes, including an understanding of Security best practice and how to apply this safely within a Production environment.
Asset management and life cycle (EOS/EOL) process management.
Planning and leading disaster recovery fail-overs of IT systems and services.
Preferably experience of working in a regulated financial services/banking organization.
Able to understand and use AWS, including an understanding of AWS services, security and networking.
Knowledge of at least 1 programming language, preferably Python.
Knowledge of CI/CD specifically relating to Cloud Hosted environments, including an understanding of some of the Infrastructure as Code tools: Git, Terraform, Ansible, Jenkins.
Permanent Role - Hybrid working (Central London based) - Candidate must be eligible to work in the UK
By applying to this job you are sending us your CV, which may contain personal information. Please refer to our Privacy Notice to understand how we process this information. In short, in order to supply you with work finding services, we will hold and process your personal data, and only with your express permission will we share this personal data with a client (or a third party working on behalf of the client) by email or by upload to the client's or third party's vendor management system. By giving us permission to send your CV to a client, you give permission to share the personal data that would be necessary to consider your application, interview you (phone/video/face to face) and, if successful, hire you. Scope AT acts as an employment agency for Permanent Recruitment and an employment business for the supply of temporary workers. By applying for this job you accept the Terms and Conditions, Data Protection Policy, Privacy Notice and Disclaimers which can be found at our website.
06/10/2025
Full time
Senior Site Reliability Engineer
6 months
Remote
£Negotiable - INSIDE IR35
Tech Stack
Multiple Platforms and Applications
AWS and Azure - Cloud
Mainframe skills would be handy
Latest applications on Cloud
DevOps skills would be helpful
Attitude of being part of the team and owning the outcomes
Advocate - to change the culture to SRE
Disclaimer: This vacancy is being advertised by either Advanced Resource Managers Limited, Advanced Resource Managers IT Limited or Advanced Resource Managers Engineering Limited ("ARM"). ARM is a specialist talent acquisition and management consultancy. We provide technical contingency recruitment and a portfolio of more complex resource solutions. Our specialist recruitment divisions cover the entire technical arena, including some of the most economically and strategically important industries in the UK and the world today. We will never send your CV without your permission. Where the role is marked as Outside IR35 in the advertisement, this is subject to receipt of a final Status Determination Statement from the end Client and may be subject to change.
06/10/2025
Contractor
Job Description
Electrical Control and Instrumentation Systems Engineer
Full time
Derby, Onsite with flexible working (In office 3 days a week, WFH 2 days a week)
Multiple exciting opportunities have arisen for an Electrical Control and Instrumentation Systems Engineer to work on the new generation of Submarine EC&I, Dreadnought and SSNA. The EC&I Sub-system teams are responsible for the end-to-end design of the reactor C&I, integration, commissioning, power systems control and sensors. The lifecycle of design work ranges from product concept, detailed design, V&V, production and build to commissioning support. There are opportunities in various areas of the Sub-systems department and the successful candidates will be aligned to their strengths.
The Sub-systems department interfaces with several internal and external teams such as supply chain support and manufacturing engineering support (IPT). Direct support is provided to the Barrow site office and the shipbuilder through build and commissioning documentation and issue resolution. The department's work supports both the Dreadnought and SSNA programmes.
Why Rolls-Royce?
Rolls-Royce is one of the most enduring and iconic brands in the world and has been at the forefront of innovation for over a century. We design, build and service systems that provide critical power to customers where safety and reliability are paramount. We are proud to be a force for progress, powering, protecting and connecting people everywhere. We want to ensure that the excellence and ingenuity that has shaped our history continues into our future and we need people like you to come and join us on this journey.
What we offer
We offer excellent development opportunities, a competitive salary, and exceptional benefits. These include bonus, employee support assistance and employee discounts. Your needs are as unique as you are. Hybrid working is a way in which our people can balance their time between the office, home, or another remote location. It's a locally managed and flexed informal discretionary arrangement. As a minimum we're all expected to attend the workplace for collaboration and other specific reasons, on average three days per week.
What you will be doing:
You will be responsible for designing the C&I, power systems control and sensors products that support, monitor and protect the reactor plant. Additionally, you will be responsible for integrating the various sub-systems, defining commissioning strategy/documentation and supporting external teams/vendors. Candidates will be aligned to their strengths against the four areas listed.
Additionally, you will be:
Specifying product level requirements and working with vendors to ensure these have been met
Verifying the final product against the original design requirements
Managing the requirements, including traceability, through the product maturity gates
Reviewing and approving design intent documentation
Supporting shipbuilder/vendor build and commissioning issue resolution/documentation
Who we are looking for:
At Rolls-Royce we put safety first, do the right thing, keep it simple and make a difference. These principles form the behaviours that guide us and are an essential component of our assessment process. They are the fundamental qualities that we seek for all roles.
Qualified to degree level or equivalent in an electrical and electronics systems engineering discipline
Member of a related professional engineering institution (e.g. The IET)
Experience in architecture design and V&V, including requirements capture/analysis methods
Background in power/control electronics, sensor design or systems engineering to enable an intelligent customer relationship
One or more of the below is desired:
Experience of using requirements management tools (e.g. DOORS)
Proactive and autonomous individual who can work the supply chain and various disciplines
Knowledge of NSRP electrical systems
We are an equal opportunities employer. We're committed to developing a diverse workforce and an inclusive working environment. We believe that people from different backgrounds and cultures give us different perspectives which are crucial to innovation and problem solving. We believe the more diverse perspectives we have, the more successful we'll be. By building a culture of caring and belonging, we give everyone who works here the opportunity to realise their full potential. We welcome applications from people with a refugee background. You can learn more about our global Inclusion strategy at Our people Rolls-Royce.
To work for the Rolls-Royce Submarines business an individual has to hold a Security Check clearance. Rolls-Royce will support the application for Security Clearance if you do not currently already have this in place. Due to the nature of work the business conducts and the protection of certain assets we can only progress applications from individuals who are a UK national or, in MoD approved cases, a dual national.
Job Category: Software Systems
Posting Date: 22 Jul 2025; 00:07
03/10/2025
Full time
Job Description Electrical Control and Instrumentation Systems Engineer Full time Derby, Onsite with flexible working (In office 3 days a week, WFH 2 days a week) Multiple exciting opportunities have arisen for an Electrical Control and Instrumentation Systems Engineer to work on the new generation of Submarine EC&I, Dreadnought and SSNA. The EC&I Sub-system teams are responsible for the end-to-end design of the reactor C&I, integration, commissioning, power systems control and sensors. The lifecycle of design work ranges from product concept, detailed design, V&V, production and build and commissioning support. There are opportunities in various areas of the Sub-systems department and the successful candidates will be aligned to their strengths. The Sub-systems department interface with several internal and external teams such as supply chain support, manufacturing engineering support (IPT). Direct support is provided to the Barrow site office and the shipbuilder through build and commissioning documentation and issue resolution. The departments work supports both the Dreadnought and SSNA programmes. Why Rolls-Royce? Rolls-Royce is one of the most enduring and iconic brands in the world and has been at the forefront of innovation for over a century. We design, build and service systems that provide critical power to customers where safety and reliability are paramount. We are proud to be a force for progress, powering, protecting and connecting people everywhere. We want to ensure that the excellence and ingenuity that has shaped our history continues into our future and we need people like you to come and join us on this journey. What we offer We offer excellent development opportunities, a competitive salary, and exceptional benefits. These include bonus, employee support assistance and employee discounts. Your needs are as unique as you are. Hybrid working is a way in which our people can balance their time between the office, home, or another remote location. 
It's a locally managed and flexed informal discretionary arrangement. As a minimum we're all expected to attend the workplace for collaboration and other specific reasons, on average three days per week. What you will be doing: You will be responsible for designing the C&I, power systems control and sensors products that support, monitor and protect the reactor plant. Additionally you will be responsible for integrating the various sub-systems, defining commissioning strategy/documentation and supporting external teams/vendors. Candidates will be aligned to their strengths against the fours areas listed . Additionally, you will be: Specifying product level requirements and working with vendors to ensure these have been metVerifying the final product against the original design requirementsManaging the requirements, including traceability, through the product maturity gatesReviewing and approving design intent documentationSupport shipbuilder/vendor build and commissioning issue resolution/documentation. Who we are looking for: At Rolls-Royce we put safety first, do the right thing, keep it simple and make a difference. These principles form the behaviours that guide us and are an essential component of our assessment process. They are the fundamental qualities that we seek for all roles. Qualified to degree level or equivalent in an electrical and electronics systems engineering disciplineMember or a related professional engineering institution (eg The IET),Experience in architecture design and V&V, including requirements capture/analysis methodsBackground in power/control electronics, sensor design or systems engineering to enable an intelligent customer relationship One or more of the below is desired: Experience of using requirements management tools (eg DOORS)Proactive and automatous individual who can work the supply chain and various disciplines.Knowledge of NSRP electrical systems We are an equal opportunities employer. 
We're committed to developing a diverse workforce and an inclusive working environment. We believe that people from different backgrounds and cultures give us different perspectives which are crucial to innovation and problem solving. We believe the more diverse perspectives we have, the more successful we'll be. By building a culture of caring and belonging, we give everyone who works here the opportunity to realise their full potential. We welcome applications from people with a refugee background. You can learn more about our global Inclusion strategy at Our people Rolls-Royce To work for the Rolls-Royce Submarines business an individual has to hold a Security Check clearance. Rolls-Royce will support the application for Security Clearance if you do not currently already have this in place. Due to the nature of work the business conducts and the protection of certain assets we can only progress applications from individuals who are a UK national or, in MoD approved cases, a dual national. Job Category Software Systems Posting Date 22 Jul 2025; 00:07 Posting End Date PandoLogic.
Job Description Electrical Control and Instrumentation Systems Engineer Full time Derby, Onsite with flexible working (In office 3 days a week, WFH 2 days a week) Multiple exciting opportunities have arisen for an Electrical Control and Instrumentation Systems Engineer to work on the new generation of Submarine EC&I, Dreadnought and SSNA. The EC&I Sub-system teams are responsible for the end-to-end design of the reactor C&I, integration, commissioning, power systems control and sensors. The lifecycle of design work ranges from product concept, detailed design, V&V, production and build and commissioning support. There are opportunities in various areas of the Sub-systems department and the successful candidates will be aligned to their strengths. The Sub-systems department interface with several internal and external teams such as supply chain support, manufacturing engineering support (IPT). Direct support is provided to the Barrow site office and the shipbuilder through build and commissioning documentation and issue resolution. The departments work supports both the Dreadnought and SSNA programmes. Why Rolls-Royce? Rolls-Royce is one of the most enduring and iconic brands in the world and has been at the forefront of innovation for over a century. We design, build and service systems that provide critical power to customers where safety and reliability are paramount. We are proud to be a force for progress, powering, protecting and connecting people everywhere. We want to ensure that the excellence and ingenuity that has shaped our history continues into our future and we need people like you to come and join us on this journey. What we offer We offer excellent development opportunities, a competitive salary, and exceptional benefits. These include bonus, employee support assistance and employee discounts. Your needs are as unique as you are. Hybrid working is a way in which our people can balance their time between the office, home, or another remote location. 
It's a locally managed and flexed informal discretionary arrangement. As a minimum we're all expected to attend the workplace for collaboration and other specific reasons, on average three days per week. What you will be doing: You will be responsible for designing the C&I, power systems control and sensors products that support, monitor and protect the reactor plant. Additionally you will be responsible for integrating the various sub-systems, defining commissioning strategy/documentation and supporting external teams/vendors. Candidates will be aligned to their strengths against the four areas listed. Additionally, you will be: Specifying product level requirements and working with vendors to ensure these have been met; Verifying the final product against the original design requirements; Managing the requirements, including traceability, through the product maturity gates; Reviewing and approving design intent documentation; Supporting shipbuilder/vendor build and commissioning issue resolution/documentation. Who we are looking for: At Rolls-Royce we put safety first, do the right thing, keep it simple and make a difference. These principles form the behaviours that guide us and are an essential component of our assessment process. They are the fundamental qualities that we seek for all roles. Qualified to degree level or equivalent in an electrical and electronics systems engineering discipline; Member of a related professional engineering institution (e.g. The IET); Experience in architecture design and V&V, including requirements capture/analysis methods; Background in power/control electronics, sensor design or systems engineering to enable an intelligent customer relationship. One or more of the below is desired: Experience of using requirements management tools (e.g. DOORS); Proactive and autonomous individual who can work with the supply chain and various disciplines; Knowledge of NSRP electrical systems. We are an equal opportunities employer. 
We're committed to developing a diverse workforce and an inclusive working environment. We believe that people from different backgrounds and cultures give us different perspectives which are crucial to innovation and problem solving. We believe the more diverse perspectives we have, the more successful we'll be. By building a culture of caring and belonging, we give everyone who works here the opportunity to realise their full potential. We welcome applications from people with a refugee background. You can learn more about our global Inclusion strategy at Our people Rolls-Royce To work for the Rolls-Royce Submarines business an individual has to hold a Security Check clearance. Rolls-Royce will support the application for Security Clearance if you do not currently already have this in place. Due to the nature of work the business conducts and the protection of certain assets we can only progress applications from individuals who are a UK national or, in MoD approved cases, a dual national. Job Category Software Systems Posting Date 22 Jul 2025; 00:07 Posting End Date PandoLogic.
01/10/2025
Full time
About the role: Join Our Team at Holland & Barrett! Are you passionate about cloud security and looking to make a significant impact? Holland & Barrett is seeking a Cloud Security Specialist to help us define and implement our cloud security strategy. If you're an experienced professional eager to work with cutting-edge technology and collaborate with diverse teams, we want to hear from you! Key Responsibilities: Security Strategy: Help define and execute the Holland & Barrett cloud security strategy, partnering with platform and Site Reliability Engineering (SRE) teams to build robust infrastructure that supports our business. Perimeter Security: Establish platform perimeter security by implementing controls at ingress and egress points, including creating and maintaining an edge network with a Web Application Firewall (WAF), Distributed Denial of Service (DDoS) protection, and a Content Delivery Network (CDN). Access Control: Establish an access control baseline focusing on the principle of least privilege and segregation of duties. Monitor and enforce these controls once roles and permissions are set. Security Controls: Design, implement, and maintain security controls to prevent, detect, and remediate insecure configurations, including defining and disseminating secure AWS/infrastructure baselines. Standards Development: Own the development and maintenance of tailored security standards and guidelines, creating reusable resources for various development teams. AWS Security Services: Establish and manage AWS security services, including certificate authorities, encryption services, insecure configuration scanners, and security control canaries. Key requirements: Essential: 5+ years of experience in cloud security, particularly with AWS, and at least 2+ years in software development. Strong understanding of cloud and application security concepts, including secure coding practices, threat modeling, vulnerability management, and access control mechanisms. 
Experience with AWS, Kubernetes, Service Mesh, API gateways, and API Security (authentication and authorization). Proficiency in programming languages and infrastructure-as-code tools such as Python, JavaScript, GoLang, Terraform, CloudFormation (AWS), and AWS CDK. Familiarity with Agile methodologies like SCRUM, along with proven project management skills to manage multiple security projects effectively. Desired: Ability to work independently, take initiative, and maintain a keen attention to detail, ensuring high security standards. Strong communication and interpersonal skills, facilitating effective collaboration with both technical and non-technical teams. Why Holland & Barrett? At Holland & Barrett, we are dedicated to promoting health and well-being while ensuring the highest standards of cloud security. Join our team and be part of a company that values innovation and security. Ready to Make an Impact? If you're excited about cloud security and want to contribute to a secure future, apply now! We look forward to welcoming you to our team. We support the flexibility and productivity of our employees through hybrid working arrangements. Although your role will be based in London (or Nuneaton, or Amsterdam), you will be required to travel only occasionally to our Hubs in Nuneaton or London or to any other location of H&B. What we offer: Pension company contribution = 3%. Incentive scheme up to 10% of annual salary, based on company performance. Your wellbeing is paramount so you can get away and take 33 Days Holiday per year. Private Medical Care (Self after 1 year). Learning and Development opportunity with Holland & Barrett is a great base for career development long term. Career progression. Refer and Earn Scheme - as we're growing you can earn money by referring people to join us from your network. Epic Extras gives you access to exclusive benefits, free advice and savings from a range of retailers and providers. 
Stay healthy with Discounted Products - from day one you'll get a 25% discount (on top of other promotions) when you shop at H&B on anything that you buy. We all need a little help sometimes, so we offer Free 24/7 Confidential Advice & Colleague Welfare. Mental Health First Aiders - we have lots of qualified Mental Health First Aiders because it's all about your health & wellbeing. Stay active in the Onsite Gym at our Nuneaton Hub! We have colleague Reward and Recognition Schemes, so your hard work and loyalty won't go unnoticed. And many more! We're passionate about helping every colleague thrive across all dimensions of wellbeing, and we're committed to having a diverse and inclusive workplace. In line with our EPIC values (Expertise, Pioneering, Inclusive, Caring), we embrace and actively celebrate all our colleagues' unique and varying experiences, backgrounds, identities and cultures - I am me, we are H&B. Holland & Barrett does not accept unsolicited resumes from search firms/recruiters. Please do not forward resumes to our job alias, employees, or any other company location. Holland & Barrett is not and will not be responsible for any fees if a candidate is submitted by a search firm/recruiter, unless otherwise agreed with respect to specific open position(s).
01/10/2025
Full time
LA International Computer Consultants Ltd
Wokingham, Berkshire
Our client is looking for a Principal Site Reliability Engineer to join their team on an initial three month contract with good scope for extension. Candidates must be able to attend site in Wokingham twice a week, with the rest of the week remote. This role is Inside IR35 and requires an active SC clearance. Role Description: Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ). Conduct security reviews and implement controls aligned with national infrastructure standards. Mentor junior engineers and promote SRE best practices. Collaborate with vendors and IT teams for incident resolution and platform improvements. Required Skills: Strong communication skills (written and verbal). Experience in remote team collaboration. Deep expertise in OpenShift/Kubernetes and RedHat Linux. Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize). Experience with CI/CD automation and IaC strategies. Security-first mindset with experience in regulated environments. Experience with VMware vSphere virtualization. Due to the nature and urgency of this post, candidates holding or who have held high level security clearance in the past are most welcome to apply. Please note successful applicants will be required to be security cleared prior to appointment, which can take a minimum of 10 weeks. 
LA International is an HMG-approved ICT Recruitment and Project Solutions Consultancy, operating globally from the largest single site in the UK as an IT Consultancy or as an Employment Business & Agency depending upon the precise nature of the work. For security cleared jobs or non-clearance vacancies, LA International welcomes applications from all sections of the community and from people with diverse experience and backgrounds. Award Winning LA International, winner of the Recruiter Awards for Excellence, Best IT Recruitment Company, Best Public Sector Recruitment Company and overall Gold Award winner, has now secured the most prestigious business award that any business can receive, The Queen's Award for Enterprise: International Trade, for the second consecutive period.
01/10/2025
Contractor
LA International Computer Consultants Ltd
Wokingham, Berkshire
Our client is looking for a number of hands-on Site Reliability Engineers to join their team on an initial three month contract with good scope for extension. Candidates must be able to attend site in Wokingham twice a week, with the rest of the week remote. This role is Inside IR35 and needs active SC clearance. Role Description: Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ). Conduct security reviews and implement controls aligned with national infrastructure standards. Mentor junior engineers and promote SRE best practices. Collaborate with vendors and IT teams for incident resolution and platform improvements. Required Skills: Strong communication skills (written and verbal). Experience in remote team collaboration. Deep expertise in OpenShift/Kubernetes and RedHat Linux. Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize). Experience with CI/CD automation and IaC strategies. Security-first mindset with experience in regulated environments. Experience with VMware vSphere virtualization. Due to the nature and urgency of this post, candidates holding or who have held high level security clearance in the past are most welcome to apply. Please note successful applicants will be required to be security cleared prior to appointment, which can take a minimum of 10 weeks. 
01/10/2025
Contractor
We are a global recruitment specialist that provides support to clients across EMEA, APAC, US and Canada. We have an excellent job opportunity for you. Location: Wokingham (Reading) | Hybrid - 60% remote and 40% onsite. Duration: 30/01/2026 - possible extension. CONTRACTOR MUST HOLD ACTIVE SC CLEARANCE. Role Description: Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ). Conduct security reviews and implement controls aligned with national infrastructure standards. Mentor junior engineers and promote SRE best practices. Collaborate with vendors and IT teams for incident resolution and platform improvements. Required Skills: Strong communication skills (written and verbal). Experience in remote team collaboration. Deep expertise in OpenShift/Kubernetes and RedHat Linux. Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize). Experience with CI/CD automation and IaC strategies. Security-first mindset with experience in regulated environments. Experience with VMware vSphere virtualization. If you are interested in this position and would like to learn more, please send through your CV and we will get in touch with you as soon as possible. Please note, candidates are often shortlisted within 48 hours.
01/10/2025
Contractor
Job Title: Splunk Site Reliability Engineer/Migration Specialist (Contract). Location: Birmingham (Hybrid/On-site, required 3 days per week). Contract Type: Contract. Duration: 3 months rolling. Job Summary: We are seeking an experienced Splunk SME/Migration Specialist to lead and support the migration of observability workloads from Splunk to Elasticsearch (ELK Stack). The ideal candidate will bring hands-on expertise in Splunk architecture, data ingestion, alerting, and dashboarding, along with experience migrating workloads to Elasticsearch. In addition to migration duties, the candidate will maintain and enhance existing Splunk infrastructure, provide incident support, manage upgrades, and ensure observability platforms remain secure and performant. This role demands a technically strong individual with excellent stakeholder communication and problem-solving skills. Key Responsibilities: Migration: Develop and implement a comprehensive migration strategy from Splunk to Elasticsearch (ELK Stack). Assess existing Splunk configurations (dashboards, alerts, saved searches, data models) and recreate them in Kibana. Collaborate with Elastic teams to configure alerting and monitoring using Kibana, Elasticsearch Watcher, or third-party tools. Ensure migration plans include validation, rollback procedures, and knowledge transfer. Platform Operations & Incident Response: Maintain Splunk infrastructure in both Production and Non-Production environments. Support Splunk SRE and Application teams in incident investigation and resolution. Proactively monitor system health and performance metrics. Upgrades and Change Management: Plan and execute upgrades to Splunk components. Perform pre- and post-upgrade checks and validations. Prepare documentation and submit Change Requests following organizational procedures. Security and Compliance: Work with Puppet and other automation tools to ensure timely patching of vulnerabilities.
Implement and verify security best practices for observability platforms. Support compliance initiatives and audits. Documentation and Knowledge Sharing: Maintain accurate and up-to-date technical documentation, including architecture diagrams, configurations, procedures, and troubleshooting guides. Review and update support articles and take ownership of relevant assets. Support knowledge transfer across teams as needed. Troubleshooting and Support: Identify and resolve issues in Splunk and ELK environments. Assist teams with Splunk-related queries and optimization efforts. Skills and Qualifications: Essential: Proven expertise with Splunk architecture, data ingestion, dashboarding, alerting, and administration. Experience migrating Splunk workloads to Elasticsearch (ELK Stack). Solid understanding of Kibana, Elasticsearch Watcher, and observability tooling. Proficiency in Linux/Unix systems and networking protocols. Hands-on experience with scripting (e.g., Python, Shell/Bash). Experience supporting or working alongside DevOps/SRE teams. Strong analytical, troubleshooting, and communication skills. Desirable: Experience with containerized environments such as Docker or Kubernetes. Industry certifications such as Splunk Certified Power User/Admin/Architect. Knowledge of automation tools (e.g., Puppet, Ansible). Bachelor's degree in Computer Science, Information Systems, or a related field. Key Attributes: Independent and proactive problem-solver. Collaborative and able to work cross-functionally with infrastructure, security, and application teams. Able to work under pressure and prioritize tasks effectively. Strong communicator, both written and verbal.
04/09/2025
Contractor
Job Description: Electrical Control and Instrumentation Systems Engineer. Full time. Derby, onsite with flexible working (in office 3 days a week, WFH 2 days a week). Multiple exciting opportunities have arisen for an Electrical Control and Instrumentation Systems Engineer to work on the new generation of Submarine EC&I, Dreadnought and SSNA. The EC&I Sub-system teams are responsible for the end-to-end design of the reactor C&I, integration, commissioning, power systems control and sensors. The design lifecycle ranges from product concept through detailed design, V&V and production to build and commissioning support. There are opportunities in various areas of the Sub-systems department, and the successful candidates will be aligned to their strengths. The Sub-systems department interfaces with several internal and external teams, such as supply chain support and manufacturing engineering support (IPT). Direct support is provided to the Barrow site office and the shipbuilder through build and commissioning documentation and issue resolution. The department's work supports both the Dreadnought and SSNA programmes. Why Rolls-Royce? Rolls-Royce is one of the most enduring and iconic brands in the world and has been at the forefront of innovation for over a century. We design, build and service systems that provide critical power to customers where safety and reliability are paramount. We are proud to be a force for progress, powering, protecting and connecting people everywhere. We want to ensure that the excellence and ingenuity that has shaped our history continues into our future, and we need people like you to come and join us on this journey. What we offer: We offer excellent development opportunities, a competitive salary, and exceptional benefits. These include bonus, employee support assistance and employee discounts. Your needs are as unique as you are. Hybrid working is a way in which our people can balance their time between the office, home, or another remote location.
It's a locally managed and flexed informal discretionary arrangement. As a minimum we're all expected to attend the workplace for collaboration and other specific reasons, on average three days per week. What you will be doing: You will be responsible for designing the C&I, power systems control and sensors products that support, monitor and protect the reactor plant. Additionally, you will be responsible for integrating the various sub-systems, defining commissioning strategy/documentation and supporting external teams/vendors. Candidates will be aligned to their strengths against the four areas listed. Additionally, you will be: specifying product-level requirements and working with vendors to ensure these have been met; verifying the final product against the original design requirements; managing the requirements, including traceability, through the product maturity gates; reviewing and approving design intent documentation; supporting shipbuilder/vendor build and commissioning issue resolution and documentation. Who we are looking for: At Rolls-Royce we put safety first, do the right thing, keep it simple and make a difference. These principles form the behaviours that guide us and are an essential component of our assessment process. They are the fundamental qualities that we seek for all roles. Qualified to degree level or equivalent in an electrical and electronics systems engineering discipline; member of a related professional engineering institution (e.g. the IET); experience in architecture design and V&V, including requirements capture/analysis methods; background in power/control electronics, sensor design or systems engineering to enable an intelligent customer relationship. One or more of the below is desired: experience of using requirements management tools (e.g. DOORS); a proactive and autonomous individual who can work with the supply chain and various disciplines; knowledge of NSRP electrical systems. We are an equal opportunities employer.
We're committed to developing a diverse workforce and an inclusive working environment. We believe that people from different backgrounds and cultures give us different perspectives which are crucial to innovation and problem solving. We believe the more diverse perspectives we have, the more successful we'll be. By building a culture of caring and belonging, we give everyone who works here the opportunity to realise their full potential. We welcome applications from people with a refugee background. You can learn more about our global Inclusion strategy at Our people, Rolls-Royce. To work for the Rolls-Royce Submarines business, an individual has to hold a Security Check clearance. Rolls-Royce will support the application for Security Clearance if you do not already have this in place. Due to the nature of the work the business conducts and the protection of certain assets, we can only progress applications from individuals who are a UK national or, in MoD-approved cases, a dual national. Job Category: Software Systems. Posting Date: 22 Jul 2025; 00:07.
02/09/2025
Full time