it job board logo
  • Home
  • Find IT Jobs
  • Register CV
  • Career Advice
  • Contact us
  • Employers
    • Register as Employer
    • Pricing Plans
  • Recruiting? Post a job
  • Sign in
  • Sign up
Sorry, that job is no longer available. Here are some results that may be similar to the job you were looking for.

12 jobs found

Current Search
principal sre site reliability
Principal Software Engineer
PEXA Group Limited Leeds, Yorkshire
We're looking for a Principal Engineer to help shape the future of engineering at PEXA UK. In this technical leader role you will work closely with the UK CTO and engineering leadership team to define technical strategy, drive architectural decisions, champion innovation, and build the engineering capabilities that support our next phase of growth.

What You'll Be Doing

Technical Strategy & Architecture
  • Define and execute engineering strategy across multiple teams and domains.
  • Drive architectural decisions that balance business priorities, scalability, security and long-term sustainability.
  • Design solutions that support future growth and evolving customer needs.
  • Identify and address technical risks across the organisation.

Engineering Excellence
  • Establish and evolve engineering standards, best practices and quality frameworks.
  • Drive improvements in performance, reliability, security and developer productivity.
  • Champion modern software engineering principles and continuous improvement.
  • Support the evolution of platform, infrastructure and service architecture.

AI & Emerging Technology
  • Lead the evaluation and adoption of emerging technologies and AI capabilities.
  • Drive the integration of AI tools and practices that enhance engineering effectiveness.
  • Explore and implement modern AI-powered development workflows, coding assistants and agentic systems.
  • Help shape PEXA's approach to AI innovation and organisational adoption.

Leadership & Mentorship
  • Mentor and coach engineers across all levels, from emerging talent to senior technical leaders.
  • Foster a culture of technical excellence, curiosity and continuous learning.
  • Lead technical discussions, knowledge-sharing initiatives and engineering communities of practice.
  • Contribute to hiring and technical assessment processes.

Collaboration & Business Impact
  • Partner with Product, Design and Business leaders to align technical strategy with organisational goals.
  • Translate business objectives into technical roadmaps and engineering outcomes.
  • Influence decision making at executive level through clear communication and technical insight.
  • Represent PEXA externally within the technology community and industry forums.

About You
You'll bring significant experience designing and delivering large-scale distributed systems, a track record of defining technical strategy and driving organisation-wide engineering outcomes, and strong experience working across multiple teams, platforms and stakeholders. You have a proven ability to balance technical excellence with commercial priorities and communicate effectively at all levels of the organisation.

Technical Expertise
  • Node.js, Python and Kotlin
  • Modern frontend technologies
  • Cloud platforms such as AWS and Azure
  • Microservices and distributed architectures
  • Containerisation and cloud-native development
  • Security architecture and secure software development practices
  • Performance, scalability and reliability engineering
  • Strong interest in the evolving AI landscape, including AI-assisted software development tools, agentic AI frameworks and modern approaches to context engineering and AI-enhanced developer productivity.

Nice to Have
  • Experience within financial services, property technology or other regulated industries.
  • Exposure to DevSecOps, Site Reliability Engineering (SRE) or Platform Engineering environments.
  • Knowledge of AI frameworks and technologies such as Anthropic Claude, LangGraph, Hugging Face, LiteLLM or Pydantic AI.

£100,000 - £120,000 a year

Benefits
  • Annual Leave: 25 days per year (increasing to 27 days after 5 years), plus Bank Holidays.
  • Annual Leave Purchase: Option to purchase up to 1 working week of additional leave.
  • Wellness Days: 4 paid days per year to rest and recharge.
  • Paid Volunteer Day: 1 paid day per year to volunteer at a charity of your choice.
  • Pension scheme (Aviva): 6% employer contribution, 4% employee contribution.
  • Life Assurance: 4 x basic salary.
  • Health Flex Pot via Benifex: £800 flexible benefit to spend on health options of your choice.
  • Additional Benifex perks: voluntary opt-in for benefits such as Activity Pass, Perkbox discounts, Beauty and Fitness Discount Card, Cycle to Work, Motor Breakdown, Will Writing, or enhanced coverage of benefits provided by the company.
  • Critical Illness Cover: 1 x basic salary.
  • Group-wide Equity Plan: Eligibility for participation in the group-wide employee share plan, subject to annual invitation.
  • Enhanced Maternity/Adoption Pay: Up to 20 weeks full pay.
  • Enhanced Paternity Pay: Up to 4 weeks full pay.
  • Performance-based Incentive Scheme: Eligibility for a discretionary Short Term Incentive plan, based on job role and level.
  • Learning & Development: Access to LinkedIn Learning and reimbursement of professional membership costs.
  • Additional employee assistance programme and wellbeing resources, eye test vouchers and VDU glasses contribution.
  • Office Space: Use of PEXA office workspaces in Leeds and Thame.
07/06/2026
Full time
Principal Cloud Engineer (AWS) (Multiple)
Cloudscaler
Principal Cloud Engineer (AWS) - (Multiple)
Central London (Hybrid - 3 days onsite)
£95,000 - £130,000 + bonus + great benefits

Defence/SC Clearance Requirements
SC clearance is required for our customer-facing teams. You do not need active SC clearance as we will undertake checks upon you joining, but you must be eligible to pass SC clearance. DV clearance may be required for certain roles, so the ability/willingness to pass DV clearance is advantageous but not essential.

Who are we
We are AWS experts helping major public and private sector organisations unlock the power of their platform. From landing zones to cloud operations and risk mitigation, we're at the forefront of cloud transformation - and we're looking for a Principal Cloud Engineer (AWS) to join our team.

What you'll do
  • Design & build enterprise-scale AWS landing zones
  • Automate complex AWS platforms with Terraform
  • Package services into AMIs, Docker, or Serverless
  • Define and lead technical strategy with a strong view on IaC modularisation
  • Work directly with customers, driving cloud transformations
  • Apply site reliability best practices and help shape mature security models

What we're looking for
  • Deep AWS expertise, especially landing zones
  • Strong hands-on Terraform and automation skills
  • Clear opinions on IaC structure and cloud architecture
  • Experience working in enterprise-scale AWS environments
  • Proven experience leading cross-functional teams and mentoring engineers
  • Strong hands-on AWS knowledge, with the ability to choose the right service for the job
  • Deep expertise in landing zones, multi-account, and multi-tenant AWS setups
  • Experience operating cloud platforms at scale, ideally in regulated environments
  • Solid grasp of secure, scalable, and resilient AWS architecture
  • Strong understanding of AWS security risks and how to prevent, detect, and remediate them
  • Proficient in Infrastructure as Code (IaC) using Terraform (or CloudFormation)
  • Confident advising on IaC structure and modularisation
  • Awareness of SRE principles and operational priorities
  • Experience with CI/CD pipelines
  • Strong system design skills

Why join us?
  • Industry-leading AWS expert team - As an AWS Advanced & Well-Architected Partner, we build secure, scalable cloud platforms for major enterprises
  • People-first, inclusive culture - We thrive on trust, openness, and collaboration, with regular socials, quarterly company days and town halls, and we are a committed signatory to the Tech Talent Charter
  • Tailored career growth - Enjoy individual development plans, sponsored training and AWS certifications, transparent promotion tracks and 5 training days per year
  • Comprehensive, flexible benefits - Bonus, 25 days annual leave + 5 days training or voluntary leave days, private GP, life/disability cover, Cycle to Work, bike storage, dog-friendly offices, and remote/hybrid working
  • High-impact projects & expert community - Operate at the cutting edge designing cloud solutions in regulated industries, working with CXOs, and shaping the future of cloud, supported by a network of bright, passionate AWS professionals

Additional Perks
  • Discretionary bonus
  • Discretionary security clearance bonus for those holding certain clearance
  • Discretionary utilisation bonus
  • Periodic offer of share options schemes
  • 25 days' annual leave
  • 5 additional days per year towards training, certifications, or charity work
  • Option to buy additional annual leave up to 5 days per year
  • Public holidays opt-out scheme: the option to work on public holidays, creating the flexibility to enjoy your time off when it suits you
  • Certifications and training expensed
  • Life Assurance
  • Long Term Disability cover
  • Employee Assist Programme for employee advice and support (including legal and counselling helpline)
  • Health, Mental Health, Wellbeing, Financial and Legal support
  • 24/7 GP access
  • Pension auto-enrolment and contribution
  • Employee referral scheme
  • Client referral scheme
  • Cycle to work scheme
  • Travel expenses policy

Interview Process
  • Chat with Talent Team
  • Remote interview with Hiring Manager
  • Technical interview (remote)
  • Final in person with Leadership Team

Cloudscaler are proud to be an equal opportunity employer, committed to equal opportunities regardless of gender identity, sexual orientation, race, ancestry, age, marital status, disability, parental status, religion or medical history. If you require reasonable adjustments during the recruitment process or within the workplace, please let us know when you speak to our Talent Acquisition team or contact us at the earliest opportunity.
06/06/2026
Full time
Platform Lead - UK
PLP Group
Platform Lead - UK
Job details
NEXT GATE TECH LIMITED
Full-time

About Next Gate Tech
At Next Gate Tech, we create technologies that reshape the landscape of the fund industry operations.

We empower our clients by capturing the full potential of harmonized data to drive intelligent and fully automated operations. Our transformative solutions optimize processes, enhance efficiency, reduce risks, and drive cost savings for our clients.

Driven by our commitment to innovation, our intelligence layer extracts invaluable insights, employs advanced pattern analysis spotting anomalies, and uncovers hidden links within the data.

Our modular, one-stop-shop, SaaS platform seamlessly ingests diverse datasets, creating a harmonized and enriched source of portfolios, transactions, and accounting data. This robust foundation fuels the platform to generate powerful signals through intelligent analytics, empowering a multitude of use cases.

Next Gate Tech is not just a part of the industry's evolution - it is a driving force behind it.

Learn more about us: Our story, values, mission and team: Our unified platform and technology: Our solutions and use cases:

About the Role
As a key senior technical leader, the Platform Lead will own and scale the technology and infrastructure that powers our financial technology platform. This role is critical to building the technical foundation that enables new products, supports rapid growth, and upholds the trust expected in financial services. They will design a resilient architecture that accelerates product development, and deliver exceptional reliability as we grow.

Partnering closely with product and engineering teams, they will combine hands-on building with strategic technical leadership to ensure our platform remains secure, scalable, and fully compliant with financial-industry standards.

Responsibilities
  • Develop and communicate a clear, multi-year technical vision and strategic roadmap for the core platform (including infrastructure, data services, internal developer tools, and security)
  • Lead the technical evaluation and decision-making process for platform technologies, balancing in-house development with best-in-class third-party solutions
  • Drive a Site Reliability Engineering (SRE) culture, ensuring high availability, low latency, and robust disaster recovery capabilities
  • Manage and optimize our cloud infrastructure, focusing on Infrastructure-as-Code (e.g., Terraform), containerization (e.g., Kubernetes), and cost optimization
  • Champion an outstanding Internal Developer Platform (IDP) and developer experience, providing tools, automation, and documentation that accelerate feature delivery for the product team

Qualifications
  • 7+ years of experience building software, DevOps, or platform/infrastructure systems
  • 3+ years in a senior leadership or Staff/Principal Engineer role, driving key technical projects
  • Direct experience in implementing cost management and optimization strategies within a cloud environment, resulting in demonstrable savings
  • Proven experience working in a highly regulated industry, ideally fintech, banking, or payments, with a deep understanding of security and compliance requirements
  • Expert-level knowledge of modern cloud architecture (e.g. Microservices, Event-Driven Architecture, Serverless), CI/CD pipelines, and cloud provider platforms
  • Strong hands-on experience with container orchestration (e.g., Kubernetes), Infrastructure-as-Code (e.g., Terraform), Python, and Observability tools (e.g., Prometheus, Grafana, Datadog, Sentry)

Benefits
  • 26 vacation days + 2 duvet days, so you can truly recharge and enjoy life
  • Comprehensive health and dental care coverage
  • Central location for both offices with a fully stocked kitchen, including healthy snacks, and fresh fruit
  • Freedom to create your own entrepreneurial experience by being part of a team in search of excellence
  • Professional Development programs to expand your skills, and maximize your potential on the frontier of financial innovation

Next Gate Tech is an equal opportunity employer. We believe our team's unique life experiences, backgrounds, cultures, beliefs and abilities add richness to our culture and depth to our ideas. Our ongoing commitment to diversity and inclusion creates an environment that supports, empowers and delivers a sense of belonging for all members of the team. Should you require any accommodation, please inform us and we will work with you to meet your accessibility needs.
05/06/2026
Full time
Platform Lead: Fintech Architecture & SRE
PLP Group
Platform Lead - UK
Job details
NEXT GATE TECH LIMITED
Full-time

About Next Gate Tech
At Next Gate Tech, we create technologies that reshape the landscape of the fund industry operations.

We empower our clients by capturing the full potential of harmonized data to drive intelligent and fully automated operations. Our transformative solutions optimize processes, enhance efficiency, reduce risks, and drive cost savings for our clients.

Driven by our commitment to innovation, our intelligence layer extracts invaluable insights, employs advanced pattern analysis spotting anomalies, and uncovers hidden links within the data.

Our modular, one-stop-shop, SaaS platform seamlessly ingests diverse datasets, creating a harmonized and enriched source of portfolios, transactions, and accounting data. This robust foundation fuels the platform to generate powerful signals through intelligent analytics, empowering a multitude of use cases.

Next Gate Tech is not just a part of the industry's evolution - it is a driving force behind it.

Learn more about us: Our story, values, mission and team: Our unified platform and technology: Our solutions and use cases:

About the Role
As a key senior technical leader, the Platform Lead will own and scale the technology and infrastructure that powers our financial technology platform. This role is critical to building the technical foundation that enables new products, supports rapid growth, and upholds the trust expected in financial services. They will design a resilient architecture that accelerates product development, and deliver exceptional reliability as we grow.

Partnering closely with product and engineering teams, they will combine hands-on building with strategic technical leadership to ensure our platform remains secure, scalable, and fully compliant with financial-industry standards.

Responsibilities
  • Develop and communicate a clear, multi-year technical vision and strategic roadmap for the core platform (including infrastructure, data services, internal developer tools, and security)
  • Lead the technical evaluation and decision-making process for platform technologies, balancing in-house development with best-in-class third-party solutions
  • Drive a Site Reliability Engineering (SRE) culture, ensuring high availability, low latency, and robust disaster recovery capabilities
  • Manage and optimize our cloud infrastructure, focusing on Infrastructure-as-Code (e.g., Terraform), containerization (e.g., Kubernetes), and cost optimization
  • Champion an outstanding Internal Developer Platform (IDP) and developer experience, providing tools, automation, and documentation that accelerate feature delivery for the product team

Qualifications
  • 7+ years of experience building software, DevOps, or platform/infrastructure systems
  • 3+ years in a senior leadership or Staff/Principal Engineer role, driving key technical projects
  • Direct experience in implementing cost management and optimization strategies within a cloud environment, resulting in demonstrable savings
  • Proven experience working in a highly regulated industry, ideally fintech, banking, or payments, with a deep understanding of security and compliance requirements
  • Expert-level knowledge of modern cloud architecture (e.g. Microservices, Event-Driven Architecture, Serverless), CI/CD pipelines, and cloud provider platforms
  • Strong hands-on experience with container orchestration (e.g., Kubernetes), Infrastructure-as-Code (e.g., Terraform), Python, and Observability tools (e.g., Prometheus, Grafana, Datadog, Sentry)

Benefits
  • 26 vacation days + 2 duvet days, so you can truly recharge and enjoy life
  • Comprehensive health and dental care coverage
  • Central location for both offices with a fully stocked kitchen, including healthy snacks, and fresh fruit
  • Freedom to create your own entrepreneurial experience by being part of a team in search of excellence
  • Professional Development programs to expand your skills, and maximize your potential on the frontier of financial innovation

Next Gate Tech is an equal opportunity employer. We believe our team's unique life experiences, backgrounds, cultures, beliefs and abilities add richness to our culture and depth to our ideas. Our ongoing commitment to diversity and inclusion creates an environment that supports, empowers and delivers a sense of belonging for all members of the team. Should you require any accommodation, please inform us and we will work with you to meet your accessibility needs.
04/06/2026
Full time
Platform Lead - UK Job details NEXT GATE TECH LIMITED Full-time About Next Gate Tech At Next Gate Tech, we create technologies that reshape the landscape of the fund industry operations. We empower our clients by capturing the full potential of harmonized data to drive intelligent and fully automated operations. Our transformative solutions optimize processes, enhance efficiency, reduce risks, and drive cost savings for our clients. Driven by our commitment to innovation, our intelligence layer extracts invaluable insights, employs advanced pattern analysis spotting anomalies, and uncovers hidden links within the data. Our modular, one-stop-shop, SaaS platform seamlessly ingests diverse datasets, creating a harmonized and enriched source of portfolios, transactions, and accounting data. This robust foundation fuels the platform to generate powerful signals through intelligent analytics, empowering a multitude of use cases. Next Gate Tech is not just a part of the industry's evolution - it is a driving force behind it. About the Role As a key senior technical leader, the Platform Lead will own and scale the technology and infrastructure that powers our financial technology platform. This role is critical to building the technical foundation that enables new products, supports rapid growth, and upholds the trust expected in financial services. They will design a resilient architecture that accelerates product development, and deliver exceptional reliability as we grow. Partnering closely with product and engineering teams, they will combine hands-on building with strategic technical leadership to ensure our platform remains secure, scalable, and fully compliant with financial-industry standards.
Cambridge University Press
Principal Developer & Team Lead
Cambridge University Press Cambridge, Cambridgeshire
Job Title: Principal Developer & Team Lead Salary: £51,400 - £68,800 Location: Cambridge/Hybrid with 40-60% of time in the office Contract: Permanent Hours: Full time 35 hours per week As a team lead with strong technical instincts you're ready to take the next step with more scope, more shaping of what gets built, more influence over how things are done but you're not ready to put down your tools. We are Cambridge University Press & Assessment, a world-leading academic publisher and assessment organisation and a proud part of the University of Cambridge. About the role You'll lead transitions taking place in our organisation: migrating legacy enterprise applications to cloud-native AWS architectures, while helping us establish two practices from close to scratch including Site Reliability Engineering and AI development. Alongside that broader technical leadership, you'll be the principal developer on a focused project within the programme, having deep ownership of a specific delivery, hands in the code, designing and building it end to end. You'll spend most of your time close to the code: setting direction, setting the standard in design and review, and writing code alongside the team. You will also be the line manager for a small team of developers. We're not looking for a fully formed people leader; we want someone who cares about the people around them and wants to grow into that part of the role. You'll have support and development to grow into it over time. The bulk of your focus is on technical work. You'll be the principal developer on a focused project within the migration programme, your own delivery to design, build and own. Around that, you'll lead the wider migration to AWS, build the DevOps automation and observability that lets SRE practices take hold, and establish the standards for how we use AI responsibly in education products. 
You'll set the technical bar through code reviews, design conversations and your own contributions, not from a distance. You'll deliver in agile squads alongside architects, product owners, technical leads, SREs and infrastructure teams, and you'll be the technical voice in stakeholder conversations about what's possible. This position has been classified as a hybrid role, requiring the selected candidate to typically spend 40-60% of their time collaborating and connecting face to face at a dedicated location. Aside from our hybrid principles, other flexible working requests will be considered from the first day of employment, including other work arrangements should you require adjustments due to a disability or long term health condition. About You A current or recent team lead ready to step up technically. The bar that matters most: You've led developers before, formally or informally, and people have grown around you You're fluent in two or more modern languages and you still write code regularly You've worked with AWS (or an equivalent cloud) in depth, not just touched it You understand CI/CD, infrastructure as code, and what observability actually means in production You can hold a conversation about event driven architecture, microservices and security in cloud environments at a level beyond the textbook You communicate clearly with engineers and non engineers, and you're open to growing the people leadership side of the role with support Desired criteria: Hands on exposure to AI/ML in production systems Experience helping establish SRE or observability practice early on A track record of modernising legacy systems without breaking them What this role offers Genuine scope from the start-you'll be shaping the SRE function and AI practice, not inheriting someone else's blueprint, while owning a focused project as principal developer that keeps you firmly in the code. A small team to lead and learn from. 
A leadership team that expects you to stay close to the code, and that will support you to grow as a people leader at your own pace. And work that has real reach: the systems you help build serve millions of learners, teachers and researchers worldwide. Rewards and benefits 28 days annual leave plus bank holidays Private medical and Permanent Health Insurance Discretionary annual bonus Group personal pension scheme Life assurance up to 4 x annual salary Green travel schemes We are a Disability Confident (DC) employer that is committed to equality and inclusion, ensuring our recruitment process is accessible to all. The DC scheme's Offer of an Interview commitment applies to applicants who opt in, disclose a disability or long term health condition, and who best meet the minimum criteria for the role. In instances where interviewing all qualifying candidates is not practicable and/or appropriate, we prioritise those who best meet the minimum criteria, as we would for applicants who do not have a disability or long term health condition. Cambridge University Press & Assessment is an approved UK employer for the sponsorship of eligible roles and applicants under the Skilled Worker visa route. Please refer to the gov.uk website for guidance to understand your own eligibility based on the role you are applying for. If you require any reasonable adjustments during the recruitment process due to a disability or a long term health condition, there will be an opportunity for you to inform us via the online application form. We will do our best to accommodate your needs. Successful applicants will be subject to satisfactory background checks, including DBS. We welcome applications from all candidates, regardless of demographic characteristics (age, disability, educational attainment, ethnicity, gender, marital status, neurodiversity, religion, sex, gender identity and sexual identity), cultural, or social class/background.
02/06/2026
Full time
Cambridge University Press & Assessment
Principal Developer & Team Lead
Cambridge University Press & Assessment Cambridge, UK
Job Title: Principal Developer & Team Lead Salary: £51,400 - £68,800 Location: Cambridge/Hybrid with 40-60% of time in the office Contract: Permanent Hours: Full time 35 hours per week As a team lead with strong technical instincts you're ready to take the next step with more scope, more shaping of what gets built, more influence over how things are done but you're not ready to put down your tools. We are Cambridge University Press & Assessment, a world-leading academic publisher and assessment organisation and a proud part of the University of Cambridge. You'll lead transitions taking place in our organisation: migrating legacy enterprise applications to cloud-native AWS architectures, while helping us establish two practices from close to scratch including Site Reliability Engineering and AI development. About the role Alongside that broader technical leadership, you'll be the principal developer on a focused project within the programme: deep ownership of a specific delivery, hands-in-the-code, designing and building it end-to-end. You'll spend most of your time close to the code: setting direction, setting the standard in design and review, and writing code alongside the team. You'll also be the line manager for a small team of developers. We're not looking for a fully formed people leader; we're looking for someone who cares about the people around them and wants to grow into that part of the role. You'll have support and development to grow into it over time. The bulk of your focus is on the technical work. You'll be the principal developer on a focused project within the migration programme: your own delivery to design, build and own. Around that, you'll lead the wider migration to AWS, build the DevOps automation and observability that lets SRE practices take hold, and establish the standards for how we use AI responsibly in education products. 
You'll set the technical bar through code reviews, design conversations and your own contributions, not from a distance. On the team side, you'll be the line manager for a small team of developers: one-to-ones, development conversations, helping people find their next step. As you settle in, you'll take on more of the wider people work: recruitment, performance, identifying where the team needs to grow in AI/ML and SRE. We'll support you to learn this side of the role; we don't expect you to arrive with it fully formed. You'll deliver in agile squads alongside architects, product owners, technical leads, SREs and infrastructure teams, and you'll be the technical voice in stakeholder conversations about what's possible. This position has been classified as a hybrid role, requiring the selected candidate to typically spend 40-60% of their time collaborating and connecting face-to-face at their dedicated location. Aside from our hybrid principles, other flexible working requests will be considered from the first day of employment, including other work arrangements should you require adjustments due to a disability or long-term health condition. About You A current or recent team lead ready to step up technically. The bar that matters most: You've led developers before, formally or informally, and people have grown around you You're fluent in two or more modern languages and you still write code regularly You've worked with AWS (or an equivalent cloud) in anger, not just touched it You understand CI/CD, infrastructure as code, and what observability actually means in production You can hold a conversation about event-driven architecture, microservices and security in cloud environments at a level beyond the textbook You communicate clearly with engineers and non-engineers, and you're open to growing the people-leadership side of the role with support If you meet the above minimum requirements, we encourage you to apply. 
Your application will be even stronger if you can also demonstrate the following desirable criteria: Hands-on exposure to AI/ML in production systems Experience helping establish SRE or observability practice early on A track record of modernising legacy systems without breaking them What this role offers someone taking the next step Genuine scope from the start: you'll be shaping the SRE function and AI practice, not inheriting someone else's blueprint, while owning a focused project as principal developer that keeps you firmly in the code. A small team to lead and learn from. A leadership team that expects you to stay close to the code, and that will support you to grow as a people leader at your own pace. And work that has real reach: the systems you help build serve millions of learners, teachers and researchers worldwide. For a detailed job description, please refer to the link at the bottom of the advert on our careers site. We are a Disability Confident (DC) employer that is committed to equality and inclusion, ensuring our recruitment process is accessible to all. The DC scheme's Offer of an Interview commitment applies to applicants who opt in, and disclose a disability or a long-term health condition, and who best meet the minimum criteria for the role. In instances where interviewing all qualifying candidates is not practicable and/or appropriate, we prioritise those who best meet the minimum criteria, as we would for applicants who do not have a disability or long-term health condition. Cambridge University Press & Assessment is an approved UK employer for the sponsorship of eligible roles and applicants under the Skilled Worker visa route. Please refer to the gov.uk website for guidance to understand your own eligibility based on the role you are applying for. Rewards and benefits We will support you to be at your best in work and to live well outside of it. 
In addition to competitive salaries, we offer a world-class, flexible rewards package, featuring family-friendly and planet-friendly benefits including: 28 days annual leave plus bank holidays Private medical and Permanent Health Insurance Discretionary annual bonus Group personal pension scheme Life assurance up to 4 x annual salary Green travel schemes Ready to pursue your potential? Apply now. We aim to support candidates by making our interview process clear and transparent. The closing date for all applications will be 14th June. We will review applications on an ongoing basis, and shortlisted candidates can expect interviews to take place shortly after it closes. If you are shortlisted and progressed through the stages, you can expect: Two questions to select one answer from multiple options. A 15-minute screening call with the Hiring Manager. First stage interview via MS Teams or in person. You will be provided with a brief to complete a role related task which will need to be returned by email in advance of your interview. If you require any reasonable adjustments during the recruitment process due to a disability or a long-term health condition, there will be an opportunity for you to inform us via the online application form. We will do our best to accommodate your needs. Please note that successful applicants will be subject to satisfactory background checks including DBS due to working in a regulated industry. We are committed to an equitable recruitment process. As such, applications must be submitted via our official online application procedure. Please refrain from sending your CV directly to our recruiters. If you experience technical difficulties or require additional support with submitting your online application, contact the Recruiter. Why join us Joining us is your opportunity to pursue potential. 
You will belong to a collaborative team that is exploring new and better ways to serve students, teachers and researchers across the globe – for the benefit of individuals, society and the world. Sharing our mission will inspire your own growth, development and progress, in an environment which embraces difference, change and aspiration. Cambridge University Press & Assessment is committed to being a place where anyone can enjoy a successful career, where it is safe to speak up, and where we learn continuously to improve together. We welcome applications from all candidates, regardless of demographic characteristics (age, disability, educational attainment, ethnicity, gender, marital status, neurodiversity, religion, sex, gender identity and sexual identity), cultural, or social class/background.  We believe better outcomes come through diversity of thought, background and approach. We welcome applications from people from all backgrounds and communities, actively seeking to employ people from a wide range of different communities.
01/06/2026
Full time
Principal Cloud SRE: Scale & Reliability Platform Lead
London Stock Exchange Group
The London Stock Exchange Group is seeking a Principal Cloud Site Reliability Engineer to enhance the reliability of the LSEG Workspace platform. This role involves designing resilient services across AWS and Azure while applying Site Reliability Engineering principles for optimal outcomes. Ideal candidates should possess extensive experience in cloud engineering and DevOps, with a focus on automation, collaboration, and continuous learning.
30/05/2026
Full time
Principal Cloud SRE / Cloud SME - LSEG Workspace
London Stock Exchange Group
Role Profile: At LSEG, we build technology that helps people make informed decisions in global financial markets. We are looking for a Principal Cloud Site Reliability Engineer (SRE) with strong cloud platform experience to evolve the reliability, scalability, and operational health of our LSEG Workspace platform, a core product used by financial professionals worldwide for real-time data, analytics, research, and productivity. In this role, you will strengthen how our cloud-native services are designed, operated, and supported across AWS and Azure, contribute hands-on engineering skills, share technical guidance, and help shape long-term platform approaches in a collaborative, learning-oriented environment. Responsibilities: Design and operate Workspace platform services that are resilient, scalable, and well understood, contributing to cloud infrastructure and application reliability across AWS and Azure. Apply Site Reliability Engineering principles to support consistent, customer-focused outcomes: develop and refine SLIs, SLOs, and error budgets, and use incidents as opportunities for shared learning and improvement. Shape the technical direction of Workspace cloud services, providing experience-based input into distributed systems design, container platforms such as EKS, cloud networking, storage, and security, and exploring new technologies that align with platform needs and long-term maintainability. Own production systems through shared responsibility, investigating and resolving complex service issues while improving observability practices for early detection and effective root cause analysis. Automate infrastructure, delivery pipelines, and operational processes to improve reliability and developer experience, supporting CI/CD workflows using GitLab and infrastructure-as-code tools such as Terraform. 
Embed security and compliance into everyday engineering practices, ensuring cloud services align with governance expectations, vulnerability management processes, and regulatory requirements. Collaborate across engineering and product teams, sharing cloud and SRE knowledge, supporting less experienced engineers, and creating clear, reusable patterns that improve platform consistency. Qualifications: Substantial experience in cloud engineering, site reliability engineering, or DevOps at a senior or principal level, with a hands-on approach to building and supporting production systems. Strong experience working with AWS and/or Azure, and competence operating Kubernetes-based platforms such as EKS, with a solid Linux foundation. Familiarity with SRE concepts and observability practices, and experience with CI/CD tooling such as GitLab and infrastructure-as-code solutions like Terraform. Understanding of cloud-native networking and high-availability design, and comfort working in Agile or Scrum-based teams. Excellent collaboration, clear communication, and a continuous learning mindset. We are an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, gender, sexual orientation, gender identity, age, veteran status, or disability.
30/05/2026
Full time
Staff Site Reliability Engineer
Replit
Replit is the agentic software creation platform that enables anyone to build applications using natural language. With millions of users worldwide, Replit is democratizing software development by removing traditional barriers to application creation. About the role Join our Site Reliability Engineering (SRE) team and help ensure the reliability, scalability, and performance of Replit's infrastructure that serves millions of developers worldwide. As a Staff Site Reliability Engineer, you will bridge the gap between development and operations, implementing automation and establishing best practices that enable our platform to scale efficiently while maintaining high availability. We are seeking Staff SREs who are passionate about building and maintaining resilient systems at scale. Your mission will be to proactively find and analyze reliability problems across our stack, then design and implement software and systems to create step-function improvements. You will design robust observability solutions, lead incident response, automate operational tasks, and continuously improve our infrastructure's reliability, all while mentoring and educating the broader engineering team to make reliability a core value at Replit. You Will Architect and Implement Observability: Design, build, and lead the implementation of comprehensive monitoring, logging, and tracing solutions. Create dashboards and metrics that provide real-time visibility into system health and performance, enabling proactive issue detection. Define and Drive Reliability Standards: Work with product and engineering teams to define, implement, and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Build systems to monitor and report on these metrics, holding teams accountable and ensuring we maintain high reliability standards while balancing innovation speed. Lead Incident Management and Response: Act as a senior leader during high-impact incidents, guiding the team to rapid resolution. 
Conduct thorough, blameless post-mortems and drive the implementation of preventative measures. Develop and refine runbooks and build automation to reduce Mean Time To Recovery (MTTR). Drive Automation and Infrastructure as Code: Architect, build, and improve automation to eliminate toil and operational work. Design and maintain CI/CD pipelines and infrastructure automation using tools like Terraform or Pulumi. Create self-healing systems that can automatically respond to common failure scenarios. Optimize Performance on Kubernetes: Collaborate with core infrastructure and product teams to performance-tune and optimize our large-scale cloud deployments, with a deep focus on Kubernetes, Docker, and GCP. Identify and resolve performance bottlenecks, implement capacity planning strategies, and reduce latency across global regions. Debug and Harden Distributed Systems: Dive deep into debugging extremely difficult technical problems across the stack. Use your findings to design and implement long-term fixes that make our systems and products more robust, operable, and easier to diagnose. Provide Staff-Level Guidance: Review feature and system designs from across the company, acting as a key owner for the reliability, scalability, security, and operational integrity of those designs. Educate and Mentor: Educate, mentor, and hold accountable the broader engineering team to improve the reliability of our systems, making reliability a core value of the Replit engineering culture. Build and Integrate: Write high-quality, well-tested code in Python or Go to meet the needs of your customers, whether it's building new internal tools or integrating with third-party vendors. Required Skills and Experience 8-10 years of experience in Site Reliability Engineering or similar roles (e.g., DevOps, Systems Engineering, Infrastructure Engineering). Strong programming skills in languages like Python or Go. You write high-quality, well-tested code. 
Deep understanding of distributed systems. You've designed, built, scaled, and maintained production services and know how to compose a service-oriented architecture. Deep experience with container orchestration platforms, specifically Kubernetes, and cloud-native technologies. Proven track record of designing, implementing, and maintaining sophisticated monitoring and observability solutions (e.g., metrics, logging, tracing). Strong incident management skills with extensive experience leading incident response for complex systems and demonstrated critical thinking under pressure. Experience with infrastructure as code (e.g., Terraform, Pulumi) and configuration management tools. Excellent written and verbal communication skills, with an ability to explain complex technical concepts clearly and simply and a bias toward open, transparent cultural practices. Strong interpersonal skills, with experience working with and mentoring engineers from junior to principal levels. A willingness to dive into understanding, debugging, and improving any layer of the stack. You're passionate about making software creation accessible and empowering the next generation of builders. Bonus Points Deep experience with Google Cloud Platform (GCP) services and tools. Expert-level knowledge of modern observability platforms (e.g., Prometheus, Grafana, Datadog, OpenTelemetry). Experience designing and building reliable systems capable of handling high throughput and low latency. Significant experience with Go and Terraform. Familiarity with working in rapid-growth, startup environments. Experience writing company-facing blog posts and training materials. 
Full-Time Employee Benefits Include: Competitive Salary & Equity; 401(k) Program with a 4% match (US Only); Health, Dental, Vision and Life Insurance; Short Term and Long Term Disability; Paid Parental, Medical, Caregiver Leave; Flexible Time Off (FTO) + Holidays; Commuter Benefits (In-Office Only); Monthly Wellness Stipend; Autonomous Work Environment; In Office Set-Up Reimbursement (In-Office Only); Quarterly Team Gatherings; In Office Amenities (In-Office Only). Want to learn more about what we are up to? Meet the Replit Agent; Replit: Make an app for that; Replit Blog; Amjad TED Talk; Interviewing + Culture at Replit; Operating Principles; Reasons not to work at Replit. To achieve our mission of making programming more accessible around the world, we need our team to be representative of the world. We welcome your unique perspective and experiences in shaping this product. We encourage people from all kinds of backgrounds to apply, including and especially candidates from underrepresented and non-traditional backgrounds.
24/05/2026
Full time
Staff Site Reliability Engineer
DailyPay City, Belfast
About Us: DailyPay is transforming the way people get paid. As a worktech company and the industry's leading on-demand pay solution, DailyPay uses an award-winning technology platform to help America's top employers build stronger relationships with their employees. This voluntary employee benefit enables workers everywhere to feel more motivated to work harder and stay longer on the job while supporting their financial well-being outside of the workplace. DailyPay is headquartered in New York City, with operations throughout the United States as well as in Belfast. For more information, visit DailyPay's Press Center. Because millions rely on us for their financial well-being, our SRE team is dedicated to ensuring our services are always up, always stable and always there when our users need them most. The Role: We are looking for a Staff Site Reliability Engineer to drive operational excellence and reliability best practices across the organisation. We are looking for someone who is passionate about SRE, operates with a high degree of autonomy and is comfortable navigating ambiguity. In this position, you will be expected to champion our SRE mission, leading both our dedicated SRE team and the wider engineering organisation. Our Tech Stack: Go, Terraform, Kubernetes, AWS, DataDog, Loki, Grafana, Tempo, Mimir, OpenTelemetry. How You Will Make an Impact: Act as a technical team leader and SRE subject matter expert across the organisation. Influence engineering culture to promote reliability and operational excellence. Steer the SRE roadmap, technical strategy and reliability standards in collaboration with leadership and our principal engineers. Shape the organisation's observability standards, platform choices, and best practices. Identify strategic opportunities to improve system reliability and champion them to completion. Level up the SRE team through mentoring, setting a high bar and leading by example. 
What You Bring to The Team: 8+ years of experience designing, building and scaling complex, highly available systems. Deep SRE expertise: We strongly align with the principles outlined in the Google SRE Book. Proven technical leadership: A strong track record of delivery, technical leadership, and cross-team collaboration. AI Readiness: A willingness to work with AI-assisted development tools as part of your daily workflow. You don't need to be an expert today, but you should be curious, open to learning, and able to critically evaluate AI-generated code before it reaches production. Stack familiarity: Experience with Terraform, Kubernetes, AWS and LGTM would be advantageous. Coding skills: You are comfortable writing production-quality code, ideally in Go or Python. What We Offer: Opportunity for equity ownership; private health insurance option; Employee Resource Groups; fun company outings and events; generous PTO allowance; 5% pension contribution. If you require reasonable accommodation for any aspect of the recruitment process, please send a request to . All requests for accommodation will be addressed as confidentially as practicable. DailyPay is an equal opportunity employer. All qualified applicants will receive consideration without regard to race, color, religion or creed, alienage or citizenship status, political affiliation, marital or partnership status, age, national origin, ancestry, physical or mental disability, medical condition, veteran status, gender, gender identity, pregnancy, childbirth (or related medical conditions), sex, sexual orientation, sexual and other reproductive health decisions, genetic disorder, genetic predisposition, carrier status, military status, familial status, or domestic violence victim status and any other basis protected under federal, state, or local laws.
22/05/2026
Full time
Lead Site Reliability Developer - CSRE Consulting
live nation
Job Summary: JOB DESCRIPTION - LEAD SITE RELIABILITY ENGINEER - CSRE CONSULTING
Location: London, United Kingdom
Division: Ticketmaster UK Limited
Line Manager: Engagement Lead, CSRE Consulting
Contract Terms: Permanent, 40 hours per week
THE TEAM
A career at Ticketmaster will challenge and engage you. We support the creators and producers of shows and live performances, while connecting more passionate fans to these events. The pace here is fast, the atmosphere is fun and a passion for live events is a common thread that ties us together. As a global and growing business, we can truly offer a world of opportunities to expand your skills and develop your career. Visit any of our offices and you'll find a diverse mix of passionate employees, helping fans around the globe connect with the artists, teams and events they love. It truly is a unique and rewarding environment.
You will be part of the Central SRE Consulting team, which partners with product and platform engineering teams throughout Ticketmaster to improve reliability, resilience, and sustainable engineering practices. We often deliver through work that combines hands-on delivery with capability building so teams can sustain improvements independently. The team's remit is to increase adoption and maturity of SRE principles across Ticketmaster and ensure our services are appropriately scaled and reliable.
We support teams across the globe, with many peers in the USA. Most of your teammates operate in UTC/UTC+1, and we are adding people in other time zones.
THE JOB
As a Lead Site Reliability Engineer in CSRE Consulting, you will lead reliability consulting work across multiple teams or a domain, aligning stakeholders on priorities and driving delivery of sustained improvements. You will translate reliability goals into sequenced workstreams, align dependencies, and ensure teams can maintain the mechanisms after you move on.
You will mentor other consultants, codify reusable patterns, and influence shared platforms so reliability improvements propagate beyond any single team or engagement.
WHAT YOU WILL BE DOING
Lead consulting work from discovery through delivery by aligning stakeholders on priorities, sequencing work, and communicating measurable outcomes. Establish working cadence and facilitate decision forums to surface risks, map dependencies, and drive clear ownership and timelines. Align product, platform, and engineering stakeholders on reliability targets and trade-offs using SLOs and error budgets. Partner regularly with Engineering Managers, product managers, Staff and Principal engineers, and platform leads to keep dependencies, decisions, and delivery aligned. Identify systemic risks across shared dependencies and coordinate remediation across multiple teams to reduce recurring incidents. Drive change adoption by embedding reliability mechanisms into partner team routines such as planning, PRRs, and on-call practices. Design and implement reusable reliability mechanisms, templates, and tooling that can be adopted across teams. Establish and evolve production readiness review practices with partner teams to improve launch quality and change safety. Drive observability strategy for partner domains by improving signal quality, alerting philosophy, and operational dashboards. Lead complex incident investigations and ensure learnings translate into durable fixes with clear owners and verification. Lead reliability-focused design and code reviews and guide teams toward simpler, safer architectures. Mentor Senior engineers and other consultants through pairing, reviews, and structured coaching to multiply impact. Partner with internal platform engineering to influence roadmaps and deliver shared capabilities that accelerate SRE adoption. Improve CSRE Consulting playbooks and operating practices based on repeated patterns observed across teams.
WHAT YOU NEED TO KNOW (or TECHNICAL SKILLS)
Deep practical understanding of SRE principles, including SLO governance and error budget policy in practice. Proven ability to lead cross-team technical work and influence without authority. Strong experience designing and troubleshooting distributed systems with cross-service failure modes. Experience shaping observability and alerting strategy and improving operational signal quality. Strong Kubernetes and AWS experience, including governance and cost trade-offs. Ability to design reliability automation and tooling that is reusable and adopted by multiple teams. Experience leading production readiness and resilience practices, including DR validation and controlled testing. Strong software engineering fundamentals with the ability to deliver and review high-quality changes in enterprise codebases. Advanced incident analysis skills focused on systemic risk reduction and organizational learning. Excellent communication skills, including exec-ready summaries and clear technical diagrams.
YOU (BEHAVIOURAL SKILLS)
Lead with service and humility, creating clarity and momentum without relying on authority. Build relationships across teams and functions, and set clear expectations for how you partner and deliver. Facilitate alignment by framing problems, surfacing trade-offs, and running working sessions that end in decisions. Persuade with evidence and empathy, adapting your narrative for engineers, product, and senior stakeholders. Coach and mentor deliberately, helping others grow in reliability thinking and consulting craft. Maintain psychological safety while raising standards, giving direct feedback with respect. Stay persistent and patient in complex organizations, keeping work moving despite slow dependencies. Hold ambiguity comfortably and turn messy inputs into clear plans, options, and next steps. Favor simple mechanisms that scale adoption, not bespoke one-offs that require you to maintain them. Operate at a sustainable pace and discourage hero culture by designing systems that do not need it. Take pride in quality, including documentation and decision records that help teams sustain the work. Remain adaptable, switching between hands-on debugging, stakeholder management, and planning as needed.
LIFE AT TICKETMASTER
We are proud to be a part of Live Nation Entertainment, the world's largest live entertainment company.
Our vision at Ticketmaster is to connect people around the world to the live events they love. As the world's largest ticket marketplace and the leading global provider of enterprise tools and services for the live entertainment business, we are uniquely positioned to successfully deliver on that vision.
We do it all with an intense passion for Live and an inspiring and diverse culture driven by accessible leaders, attentive managers, and enthusiastic teams. If you're passionate about live entertainment like we are, and you want to work at a company dedicated to helping millions of fans experience it, we want to hear from you.
Our work is guided by our values:
Reliability - We understand that fans and clients rely on us to power their live event experiences, and we rely on each other to make it happen.
Teamwork - We believe individual achievement pales in comparison to the level of success that can be achieved by a team.
Integrity - We are committed to the highest moral and ethical standards on behalf of the countless partners and stakeholders we represent.
Belonging - We are committed to building a culture in which all people can be their authentic selves, have an equal voice and opportunities to thrive.
EQUAL OPPORTUNITIES
We are passionate and committed to our people and go beyond the rhetoric of diversity and inclusion. You will be working in an inclusive environment and be encouraged to bring your whole self to work. 
We will do all that we can to help you successfully balance your work and homelife. As a growing business we will encourage you to develop your professional and personal aspirations, enjoy new experiences, and learn from the talented people you will be working with. It's talent that matters to us and we encourage applications from people irrespective of their gender, race, sexual orientation, religion, age, disability status or caring responsibilities. Nation Entertainment will never request payment or equipment purchases as part of the hiring process. Recruiters will only contact candidates from official Live Nation or affiliated brand email domains.
13/05/2026
Full time
Principal Site Reliability Engineer
LA International Computer Consultants Ltd Wokingham, Berkshire
Our client is looking for a Principal Site Reliability Engineer to join their team on an initial three-month contract with good scope for extension. They require candidates to be able to go to site in Wokingham twice a week, with the rest of the week remote. This role is Inside IR35 and requires an active SC clearance.

Role Description:
  • Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management.
  • Ensure platform and business application reliability and performance against strict SLAs and KPIs.
  • Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo).
  • Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm.
  • Manage CI/CD pipelines using Tekton and ArgoCD.
  • Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ).
  • Conduct security reviews and implement controls aligned with national infrastructure standards.
  • Mentor junior engineers and promote SRE best practices.
  • Collaborate with vendors and IT teams for incident resolution and platform improvements.

Required Skills:
  • Strong communication skills (written and verbal).
  • Experience in remote team collaboration.
  • Deep expertise in OpenShift/Kubernetes and RedHat Linux.
  • Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize).
  • Experience with CI/CD automation and IaC strategies.
  • Security-first mindset with experience in regulated environments.
  • Experience with VMware vSphere virtualization.

Due to the nature and urgency of this post, candidates holding or who have held high level security clearance in the past are most welcome to apply. Please note successful applicants will be required to be security cleared prior to appointment, which can take a minimum of 10 weeks.

LA International is a HMG approved ICT Recruitment and Project Solutions Consultancy, operating globally from the largest single site in the UK as an IT Consultancy or as an Employment Business & Agency depending upon the precise nature of the work. For security cleared jobs or non-clearance vacancies, LA International welcomes applications from all sections of the community and from people with diverse experience and backgrounds. Award Winning LA International, winner of the Recruiter Awards for Excellence, Best IT Recruitment Company, Best Public Sector Recruitment Company and overall Gold Award winner, has now secured the most prestigious business award that any business can receive, The Queen's Award for Enterprise: International Trade, for the second consecutive period.
01/10/2025
Contractor

© 2008-2026 IT Job Board