Network Engineer - 5 days onsite in Gloucestershire At DXC Technology, delivering excellence for our customers and colleagues is more than just a motto, it's something we strive towards constantly through our work. Every day we deliver mission critical services in a secure environment whilst promoting our people first agenda, a real sense of community and a healthy work-life balance. Our consistently positive customer feedback and continuous growth helps us cement our place as one of the world's leading IT solutions enterprises, helping us deliver services and solutions in both challenging and exciting situations. We believe that hiring a diverse team is crucial to our success and our recruiting decisions are based on your skills and experience as an individual. We actively encourage consistent growth on our journey towards a culture of inclusion and recognise that the people we employ are vital to providing a great customer experience. As such, we have a variety of training, support, and tools available to aid in your continual personal and professional development. Our ongoing goal is to drive innovation and modernise operations across the board, which includes furthering the skills of our colleagues. At DXC, building a better you, builds a better us. At DXC, one of our platinum accounts has openings for on site Network System Administrators for varying skill levels. The successful candidate will work within multiple teams and will be innovative and analytical with a good eye for detail. Your role will include implementing standards, policies, and procedures for continual service improvement. 
Role Responsibilities Provide first and second level technical support on incidents and problems Monitor overall system performance and ensure smooth system functionality Create, maintain, and utilise documentation Assist in building compliance with our processes and policies What You Will Bring To The Team Excellent organisation and time management skills Working to ITIL best practices Desire to improve processes, looking for the root cause of a problem Willingness to both share your knowledge and learn from others A proactive approach towards looking for risks and problems Excellent written and verbal communication skills An ability to adapt quickly and work in an agile fashion Desirable Skills And Technologies Experience in Cisco, including Nexus family, ASA family, ACI and Hyperflex Knowledge of F5 software, including LTM, ASM and GTM Experience in Ansible Experience working on Datacenters Knowledge in VMware, including vSphere, NSX-V and NSX-T Exposure to cloud services, such as AWS and Microsoft Azure Experience in Dell VxRAIL Knowledge of SolarWinds NCM - NPM would also be useful Experience in Software Defined Networking What We Will Do For You Competitive compensation Pension scheme DXC Select - Our comprehensive benefits package (includes private health/medical insurance, childcare vouchers, gym membership and more) Perks at Work (discounts on technology, groceries, travel and more) DXC incentives (recognition tools, employee lunches, regular social events etc) At DXC Technology, we believe strong connections and community are key to our success. Our work model prioritizes in-person collaboration while offering flexibility to support wellbeing, productivity, individual work styles, and life circumstances. We're committed to fostering an inclusive environment where everyone can thrive.
27/05/2026
Full time
About Graphcore At Graphcore, we're building the future of AI compute. We're a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence. Job Summary Applicants for this role should have strong experience working with machine learning systems and frameworks, along with a solid understanding of core AI concepts and model behaviour. The role centres on testing, validating, and benchmarking a complex ML software stack, with a particular focus on performance, reliability, and correctness across modern AI workloads. The ideal candidate is an experienced ML engineer who understands how contemporary models are trained and executed, and who has hands-on experience debugging functional and performance issues in ML systems. This person will be comfortable working with industry-standard frameworks and state-of-the-art models, bringing them up on internal infrastructure, and collaborating closely with software and hardware teams in a technically demanding environment spanning ML frameworks, infrastructure, and AI accelerator hardware. The Team The ML QA team is composed of highly skilled software engineers with a strong focus on automation, software quality, and data-driven validation. The team works closely with industry-standard machine learning frameworks and models, contributing to upstream open source projects and collaborating across the wider software organization. 
Operating in a fast paced environment, the team plays a critical role in ensuring reliability, performance, and maintainability across the ML software stack, helping to deliver robust and high quality products to customers. Responsibilities and Duties Benchmark ML models and frameworks, analysing results to identify regressions, performance bottlenecks, and correctness issues. Work hands on with industry standard ML frameworks to validate functionality and performance across different execution environments. Build and maintain automated testing and benchmarking pipelines targeting simulators, emulators, and physical hardware. Collaborate closely with software teams to ensure adequate test coverage for new and existing features. Develop tooling and scripts (primarily in Python) to support testing, benchmarking, and functional reporting. Take ownership over aspects of our testing and infrastructure, owning the roadmap and driving innovation independently. Candidate Profile Essential: Experience working in Machine Learning or ML adjacent engineering roles. Strong foundation in core AI and ML concepts (e.g., neural networks, training vs inference, numerical precision, performance trade offs). Hands on experience with one or more major ML frameworks such as PyTorch, TensorFlow, JAX, or similar. Strong proficiency in Python for ML workflows, experimentation, and automation. Experience designing, running, and analysing ML benchmarks or experiments. Experience working in Linux environments. Strong analytical and debugging skills, with the ability to reason about model behaviour and system performance. Bachelor/Master's/PhD or equivalent experience in Computer Science, Maths, Machine Learning, Data Science, or related field. Desirable Experience with MLOps pipelines, model deployment, or production ML systems. Familiarity with performance analysis, profiling tools, or numerical accuracy validation. Exposure to distributed training or inference systems. 
Experience with hardware accelerated ML, compilers, or system level performance considerations. Familiarity with CI/CD systems used for ML workflows. Experience contributing to open source ML frameworks or tooling. Benefits In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar! We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments. Applicants for this position must hold the right to work in the UK. Unfortunately at this time, we are unable to provide visa sponsorship or support for visa applications.
27/05/2026
Full time
Network Site Reliability Engineer Location: London, United Kingdom Posted about 1 year ago Tech Stack Hardware Python Go Amazon AWS Operating systems Reliability Tools and Techniques The role involves collaborative work across various teams, exploring domains like hardware, operating systems, Python/ Go development, AWS, and storage. Responsibilities Develop network and datacenter infrastructure with consistent and straightforward Compensation Competitive Role type Full time Visa sponsorship Not provided
27/05/2026
Full time
About Graphcore At Graphcore, we're building the future of AI compute. We're a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence. Job Summary We are looking for a Senior Staff Engineer to join our Cloud Platform Team and help develop and deploy cloud services. Working closely with our colleagues in Software Platform, Datacentre Operations and Product Development teams, you will deploy services on our fleet of cutting edge AI systems. As part of our Software Platform organisation, you will be involved in the cloud integration, validation, performance benchmarking, optimisation, and development of our high performance AI solutions, including in house AI systems and off the shelf high performance servers, switches and storage solutions. This is a hands-on technical role requiring a solid background in the use of cloud infrastructure, deployment using Infrastructure as Code, observability, high performance networking and storage systems. You may have been working in an IT organisation, a datacentre, a cloud provider or as a developer of orchestration or cloud services. The Software Platform team We build Graphcore products into large scale AI solutions for our customers. The Cloud Platform Team is responsible for providing such systems to both internal users via private clouds and customers via our own public clouds. 
Often the internal systems will be using and developing pre release hardware and software, so it's vital you are comfortable with unproven components. Responsibilities and Duties Operate and extend existing OpenStack based cloud services and contribute to the deployment and development of new ones. Develop and operate end user services on our clouds and support internal users in their use. Turn end user and product requirements into deployed services. Help build automation to collect and analyse metrics and other observability data from the cloud services to support clear identification and reporting of any issues. Work with users to provide information on any product related issues to Engineering and QA departments. Work with our Datacentre Operations Engineers to maintain and operate the fleet of AI systems at peak performance in our private clouds. Configure and test new Graphcore AI hardware and systems using Continuous Deployment and Infrastructure as Code in internal and external datacentres. Drive corrective actions for systems that are not operating correctly, working with DC operations and Graphcore Engineering as required. Work with external vendors of off the shelf switches, servers and storage solutions to specify, benchmark and integrate 3rd party products into our Cloud Reference Design. Skills and Experience ALL REQUIRED Bachelor's degree or equivalent practical experience in a relevant subject. Solid infrastructure or IT experience with a proven track record of delivering technical output as an individual contributor. Experience managing or operating on premises or private cloud environments. Experience specifying, scoping, estimating and detailing work plans in an AGILE and SCRUM framework, including priorities, risks, issues, impacts and constraints. Strong proven Linux scripting ability (bash and python required). Strong proven Linux system administration (Ubuntu, RHEL and variants). 
Experience with a version control system (preferably Git) and using it to manage system configuration or automation. Experience with Continuous Integration or testing pipelines using GitLab, GitHub or similar. Hands on experience deploying services into public or private clouds using Infrastructure as Code. A solid understanding of the technologies underpinning cloud services (APIs, virtualisation of CPUs, IO, systems), virtual networks, block storage, resource management and monitoring. Experience with IAC automation tools (e.g. Terraform/OpenTofu, Ansible, Packer). Experience with container deployment and management tools (e.g. Docker, Podman, Apptainer). Experience with solutions for monitoring and observability (Grafana, Prometheus, OpenSearch/ElasticSearch, Loki, Mimir, OpenTelemetry, Fluentd, Kafka). Good communication and presentation skills, and experience dealing with end users of IT or cloud services. An ability to work independently on critical infrastructure without oversight, and with a focus on end user availability. Desirable but not required Experience with OpenStack deployments or the technologies they rely on (e.g. Ceph, Open vSwitch, KVM, QEMU). Experience with High Performance Computing (HPC) environments using SLURM or similar batch workload solutions. Strong skillset and experience in end to end deployment automation and CI of containerised services. Complete automation of pipelines for build, test, deploy, manage, alert, destroy, rebuild. Experience with managing production Kubernetes clusters and workloads. Experience with workload queue management systems (SLURM, LSF, Kueue). Experience with managed switch configuration (e.g. EOS, SONiC, DNOS). Programming experience with Python3 utilising classes and inheritance. Programming experience with Go. 
Benefits In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar! We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments. Sponsorship Applicants for this position must hold the right to work in the UK. Unfortunately at this time, we are unable to provide visa sponsorship or support for visa applications.
27/05/2026
Full time
About Graphcore
At Graphcore, we're building the future of AI compute. We're a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.

Job Summary
Applicants for this role should have strong experience working with machine learning systems and frameworks, along with a solid understanding of core AI concepts and model behaviour. The role centres on testing, validating, and benchmarking a complex ML software stack, with a particular focus on performance, reliability, and correctness across modern AI workloads. The ideal candidate is an experienced ML engineer who understands how contemporary models are trained and executed, and who has hands-on experience debugging functional and performance issues in ML systems. This person will be comfortable working with industry-standard frameworks and state-of-the-art models, bringing them up on internal infrastructure, and collaborating closely with software and hardware teams in a technically demanding environment spanning ML frameworks, infrastructure, and AI accelerator hardware.

The Team
The ML QA team is composed of highly skilled software engineers with a strong focus on automation, software quality, and data-driven validation. The team works closely with industry-standard machine learning frameworks and models, contributing to upstream open-source projects and collaborating across the wider software organization. Operating in a fast-paced environment, the team plays a critical role in ensuring reliability, performance, and maintainability across the ML software stack, helping to deliver robust and high-quality products to customers.

Responsibilities and Duties
Benchmark ML models and frameworks, analysing results to identify regressions, performance bottlenecks, and correctness issues.
Work hands-on with industry-standard ML frameworks to validate functionality and performance across different execution environments.
Build and maintain automated testing and benchmarking pipelines targeting simulators, emulators, and physical hardware.
Collaborate closely with software teams to ensure adequate test coverage for new and existing features.
Develop tooling and scripts (primarily in Python) to support testing, benchmarking, and functional reporting.
Take ownership of aspects of our testing and infrastructure, owning the roadmap and driving innovation independently.

Candidate Profile
Essential:
Experience working in Machine Learning or ML-adjacent engineering roles.
Strong foundation in core AI and ML concepts (e.g., neural networks, training vs inference, numerical precision, performance trade-offs).
Hands-on experience with one or more major ML frameworks such as PyTorch, TensorFlow, JAX, or similar.
Strong proficiency in Python for ML workflows, experimentation, and automation.
Experience designing, running, and analysing ML benchmarks or experiments.
Experience working in Linux environments.
Strong analytical and debugging skills, with the ability to reason about model behaviour and system performance.
Bachelor's/Master's/PhD or equivalent experience in Computer Science, Maths, Machine Learning, Data Science, or related field.
Desirable:
Experience with MLOps pipelines, model deployment, or production ML systems.
Familiarity with performance analysis, profiling tools, or numerical accuracy validation.
Exposure to distributed training or inference systems.
Experience with hardware-accelerated ML, compilers, or system-level performance considerations.
Familiarity with CI/CD systems used for ML workflows.
Experience contributing to open-source ML frameworks or tooling.

Benefits
In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar! We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.

Applicants for this position must hold the right to work in the UK. Unfortunately, at this time we are unable to provide visa sponsorship or support for visa applications.
27/05/2026
Full time
About Graphcore
At Graphcore, we're building the future of AI compute. We're a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.

Job Summary
We are looking for a Senior Engineer to join our Cloud Platform Team and help develop and deploy clouds and services. Working closely with our colleagues in Software Platform, Datacentre Operations and Product Development teams, you will deploy services on our fleet of cutting-edge AI systems. As part of our Software Platform organisation, you will be involved in the cloud integration, validation, performance benchmarking, optimisation, and development of our high-performance AI solutions. These include in-house AI systems alongside off-the-shelf high-performance servers, switches and storage solutions. This is a hands-on technical role requiring a solid background in the use of cloud infrastructure, deployment using Infrastructure as Code, observability, high-performance networking and storage systems. You may have been working in an IT organisation, a datacentre, a cloud provider or as a developer of orchestration or cloud services.

The Software Platform team at Graphcore
We build Graphcore products into large-scale AI solutions for our customers, and the Cloud Platform Team is responsible for providing such systems to both internal users via private clouds and customers via our own public clouds. Often the internal systems will be using and developing pre-release hardware and software, so it's vital you are comfortable with unproven components.

Responsibilities and Duties
Develop and operate Kubernetes-managed end-user services on our private clouds and support internal users in their use. You will turn end-user and product requirements into deployed services.
Work with our Datacentre Operations Engineers to maintain and operate the fleet of AI systems at peak performance in our private clouds.
Configure and test new Graphcore AI hardware and systems using Continuous Deployment and Infrastructure as Code in internal and external datacentres.

Skills and Experience (all required)
Bachelor's degree or equivalent practical experience in a relevant subject.
Experience with managing production Kubernetes clusters and workloads with a continuous delivery tool such as ArgoCD.
Solid software engineering or IT experience with a proven track record of delivering technical output as an individual contributor.
Experience working in an Agile and Scrum framework, including understanding of priorities, risks, issues, impacts and constraints.
Strong proven Linux scripting ability (Bash, Python, awk, sed).
Strong proven Linux system administration (Ubuntu, RHEL and variants).
Experience with a version control system (preferably Git) and using it to manage system configuration or automation.
Experience with Continuous Integration or testing pipelines using GitLab, GitHub or similar.
A solid hands-on understanding of the technologies underpinning cloud services (APIs, virtualization of CPUs, IO, systems), virtual networks, block storage, resource management and monitoring.
Experience with IaC automation tools (Terraform/OpenTofu, Ansible, Packer).
Good communication and presentation skills, and experience dealing with end users of IT services.
An ability to work independently on critical infrastructure with minimal oversight, and with a focus on end-user availability.

Desirable but not required
Experience with OpenStack cloud platforms.
Experience with solutions for monitoring and observability (e.g. Grafana, Prometheus, OpenSearch/Elasticsearch, Loki).
Experience with High Performance Computing (HPC) environments using SLURM or similar batch workload solutions.
Programming experience with Python 3 utilising classes and inheritance.

Benefits
In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar! We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.

Sponsorship
Applicants for this position must hold the right to work in the UK. Unfortunately, at this time we are unable to provide visa sponsorship or support for visa applications.
27/05/2026
Full time
Network Site Reliability Engineer Location: London, United Kingdom Posted about 1 year ago Tech Stack Hardware Python Go Amazon AWS Operating systems Reliability Tools and Techniques The role involves collaborative work across various teams, exploring domains like hardware, operating systems, Python/Go development, AWS, and storage. Responsibilities Develop network and datacenter infrastructure with consistent and straightforward Compensation Competitive Role type Full time Visa sponsorship Not provided
27/05/2026
Full time
About Graphcore
At Graphcore, we're building the future of AI compute. We're a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.

Job Summary
We are looking for a Senior Staff Engineer to join our Cloud Platform Team and help develop and deploy cloud services. Working closely with our colleagues in Software Platform, Datacentre Operations and Product Development teams, you will deploy services on our fleet of cutting-edge AI systems. As part of our Software Platform organisation, you will be involved in the cloud integration, validation, performance benchmarking, optimisation, and development of our high-performance AI solutions, including in-house AI systems and off-the-shelf high-performance servers, switches and storage solutions. This is a hands-on technical role requiring a solid background in the use of cloud infrastructure, deployment using Infrastructure as Code, observability, high-performance networking and storage systems. You may have been working in an IT organisation, a datacentre, a cloud provider or as a developer of orchestration or cloud services.

The Software Platform team
We build Graphcore products into large-scale AI solutions for our customers. The Cloud Platform Team is responsible for providing such systems to both internal users via private clouds and customers via our own public clouds. Often the internal systems will be using and developing pre-release hardware and software, so it's vital you are comfortable with unproven components.

Responsibilities and Duties
Operate and extend existing OpenStack-based cloud services and contribute to the deployment and development of new ones.
Develop and operate end-user services on our clouds and support internal users in their use.
Turn end-user and product requirements into deployed services.
Help build automation to collect and analyse metrics and other observability data from the cloud services to support clear identification and reporting of any issues.
Work with users to provide information on any product-related issues to Engineering and QA departments.
Work with our Datacentre Operations Engineers to maintain and operate the fleet of AI systems at peak performance in our private clouds.
Configure and test new Graphcore AI hardware and systems using Continuous Deployment and Infrastructure as Code in internal and external datacentres.
Drive corrective actions for systems that are not operating correctly, working with DC operations and Graphcore Engineering as required.
Work with external vendors of off-the-shelf switches, servers and storage solutions to specify, benchmark and integrate third-party products into our Cloud Reference Design.

Skills and Experience (all required)
Bachelor's degree or equivalent practical experience in a relevant subject.
Solid infrastructure or IT experience with a proven track record of delivering technical output as an individual contributor.
Experience managing or operating on-premises or private cloud environments.
Experience specifying, scoping, estimating and detailing work plans in an Agile and Scrum framework, including priorities, risks, issues, impacts and constraints.
Strong proven Linux scripting ability (Bash and Python required).
Strong proven Linux system administration (Ubuntu, RHEL and variants).
Experience with a version control system (preferably Git) and using it to manage system configuration or automation.
Experience with Continuous Integration or testing pipelines using GitLab, GitHub or similar.
Hands-on experience deploying services into public or private clouds using Infrastructure as Code.
A solid understanding of the technologies underpinning cloud services (APIs, virtualisation of CPUs, IO, systems), virtual networks, block storage, resource management and monitoring.
Experience with IaC automation tools (e.g. Terraform/OpenTofu, Ansible, Packer).
Experience with container deployment and management tools (e.g. Docker, Podman, Apptainer).
Experience with solutions for monitoring and observability (Grafana, Prometheus, OpenSearch/Elasticsearch, Loki, Mimir, OpenTelemetry, Fluentd, Kafka).
Good communication and presentation skills, and experience dealing with end users of IT or cloud services.
An ability to work independently on critical infrastructure without oversight, and with a focus on end-user availability.

Desirable but not required
Experience with OpenStack deployments or the technologies they rely on (e.g. Ceph, Open vSwitch, KVM, QEMU).
Experience with High Performance Computing (HPC) environments using SLURM or similar batch workload solutions.
Strong skillset and experience in end-to-end deployment automation and CI of containerised services.
Complete automation of pipelines for build, test, deploy, manage, alert, destroy, rebuild.
Experience with managing production Kubernetes clusters and workloads.
Experience with workload queue management systems (SLURM, LSF, Kueue).
Experience with managed switch configuration (e.g. EOS, SONiC, DNOS).
Programming experience with Python 3 utilising classes and inheritance.
Programming experience with Go.
Benefits In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar! We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments. Sponsorship Applicants for this position must hold the right to work in the UK. Unfortunately at this time, we are unable to provide visa sponsorship or support for visa applications.
27/05/2026
Full time
About Graphcore At Graphcore, we're building the future of AI compute. We're a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence. Job Summary As a Senior Machine Learning Engineer in the Applied AI team at Graphcore, you will contribute to advancing AI technology by developing and optimising AI models tailored to our specialised hardware. You will work on large-scale systems where performance is critical to the success of our projects. Working closely with the Software development and Research teams, you will play a critical role in identifying opportunities to innovate and differentiate Graphcore's technology. We seek engineers with strong technical skills and an understanding of AI model implementation at scale, eager to make a tangible impact in this rapidly evolving field. The Team The Applied AI team's role is to be proxies for our customers: we need to understand the latest AI models, applications, and software to ensure that Graphcore's technology works seamlessly with the AI ecosystem and at scale. We build reference applications, contribute to key software libraries (e.g. optimising kernels for efficiency on our hardware), and collaborate with the Research team to develop and publish novel ideas in domains such as efficient compute, model scaling and distributed training and inference of AI models for multiple modalities and applications. 
If you're excited about advancing the next generation of AI models on cutting-edge hardware, we'd love to hear from you! Responsibilities and Duties Implement the latest machine learning models and optimise them for performance and accuracy, scaling to 1000s of accelerators. Test and evaluate new internal software releases, provide feedback to software engineering teams, make necessary code fixes, and conduct code reviews. Benchmark models and key ML techniques to identify performance bottlenecks and improve model efficiency. Design and conduct experiments on novel AI methods, implement them and evaluate results. Collaborate with Research, Software, and Product teams to define, build, and test Graphcore's next generation of AI hardware. Engage with the AI community and keep in touch with the latest developments in AI. Candidate Profile Essential Bachelor/Master's/PhD or equivalent experience in Machine Learning, Computer Science, Maths, Data Science, or related field. Proficiency in deep learning frameworks like PyTorch/JAX. Strong Python or C++ software development skills. Expertise in deep learning from model training to optimisation and evaluation. Experience in distributed training or inference of ML models across 64+ accelerators. Capable of designing, executing and reporting from ML experiments. Developed deep understanding of performance bottlenecks and how to overcome them. Ability to move quickly in a dynamic environment. Enjoy cross-functional work collaborating with other teams. Strong communicator - able to explain complex technical concepts to different audiences. Desirable Experience in one or more of: MLOps for Kubernetes-based clusters; building production systems with large language models; efficient computing based on low-precision arithmetic. Experience writing C++/Triton/CUDA kernels for performance optimisation of ML models. Familiarity with HPC systems and networking including Infiniband, NVLink, RoCE technologies. 
Have contributed to open source projects or published research papers in relevant fields. Knowledge of cloud computing platforms. Keen to present, publish and deliver talks in the AI community. Benefits In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar! We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments. Applicants for this position must hold the right to work in the UK. Unfortunately at this time, we are unable to provide visa sponsorship or support for visa applications.
26/05/2026
Full time
Network Engineer - 5 days onsite in Gloucestershire At DXC Technology, delivering excellence for our customers and colleagues is more than just a motto, it's something we strive towards constantly through our work. Every day we deliver mission critical services in a secure environment whilst promoting our people first agenda, a real sense of community and a healthy work-life balance. Our consistently positive customer feedback and continuous growth helps us cement our place as one of the world's leading IT solutions enterprises, helping us deliver services and solutions in both challenging and exciting situations. We believe that hiring a diverse team is crucial to our success and our recruiting decisions are based on your skills and experience as an individual. We actively encourage consistent growth on our journey towards a culture of inclusion and recognise that the people we employ are vital to providing a great customer experience. As such, we have a variety of training, support, and tools available to aid in your continual personal and professional development. Our ongoing goal is to drive innovation and modernise operations across the board, which includes furthering the skills of our colleagues. At DXC, building a better you, builds a better us. At DXC, one of our platinum accounts has openings for on site Network System Administrators for varying skill levels. The successful candidate will work within multiple teams and will be innovative and analytical with a good eye for detail. Your role will include implementing standards, policies, and procedures for continual service improvement. 
Role Responsibilities Provide first and second level technical support on incidents and problems Monitor overall system performance and ensure smooth system functionality Create, maintain, and utilise documentation Assist in building compliance with our processes and policies What You Will Bring To The Team Excellent organisation and time management skills Working to ITIL best practices Desire to improve processes, looking for the root cause of a problem Willingness to both share your knowledge and learn from others A proactive approach towards looking for risks and problems Excellent written and verbal communication skills An ability to adapt quickly and work in an agile fashion Desirable Skills And Technologies Experience in Cisco, including Nexus family, ASA family, ACI and Hyperflex Knowledge of F5 software, including LTM, ASM and GTM Experience in Ansible Experience working on Datacenters Knowledge in VMware, including vSphere, NSX-V and NSX-T Exposure to cloud services, such as AWS and Microsoft Azure Experience in Dell VxRAIL Knowledge of SolarWinds NCM - NPM would also be useful Experience in Software Defined Networking What We Will Do For You Competitive compensation Pension scheme DXC Select - Our comprehensive benefits package (includes private health/medical insurance, childcare vouchers, gym membership and more) Perks at Work (discounts on technology, groceries, travel and more) DXC incentives (recognition tools, employee lunches, regular social events etc) At DXC Technology, we believe strong connections and community are key to our success. Our work model prioritizes in-person collaboration while offering flexibility to support wellbeing, productivity, individual work styles, and life circumstances. We're committed to fostering an inclusive environment where everyone can thrive.
26/05/2026
Full time
Nutanix builds enterprise software to help companies run their own private cloud or software-defined datacenter. At the heart of Nutanix is the AHV team who develop the Acropolis Hypervisor - proven to be reliable, performant and scalable. Alongside the AHV team we are building a new Tools Team to develop and maintain our internal tools for scalable deployments, test frameworks and observability or performance analysis. This is fundamentally a software development role, but an interest in automation, orchestration and DevOps would be an asset. Some experience of developing testing automation frameworks would be beneficial too. You will find yourself working on highly complex distributed systems; thousands of VMs running inside hypervisors nested inside a fleet of bare metal hypervisors, with networking complexity to match - if you find this sort of environment an intriguing challenge rather than a nightmare, you could be a good fit for the role. We currently have tools written in Python and Rust, but there will be opportunities to work on other components inside AHV. Experience in these languages, as well as C or Golang, would therefore be a bonus. About the Team The team is led by industry experts with 20+ years of experience, who are leading AHV development globally. We have a forward-thinking approach to our work that has retained many of the best elements of start-up mentality whilst also recognising the need for mature delivery and execution. We work with open-source technologies including Linux KVM, QEMU, Open vSwitch and Libvirt. 
Your Role Design, develop, and maintain the internal tools for AHV Enable AHV developers to push towards making AHV even more reliable, performant and secure Participate in hypervisor development on occasion, particularly when it involves features which are needed for interaction with the framework Help the team develop reliable system tests to prevent regression in existing functionality and discover defects as we develop and deploy new features Help improve the build system, other developer productivity tools, and consistently find ways that the team can automate their day-to-day tasks What You Will Bring Bachelor's, Master's degree in Computer Science (preferred) or another technical discipline/equivalent experience Knowledge of UNIX/Linux Excellent coding skills in Python or Rust, ideally from working on enterprise-quality software Familiarity with API design, REST and distributed systems would be helpful Experience with Pytest or similar test frameworks Hands on knowledge of Git and Docker Experience with automation, CI/CD and DevOps tools like Jenkins, Ansible, or GitHub Actions is desirable Some industry experience or equivalent research experience would be an advantage Familiarity with OS internals and concepts of distributed systems Work Arrangement Hybrid: This role operates in a hybrid capacity, blending the benefits of remote work with the advantages of in person collaboration. In locations where our workplace policy applies (i.e. Cambridge, San Jose, Durham, Mexico City, Bangalore, Pune, Hoofddorp, Belgrade, Barcelona, Singapore, Sydney and Tokyo), employees are expected to work onsite a minimum of 3 days per week to foster collaboration, team alignment, and access to in office resources. Workplace type may vary based on location and team requirements. Please speak with your recruiter for details. Additional team specific guidance and norms will be provided by your manager. Nutanix is an equal opportunity employer. 
Nutanix is an Equal Employment Opportunity and (in the U.S.) an Affirmative Action employer. Qualified applicants are considered for employment opportunities without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, marital status, protected veteran status, disability status or any other category protected by applicable law. We hire and promote individuals solely on the basis of qualifications for the job to be filled. We strive to foster an inclusive working environment that enables all our Nutants to be themselves and to do great work in a safe and welcoming environment, free of unlawful discrimination, intimidation or harassment. As part of this commitment, we will ensure that persons with disabilities are provided reasonable accommodations. If you need a reasonable accommodation, please let us know by contacting .
24/05/2026
Full time
About Graphcore At Graphcore, we're building the future of AI compute. We're a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence. Job Summary Applicants for this role should have strong experience working with machine learning systems and frameworks, along with a solid understanding of core AI concepts and model behaviour. The role centres on testing, validating, and benchmarking a complex ML software stack, with a particular focus on performance, reliability, and correctness across modern AI workloads. The ideal candidate is an experienced ML engineer who understands how contemporary models are trained and executed, and who has hands on experience debugging functional and performance issues in ML systems. This person will be comfortable working with industry standard frameworks and state of the art models, bringing them up on internal infrastructure, and collaborating closely with software and hardware teams in a technically demanding environment spanning ML frameworks, infrastructure, and AI accelerator hardware. The Team The ML QA team is composed of highly skilled software engineers with a strong focus on automation, software quality, and data driven validation. The team works closely with industry standard machine learning frameworks and models, contributing to upstream open source projects and collaborating across the wider software organization. 
Operating in a fast paced environment, the team plays a critical role in ensuring reliability, performance, and maintainability across the ML software stack, helping to deliver robust and high quality products to customers. Responsibilities and Duties Benchmark ML models and frameworks, analysing results to identify regressions, performance bottlenecks, and correctness issues. Work hands on with industry standard ML frameworks to validate functionality and performance across different execution environments. Build and maintain automated testing and benchmarking pipelines targeting simulators, emulators, and physical hardware. Collaborate closely with software teams to ensure adequate test coverage for new and existing features. Develop tooling and scripts (primarily in Python) to support testing, benchmarking, and functional reporting. Take ownership over aspects of our testing and infrastructure, owning the roadmap and driving innovation independently. Candidate Profile Essential: 6+ years of experience working in Machine Learning or ML adjacent engineering roles. Strong foundation in core AI and ML concepts (e.g. neural networks, training vs inference, numerical precision, performance trade offs). Hands on experience with one or more major ML frameworks such as PyTorch, TensorFlow, JAX, or similar. Strong proficiency in Python for ML workflows, experimentation, and automation. Experience designing, running, and analysing ML benchmarks or experiments. Experience working in Linux environments. Strong analytical and debugging skills, with the ability to reason about model behaviour and system performance. Bachelor/Master's/PhD or equivalent experience in Computer Science, Maths, Machine Learning, Data Science, or related field. Desirable: Experience with MLOps pipelines, model deployment, or production ML systems. Familiarity with performance analysis, profiling tools, or numerical accuracy validation. Exposure to distributed training or inference systems. 
Experience with hardware accelerated ML, compilers, or system level performance considerations. Familiarity with CI/CD systems used for ML workflows. Experience contributing to open source ML frameworks or tooling. Benefits In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar! We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments. Applicants for this position must hold the right to work in the UK. Unfortunately at this time, we are unable to provide visa sponsorship or support for visa applications.
24/05/2026
Full time
Please note this posting is to advertise potential job opportunities. This exact role may not be open today but could open in the near future. When you apply, a Cisco representative may contact you directly if a relevant position opens. Start Date: as soon as possible Location: Feltham, United Kingdom (Hybrid work approach, working from the Feltham office 1-2 days per week.) Meet the Team We at Cisco are looking for a Site Reliability Engineer, with a passion for technology and solid academic foundations in analytical disciplines. Cisco is a strong advocate of using its own enterprise networking, datacenter, collaboration products, and solutions internally; Cisco IT deploys all these technologies - the result being that Cisco IT accrues a great deal of experience in how to design, deploy, operate, and automate these solutions within a large global enterprise. In the Network Engineering Core Team, we are responsible for connecting our offices to our enterprise network across Cisco. We maintain and support the WAN and Core infrastructure, alongside several hardware and software remote access solutions with an Agile, SRE mindset and have lots of fun along the way. Your Impact As a Site Reliability Engineer, daily activities of the role involve working within a large global team of DevOps Network Engineers, Product Owners, and Product Managers to enable the efficient running of all Cisco offices and remote/hybrid working solutions. You'll also have the opportunity to work on a variety of different projects across our technology portfolio. Activities include but are not limited to: Use creative problem-solving to provide Cisco with advanced, essential business capabilities. Developing technical prototype environments and concepts. Supporting existing platforms and network solutions, including but not limited to WAN, LAN, and Core. Working alone or as part of a team, depending on the project, using the SAFe methodology. 
Identify and work on areas that can be automated to streamline processes within the team. There will be some on call work required as you become familiar with our network but this is limited to 1 week in every 6 which will cover the working day during the week and include the weekend. Minimum Qualifications We are looking for someone that can demonstrate the following: Including but not limited to a recent/upcoming graduate of a Bachelor's degree (or higher) or a certification program (e.g. a Bootcamp or Apprenticeship). Equivalent experience accepted in lieu of these. Demonstrate a keen interest in some of the following technologies: Networking (Routing, Switching, and WAN/SDWAN) Automation / Programming - i.e. Python, Ansible, REST, APIs are advantageous but not essential Virtualisation Technologies - VMware, OpenStack, Docker are advantageous Able to legally live and work in the country for which you're applying Preferred Qualifications Strong analytical mind-set Familiarity with design concepts Why Cisco? At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint. Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere. We are Cisco, and our power starts with you.
24/05/2026
Full time
The opportunity Scale and tune high-throughput PostgreSQL clusters that support continued global expansion and new product initiatives. Own PostgreSQL reliability fundamentals: WAL behavior, checkpoints, autovacuum, query planning, locking, replication lag, backup/restore, and capacity planning. Help build and operate vector data capabilities, including pgvector and dedicated VectorDB platforms where appropriate. Define production patterns for embeddings, approximate nearest neighbor indexes, metadata filtering, recall/latency tradeoffs, reindexing, data freshness, and drift management. Strengthen high-availability, disaster-recovery, PITR, and backup approaches through sound design and regularly validated procedures. Reduce manual operational work by building automation, improving process consistency, and enabling safe, low-friction database workflows. Improve observability and alert quality by championing meaningful metrics, reducing noise, and ensuring operational clarity across PostgreSQL and vector workloads. Enhance database security through robust access controls, disciplined patching and upgrade practices, encryption, auditability, and secure operational patterns. Contribute to modern platform initiatives, including containerized environments, infrastructure-as-code workflows, GitOps, reproducible deployments, and self-service database operations. Partner with service, AI, and platform teams to drive better performance patterns, operational readiness, data hygiene, and safe use of vector search across the engineering organization. Participate in on-call rotations with a long-term focus on making on-call predictable, well instrumented, and shaped by preventative engineering. Skills you should HODL 5+ years operating PostgreSQL in high-volume production environments, including performance tuning, replication, backup/restore, upgrades, and incident troubleshooting. 
Strong understanding of PostgreSQL internals and operations: MVCC, transaction isolation, locks, WAL, checkpoints, autovacuum, bloat, statistics, query planner behavior, partitioning, and index strategy. Hands on experience with high availability and read scaling: streaming replication, replication slots, failover, lag management, PITR, backups, and disaster recovery drills. Experience with connection pooling and traffic management for PostgreSQL, especially PgBouncer, HAProxy, Kubernetes service routing, or comparable patterns. Practical VectorDB or vector search experience, such as pgvector, Qdrant, Milvus, Weaviate, OpenSearch vector search, or similar systems. Ability to reason about vector index types and operational tradeoffs, including HNSW/IVFFlat style indexes, recall, latency, memory, ingestion throughput, metadata filters, rebuilds, and versioned embeddings. Practical experience with CI/CD, GitOps, and Infrastructure as Code workflows. Terraform experience is ideal. Solid cloud, Linux, storage, and networking fundamentals. Experience with containers and orchestration platforms, including building container images and managing Kubernetes workloads at scale. Strong security instincts around access control, credential lifecycle, encryption, auditability, upgrade processes, and safe operational workflows. Observability expertise: monitoring, alerting hygiene, SLOs, dashboards, query level visibility, and readiness for incident response. Strong communication and collaboration skills with the ability to partner with stakeholders, negotiate long term plans, write formal documentation, and tie success to specific metrics and KPIs. Nice to haves Experience with SRE methodologies such as error budgets, operational reviews, reliability programs, and failure exercises. Strong scripting or programming ability, preferably Python, Go, or Rust, used to build automation and internal tools. 
Hands on experience with GitOps tooling such as ArgoCD, GitHub Actions, or GitLab CI. Exposure to multi region, multi datacenter, or active/passive disaster recovery designs. Experience supporting AI, retrieval, personalization, fraud, search, or recommendation systems that depend on embeddings or hybrid search. Interest in cryptocurrency or decentralized systems. Please note, applicants are permitted to redact or remove information on their resume that identifies age, date of birth, or dates of attendance at or graduation from an educational institution. We consider qualified applicants with criminal histories for employment on our team, assessing candidates in a manner consistent with the requirements of the San Francisco Fair Chance Ordinance. As an equal opportunity employer, we don't tolerate discrimination or harassment of any kind. Whether that's based on race, ethnicity, age, gender identity, citizenship, religion, sexual orientation, disability, pregnancy, veteran status or any other protected characteristic as outlined by federal, state or local laws.
23/05/2026
Full time
The opportunity Scale and tune high-throughput PostgreSQL clusters that support continued global expansion and new product initiatives. Own PostgreSQL reliability fundamentals: WAL behavior, checkpoints, autovacuum, query planning, locking, replication lag, backup/restore, and capacity planning. Help build and operate vector data capabilities, including pgvector and dedicated VectorDB platforms where appropriate. Define production patterns for embeddings, approximate nearest neighbor indexes, metadata filtering, recall/latency tradeoffs, reindexing, data freshness, and drift management. Strengthen high-availability, disaster-recovery, PITR, and backup approaches through sound design and regularly validated procedures. Reduce manual operational work by building automation, improving process consistency, and enabling safe, low-friction database workflows. Improve observability and alert quality by championing meaningful metrics, reducing noise, and ensuring operational clarity across PostgreSQL and vector workloads. Enhance database security through robust access controls, disciplined patching and upgrade practices, encryption, auditability, and secure operational patterns. Contribute to modern platform initiatives, including containerized environments, infrastructure-as-code workflows, GitOps, reproducible deployments, and self-service database operations. Partner with service, AI, and platform teams to drive better performance patterns, operational readiness, data hygiene, and safe use of vector search across the engineering organization. Participate in on-call rotations with a long-term focus on making on-call predictable, well instrumented, and shaped by preventative engineering. Skills you should HODL 5+ years operating PostgreSQL in high-volume production environments, including performance tuning, replication, backup/restore, upgrades, and incident troubleshooting. 
Strong understanding of PostgreSQL internals and operations: MVCC, transaction isolation, locks, WAL, checkpoints, autovacuum, bloat, statistics, query planner behavior, partitioning, and index strategy. Hands on experience with high availability and read scaling: streaming replication, replication slots, failover, lag management, PITR, backups, and disaster recovery drills. Experience with connection pooling and traffic management for PostgreSQL, especially PgBouncer, HAProxy, Kubernetes service routing, or comparable patterns. Practical VectorDB or vector search experience, such as pgvector, Qdrant, Milvus, Weaviate, OpenSearch vector search, or similar systems. Ability to reason about vector index types and operational tradeoffs, including HNSW/IVFFlat style indexes, recall, latency, memory, ingestion throughput, metadata filters, rebuilds, and versioned embeddings. Practical experience with CI/CD, GitOps, and Infrastructure as Code workflows. Terraform experience is ideal. Solid cloud, Linux, storage, and networking fundamentals. Experience with containers and orchestration platforms, including building container images and managing Kubernetes workloads at scale. Strong security instincts around access control, credential lifecycle, encryption, auditability, upgrade processes, and safe operational workflows. Observability expertise: monitoring, alerting hygiene, SLOs, dashboards, query level visibility, and readiness for incident response. Strong communication and collaboration skills with the ability to partner with stakeholders, negotiate long term plans, write formal documentation, and tie success to specific metrics and KPIs. Nice to haves Experience with SRE methodologies such as error budgets, operational reviews, reliability programs, and failure exercises. Strong scripting or programming ability, preferably Python, Go, or Rust, used to build automation and internal tools. 
Hands-on experience with GitOps tooling such as ArgoCD, GitHub Actions, or GitLab CI. Exposure to multi-region, multi-datacenter, or active/passive disaster recovery designs. Experience supporting AI, retrieval, personalization, fraud, search, or recommendation systems that depend on embeddings or hybrid search. Interest in cryptocurrency or decentralized systems. Please note, applicants are permitted to redact or remove information on their resume that identifies age, date of birth, or dates of attendance at or graduation from an educational institution. We consider qualified applicants with criminal histories for employment on our team, assessing candidates in a manner consistent with the requirements of the San Francisco Fair Chance Ordinance. As an equal opportunity employer, we don't tolerate discrimination or harassment of any kind, whether based on race, ethnicity, age, gender identity, citizenship, religion, sexual orientation, disability, pregnancy, veteran status or any other protected characteristic as outlined by federal, state or local laws.
Integral to the Nutanix software stack is the Acropolis Hypervisor (AHV). AHV is an enterprise-grade hypervisor tailor-made for Nutanix's software solution and has reliability, performance and scalability characteristics proven to be capable of meeting the demands of the toughest enterprise and private cloud workloads. We are seeking to grow our Cambridge-based engineering team with talented software engineers who will help us develop AHV and shape the future of the software-defined datacenter. About the Team The team is led by industry experts with 20+ years of experience, who are leading AHV development globally. We have a forward-thinking approach to our work that has retained many of the best elements of start-up mentality whilst also recognising the need for mature delivery and execution. We work with open-source technologies including Linux KVM, QEMU, Open vSwitch and Libvirt. Your Role Design, develop, and maintain AHV features, often interacting with Open Source communities. Constantly push towards making AHV highly reliable, performant and secure. Be passionate about datacenter management problems and strive to come up with innovative solutions. Leading the development of features from concept to market, often interacting with cross-functional areas such as product management, sales, and support. Mentoring other software engineers. What You Will Bring Bachelor's, Master's, and/or PhD degree in Computer Science (preferred) or another technical discipline/equivalent experience. 2+ years of industry experience or equivalent research experience. Rock solid coding skills in C and Python, ideally for enterprise-quality software. Coding skills in Rust and GoLang are desired, but not necessary. Extensive knowledge of UNIX/Linux. Familiarity with OS internals and concepts of distributed systems. Familiarity with x86 architecture, virtualisation, and/or storage and network management. Familiarity with KVM and QEMU is preferred. 
Experience in interaction with open source communities is preferred. Work Arrangement Hybrid: This role operates in a hybrid capacity, blending the benefits of remote work with the advantages of in-person collaboration. In locations where our workplace policy applies (i.e. Cambridge, San Jose, Durham, Mexico City, Bangalore, Pune, Hoofddorp, Belgrade, Barcelona, Singapore, Sydney and Tokyo), employees are expected to work onsite a minimum of 3 days per week to foster collaboration, team alignment, and access to in-office resources. Workplace type may vary based on location and team requirements. Please speak with your recruiter for details. Additional team-specific guidance and norms will be provided by your manager. Nutanix is an equal opportunity employer. Nutanix is an Equal Employment Opportunity and (in the U.S.) an Affirmative Action employer. Qualified applicants are considered for employment opportunities without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, marital status, protected veteran status, disability status or any other category protected by applicable law. We hire and promote individuals solely on the basis of qualifications for the job to be filled. We strive to foster an inclusive working environment that enables all our Nutants to be themselves and to do great work in a safe and welcoming environment, free of unlawful discrimination, intimidation or harassment. As part of this commitment, we will ensure that persons with disabilities are provided reasonable accommodations. If you need a reasonable accommodation, please let us know by contacting .
20/05/2026
Full time
Network SRE, London, United Kingdom
Network Site Reliability Engineer role focused on developing and maintaining network and datacenter infrastructure.
Responsibilities: Develop network and datacenter infrastructure with consistent and straightforward processes. Collaborate across teams exploring domains such as hardware, operating systems, Python/Go development, AWS, and storage.
Tech Stack: Amazon AWS, operating systems, Python, hardware, Go, SRE
Compensation: Competitive
Role Type: Full time
Benefits & Perks: Not provided
20/05/2026
Full time
Golang Works in London is seeking a Network Site Reliability Engineer to develop and maintain network and datacenter infrastructure. The role emphasizes collaboration across teams working with hardware, operating systems, and development in Python and Go. This is a full-time position offering competitive compensation. The tech stack includes Amazon AWS and systems management, making it ideal for engineers passionate about SRE.
20/05/2026
Full time
A Career with Point72's Technology Team As Point72 reimagines the future of investing, our Technology group is constantly improving our company's IT infrastructure, positioning us at the forefront of a rapidly evolving technology landscape. We're a team of experts experimenting, discovering new ways to harness the power of open source solutions, and embracing enterprise agile methodology. We encourage professional development to ensure you bring innovative ideas to our products while satisfying your own intellectual curiosity. Our Technology Infrastructure Team engineers and operates the foundational technology platforms that power our firm's applications and businesses. Our disciplines span a broad array of technologies from datacenter infrastructure to large scale cloud services, with the shared goal of providing the most reliable, performant, modern technology platforms to improve time-to-market for our business. We also deliver end-user technology solutions to support the evolving collaboration, and productivity needs of our global teams. Our team focuses on innovation and challenging the current state of our infrastructure technology in a fast-paced, dynamic, and collaborative working environment. 
What you'll do Actively monitor the Linux team's incident/request queue, set prioritization, and process tickets in a timely manner Perform production changes while adhering to change management guidelines Install, configure, and troubleshoot Linux servers and related hardware/software components Perform in-depth analysis for post impact resolution, RCA, remedial actions, and problem management Manage and troubleshoot AD integrated authentication on Linux servers Manage and troubleshoot existing Ansible playbooks used for configuration management across the Linux environment Collaborate cross-functionally to monitor system performance, analyze system logs, and proactively address issues Create and maintain system documentation including configuration guides, standard operating procedures, and troubleshooting steps What's required 3-5+ years of Linux administration experience in a production environment Experience with infrastructure as code concepts (Terraform, Ansible, Puppet or similar) Experience in authentication, privilege management, and integration with Active Directory Familiarity with Ansible for automation and configuration management Hands-on experience with Red Hat Enterprise Linux (RHEL) administration, including installation, configuration, performance tuning, and troubleshooting Experience with creating and maintaining remote mounts (CIFS/Samba/NFS) for both client and server configurations Familiarity with local disk management as well as network-based storage concepts and related configurations (HBA, PowerPath, LVM, etc.) Experience in scripting languages like Bash or Python for automation and task scripting Understanding of networking concepts, protocols, and services (TCP/IP, DNS, DHCP, Kerberos, etc.) Excellent communication and interpersonal skills Commitment to the highest ethical standards We take care of our people We invest in our people, their careers, their health, and their well-being. 
When you work here, we provide: Health care benefits Generous parental and family leave policies Mental and physical wellness programs Volunteer opportunities Support for employee-led affinity groups representing women, minorities and the LGBT+ community Tuition assistance About Point72 Point72 is a leading global alternative investment firm led by Steven A. Cohen. Building on more than 30 years of investing experience, Point72 seeks to deliver superior returns for its investors through fundamental and systematic investing strategies across asset classes and geographies. We aim to attract and retain the industry's brightest talent by cultivating an investor-led culture and committing to our people's long-term growth. For more information, visit
19/05/2026
Full time
Leeds, England, United Kingdom - Full Time Location: Hybrid - Remote / Leeds Position Title: Implementation Engineer Job Type: Full-Time Hours: Will be working 2pm-10pm to cover EST timezone About Us Assured Data Protection is a global leader in data backup and disaster recovery managed services, specialising in safeguarding against data loss and downtime in the event of a disaster, cyber, or ransomware attack. Our fully managed services include immutable backup, disaster recovery, and cyber resiliency to protect data on-premises and in the cloud, with 24/7/365 expert support. We offer a flexible, consumption-based model to grow with your business, making data protection cost effective and scalable. Our purpose built software provides industry leading monitoring and reporting capabilities to provide actionable insights into your data protection strategy. Our global datacenters ensure data sovereignty, meeting your organisation's compliance requirements. A dedicated team is always available to recover your data and minimise disruption in the event of a disaster. Job Summary The Implementation Engineer will be responsible for the onboarding of new customers. Under direction from the Implementation Manager, the candidate will work with newly signed customers through the implementations process from initial kick off through steady state transition to support. Customers have a wide range of technologies that require a broad spectrum of skills to interpret how best to protect their environments. Candidates will need to be able to learn quickly and work independently with customer IT administrators, and key stakeholders to answer questions or direct them to resources for answers. 
Key Responsibilities Conduct kickoff calls to begin the Implementation process Gather customer information for the purposes of deploying Assured Data Protection Solutions Build and ship appliances to customer locations Configure and implement on-prem appliances Troubleshoot any installation-based issues with customer and OEM Hold steady state transition calls Create and maintain customer documentation and knowledge base articles Identify areas where the implementation processes can be improved for DR or Backup Key Skills / Requirements Excellent time management / organisational skills; being able to work well in critical or high pressure situations Experience with enterprise backup and recovery systems such as Rubrik, Zerto, Veritas NetBackup, Commvault, or Veeam is a major advantage Experience with networking configurations (IP addressing, DNS, VPN etc.) Experience with virtualisation technologies such as VMware, Nutanix, or Hyper-V. Experience with Cloud Technologies Excellent communication skills and the ability to interact with end users at all levels What We Offer Hybrid working options for flexibility Regular team building and off site company events. A dynamic, inclusive, and collaborative work environment At Assured Data Protection we value diversity and inclusivity. We offer perks such as flex holidays and flexible working practices to allow our employees to show up as their whole selves. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. If you have a disability or special need that requires accommodation, please do not hesitate to let us know. You must have the legal right to work in the UK at the time of application, as we are unable to offer visa sponsorship for this role.
19/05/2026
Full time