Company Overview

Radiant is redefining how AI infrastructure is built. We design and operate AI-native cloud platforms engineered for sovereignty, performance, and scale. Our infrastructure powers GPU-native workloads, multi-tenant control planes, and high-performance AI systems designed for the most demanding environments. We are not building a generic cloud. We are building purpose-built AI infrastructure - from powered land, to compute, to software. As we scale our platform and expand our engineering organisation, we are looking for leaders who can build strong teams, uphold high standards, and deliver reliably at pace.

Job Description

As a Senior Network Engineer, you'll design, deploy, and operate the high-performance, low-latency network fabric that underpins our GPU infrastructure. You'll play a critical role in shaping the architecture, automation, and operational standards of our network, ensuring it meets the rigorous demands of modern AI training and inference workloads. The ideal candidate is a strong advocate of engineering excellence and operational best practices. You're customer-focused, collaborative, and equally comfortable discussing technical issues with enterprise customers, internal teams, and cross-functional stakeholders. You'll contribute to everything from design and implementation through to L3/L4 operational support.

Role Responsibilities

Network Architecture & Collaboration: Design scalable leaf-spine networks using Mellanox switches with Cumulus Linux, integrated with Juniper PTX routers and SRX firewalls. Lead collaboration with third-party and internal providers to deploy resilient site-to-site and internet connectivity, optimise topology, and resolve complex faults.
Operations & Maintenance: Perform configuration, upgrades, and deep troubleshooting of Juniper and NVIDIA/Cumulus devices in production environments.
Software-Defined Networking (SDN): Implement programmable network designs using controller-based platforms to support automation and scalability.
Kubernetes Networking: Support networking within container platforms using CNIs such as Calico, Cilium, or Kube-OVN. Understand microservices traffic patterns and service mesh integrations.
Automation & Scripting: Develop and maintain Ansible playbooks and Python automation for configuration management, provisioning, and compliance, including working against the APIs of network equipment.
Monitoring & Telemetry: Implement observability tools using SNMP, sFlow, and gRPC to detect and address network bottlenecks at scale.
Incident Management: Lead L3/L4 network incident response, escalation management, and root cause analysis in high-pressure, 24x7 production environments.
Stakeholder & Customer Engagement: Act as a trusted technical expert, engaging directly with internal stakeholders and enterprise customers. Present solutions and troubleshoot effectively across all levels of the organisation.
Engineering Excellence: Promote and uphold best practices in configuration management, change control, documentation, and continuous improvement.

Requirements

HPC/AI Networking: 3-5 years' experience supporting high-throughput, low-latency infrastructure for GPU-based or HPC clusters. 8-12+ years in networking, often with at least 5 years in globally distributed environments. Experience with multi-region or multi-continent backbone networks, transit/peering, and high-availability design at Internet scale. Hands-on with large-scale routing (BGP/MPLS), automation at scale, and often SDN or custom orchestration frameworks.
Routing & Switching: Strong understanding of BGP, EVPN, VXLAN, and Data Centre Interconnects (DCI).
Hardware Platforms: Mellanox Ethernet switches with Cumulus Linux; Juniper PTX core routers and SRX firewalls.
Cloud-Native Networking: Proficiency with Kubernetes networking and container-based infrastructure, including CNIs (e.g., Calico, Cilium).
Linux Proficiency: Confident with CLI environments, scripting, and diagnostics.
Automation & Tooling: Experience with Ansible and Python. Familiarity with Terraform or similar infrastructure-as-code tools is a plus.
Customer-Facing Skills: Comfortable explaining complex networking concepts to customers and internal stakeholders, from engineers to executives.
Operational Support: Hands-on experience in L3/L4 support within production environments, driving root cause analysis and preventative measures.
Engineering Discipline: Strong proponent of code quality, peer reviews, change control, and infrastructure versioning.

Preferred Qualifications

Industry certifications such as JNCIE, CCIE, or equivalent.
Familiarity with network security frameworks and best practices.
Experience with hybrid cloud and cloud connectivity solutions (e.g., AWS/Azure Direct Connect).
Exposure to observability platforms and time-series databases (e.g., Grafana, Prometheus, InfluxDB).

Qualities we look for

Set the standard: Every single day, you spot opportunities to constructively shake things up.
Inspire the change: There's no blueprint for the future. You'll embrace challenges and change.
You're real and you're true to yourself: We cherish and celebrate diversity, so you'll feel right at home, whoever you are. Whoever you're talking to, you treat everyone the same.

Benefits

30 days of annual leave: we value your peace of mind. With 30 days off (excluding public holidays) and access to mental health resources, we make sure you're as strong mentally as you are professionally.
A culture that emphasises results over hierarchy, process & ego: we place great emphasis on the quality, ingenuity and creativity of work.
Open communication, regular feedback: we value smooth collaboration, direct and actionable feedback, and believe that leading with empathy and a growth mindset makes us better together.
Learning Time: we all have dedicated learning time to focus on new skills, projects or interests that lie outside our day-to-day jobs.
Health & Wellbeing: we want everyone to feel healthy and happy, so we offer private medical insurance via Bupa.
Cycle to Work Scheme: we're committed to building a sustainable business, so we encourage cycling to work.
Gympass subscription to a variety of gyms and wellbeing apps.
Participation in the company shares program.
Enhanced parental pay & leave.

Diversity, Equality, Inclusion and Belonging

We are an equal opportunity employer and we strive to reduce unconscious bias throughout our hiring process. All applicants will be considered for employment without attention to ethnicity, religion, sexual orientation, gender identity, family or parental status, national origin, veteran status, neurodiversity status or disability status. To ensure our recruitment processes provide an equal opportunity for all applicants to succeed, we encourage you to let us know if there are any adjustments that we can make.
13/05/2026
Full time
Radiant in the United Kingdom is seeking an experienced Infrastructure Site Reliability Engineer to enhance and run their AI infrastructure stack. Responsibilities include operating resilient, scalable systems for AI workloads, optimizing Linux configurations, managing bare-metal infrastructure, and mentoring junior engineers. The ideal candidate should possess strong Linux administration skills and a proactive approach to problem-solving. Benefits include 30 days annual leave, a strong focus on mental health, and a collaborative workplace culture.
13/05/2026
Full time
Radiant is seeking a Senior Network Engineer based in the United Kingdom. This role involves designing and maintaining high-performance networks for AI applications, and it comes with 30 days of annual leave and private medical insurance. Candidates should have extensive networking experience and strong skills in HPC networking. You will collaborate closely with internal teams and enterprise customers, ensuring the reliability of our AI infrastructure. Join a culture that values results, open communication, and inclusion.
13/05/2026
Full time
About Radiant

Radiant is redefining how AI infrastructure is built. We design and operate AI-native cloud platforms engineered for sovereignty, performance, and scale. Our infrastructure powers GPU-native workloads, multi-tenant control planes, and high-performance AI systems designed for the most demanding environments. We are not building a generic cloud. We are building purpose-built AI infrastructure - from powered land, to compute, to software. As we scale our platform and expand our engineering organisation, we are looking for leaders who can build strong teams, uphold high standards, and deliver reliably at pace.

Job Summary

We're looking for an experienced Infrastructure Site Reliability Engineer to run and evolve our infrastructure stack. You'll contribute across bare-metal, virtualization, and orchestration layers, keeping things stable and secure 24/7, 365 days a year - all while mentoring teammates, improving process and automation, and helping translate deep technical concepts for a wide range of collaborators and customers.

What You'll Do

Deploy and operate resilient, scalable infrastructure supporting AI/HPC workloads
Optimize Linux system configuration, BIOS/firmware, kernel, and disk subsystem for performance
Configure, monitor and manage bare-metal infrastructure using IPMI, Redfish, etc.
Build and maintain automation scripts and infrastructure as code to support the platform lifecycle, simplify troubleshooting during incident resolution, and provide tooling for our support organisation
Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement
Maintain and enhance Radiant's observability stack: Prometheus, Grafana, and custom monitoring integrations
Operate and support services in 24x7 production environments, including on-call rotation
Contribute to incident postmortems and root cause analysis, document learnings, and automate remediations
Mentor junior engineers and act as an operational requirements consultant to other departments
Communicate technical decisions clearly to non-technical stakeholders and customers
Uphold a culture of: do, document, automate
Willingness to cross-train with Platform Engineering/Platform SRE to fully support both our infrastructure and platform stacks
Willingness to cross-train with HPC Engineering, supported by NVIDIA, to enhance our HPC supportability offering

What you bring

5+ years' proven experience in globally scaled, performance-intensive environments operating to a 24/7 support model
Expert-level Linux administration, especially Ubuntu distributions
Proficiency in system tuning, disk I/O optimization, and hardware-level performance tweaks
Familiarity with out-of-band management tools (IPMI, Redfish, PXE, etc.)
Strong networking fundamentals: TCP/IP, DNS, DHCP, VLANs, routing, switching
Strong experience with infrastructure scripting and automation (Bash, Python, Ansible)
Deep understanding of observability principles and tools (Prometheus, Grafana)
Hands-on experience operating orchestration platforms (Kubernetes, MAAS, Tinkerbell)
Strong grasp of ITSM and service operation best practices
Excellent communication and mentorship skills
Comfortable interfacing with internal stakeholders and external customers
Bonus: Knowledge of HPC workloads and GPU-based infrastructure
Bonus: Experience with InfiniBand networks and HPC performance tuning

Nice to have

Bachelor's or Master's degree in Computer Science, Engineering or a related field, or equivalent experience
LPIC certifications
ITIL Foundation level qualification or equivalent experience

How you work

You approach problems with a systems mindset - balancing practical execution with long-term scalability.
You elevate the team, setting high standards for technical quality and engineering excellence.
You hold yourself and others accountable - giving direct feedback and expecting the same.
You take initiative, owning challenges end-to-end and proactively driving solutions.
You invest in others, mentoring to build both capability and confidence.
You communicate clearly - translating complexity into clarity across engineering and business audiences.

Why should you join us?

What sets us apart is our blend of modern technology, competitive benefits, and an open, welcoming work culture that enables our people to thrive. Here are just some of the great things you can expect from us:

30 days of annual leave: we value your peace of mind. With 30 days off (excluding public holidays) and access to mental health resources, we make sure you're as strong mentally as you are professionally.
A culture that emphasises results over hierarchy, process & ego: we place great emphasis on the quality, ingenuity and creativity of work.
Open communication, regular feedback: we value smooth collaboration, direct and actionable feedback, and believe that leading with empathy and a growth mindset makes us better together.
Learning Time: we all have dedicated learning time to focus on new skills, projects or interests that lie outside our day-to-day jobs.
Health & Wellbeing: we want everyone to feel healthy and happy, so we offer private medical insurance via Bupa.
Cycle to Work Scheme: we're committed to building a sustainable business, so we encourage cycling to work.
Gympass subscription to a variety of gyms and wellbeing apps
Participation in the company shares program
Enhanced parental pay & leave

Diversity, Equality, Inclusion and Belonging

We are an equal opportunity employer and we strive to reduce unconscious bias throughout our hiring process. All applicants will be considered for employment without attention to ethnicity, religion, sexual orientation, gender identity, family or parental status, national origin, veteran status, neurodiversity status or disability status. To ensure our recruitment processes provide an equal opportunity for all applicants to succeed, we encourage you to let us know if there are any adjustments that we can make.
13/05/2026
Full time
About Radiant

Radiant is redefining how AI infrastructure is built. We design and operate AI-native cloud platforms engineered for sovereignty, performance, and scale. Our infrastructure powers GPU-native workloads, multi-tenant control planes, and high-performance AI systems designed for the most demanding environments. We are not building a generic cloud. We are building purpose-built AI infrastructure - from powered land, to compute, to software. As we scale our platform and expand our engineering organisation, we are looking for leaders who can build strong teams, uphold high standards, and deliver reliably at pace.

Role Responsibilities

Deploy and manage Kubernetes clusters at scale to support AI-centric workloads, both on our bare-metal clusters and on trusted partner infrastructure
Develop Kubernetes manifests and operators: facilitate application deployments and maintain Kubernetes-native services for networking, storage, security, identity and infrastructure management
Optimize Linux system configuration, including kernel, drivers, filesystem and services, to support workloads running via our orchestration layer
Build and maintain automation scripts and infrastructure as code to support the platform lifecycle, simplify troubleshooting during incident resolution, and provide tooling for our support organisation
Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement
Maintain and enhance Radiant's observability stack: Prometheus, Grafana, and custom monitoring integrations
Operate and support services in 24x7 production environments, including on-call rotation
Contribute to incident postmortems and root cause analysis, document learnings, and automate remediations
Mentor junior engineers and act as an operational requirements consultant to other departments
Communicate technical decisions clearly to non-technical stakeholders and customers
Uphold a culture of: do, document, automate
Willingness to cross-train with Platform Engineering/Platform SRE to fully support both our infrastructure and platform stacks
Willingness to cross-train with HPC Engineering, supported by NVIDIA, to enhance our HPC supportability offering

Requirements

5+ years' proven experience in globally scaled, performance-intensive environments operating to a 24/7 support model, in an SRE or equivalent role
3+ years' experience running, deploying and optimising orchestration platforms, with a strong emphasis on Kubernetes
Expert-level Linux administration, especially Ubuntu distributions
Proficiency in system tuning, disk I/O optimization, and hardware-level performance tweaks
Strong networking fundamentals: TCP/IP, DNS, DHCP, VLANs, routing, switching
Strong experience with API interrogation
Strong experience with infrastructure scripting and automation (Bash, Python, Ansible)
Deep understanding of observability principles and tools (Prometheus, Grafana preferred)
Strong grasp of ITSM and service operation best practices
Excellent communication and mentorship skills
Comfortable interfacing with internal stakeholders and external customers
Bonus: Knowledge of running AI workloads via orchestration platforms

Bonus Requirements

Bachelor's or Master's degree in Computer Science, Engineering or a related field, or equivalent experience
LPIC certifications
ITIL Foundation level qualification or equivalent experience
Certified Kubernetes Administrator (CKA)

Qualities we look for

You approach problems with a systems mindset - balancing practical execution with long-term scalability.
You elevate the team, setting high standards for technical quality and engineering excellence.
You hold yourself and others accountable - giving direct feedback and expecting the same.
You take initiative, owning challenges end-to-end and proactively driving solutions.
You invest in others, mentoring to build both capability and confidence.

Why should you join us?

What sets us apart is our blend of modern technology, competitive benefits, and an open, welcoming work culture that enables our people to thrive.

30 days of annual leave: we value your peace of mind. With 30 days off (excluding public holidays) and access to mental health resources, we make sure you're as strong mentally as you are professionally.
A culture that emphasises results over hierarchy, process & ego: we place great emphasis on the quality, ingenuity and creativity of work.
Open communication, regular feedback: we value smooth collaboration, direct and actionable feedback, and believe that leading with empathy and a growth mindset makes us better together.
Learning Time: we all have dedicated learning time to focus on new skills, projects or interests that lie outside our day-to-day jobs.
Health & Wellbeing: we want everyone to feel healthy and happy, so we offer private medical insurance via Bupa.
Cycle to Work Scheme: we're committed to building a sustainable business, so we encourage cycling to work.
Gympass subscription to a variety of gyms and wellbeing apps
Participation in the company shares program
Enhanced parental pay & leave

Diversity, Equality, Inclusion and Belonging

We are an equal opportunity employer and we strive to reduce unconscious bias throughout our hiring process. All applicants will be considered for employment without attention to ethnicity, religion, sexual orientation, gender identity, family or parental status, national origin, veteran status, neurodiversity status or disability status. To ensure our recruitment processes provide an equal opportunity for all applicants to succeed, we encourage you to let us know if there are any adjustments that we can make.
13/05/2026
Full time
Radiant is seeking a Senior Software Engineer in Greater London to design high-performance, cloud-native systems and scalable APIs. You will work closely with cross-functional teams, mentor junior engineers, and contribute to tooling around AI workloads. The position offers a salary range of £50,000-95,000 and a culture of learning, feedback, and wellbeing, along with benefits including 30 days of annual leave, private medical insurance, and participation in the company shares program.
13/05/2026
Full time
Job Description
We're looking for a Senior Software Engineer with strong experience in Golang and Kubernetes to join our team. In this role, you'll design and build high-performance, cloud-native back-end systems, scalable APIs, and infrastructure to support AI workloads, including LLMs and GPU-based services. You'll collaborate closely with cross-functional teams - including platform, infrastructure, and machine learning - to design, develop, and deliver high-quality software solutions. You'll help build tools and services that power model training, inference, and orchestration in production environments, while mentoring junior engineers, upholding engineering best practices, and driving initiatives to improve code quality and system performance.

What You'll Do:
- Design, develop, and maintain robust applications and services in Go
- Build and manage gRPC and RESTful APIs for scalable system integration
- Work with PostgreSQL or similar relational databases for high-performance querying and storage
- Deploy and operate applications in Kubernetes, leveraging Helm charts and the Kubernetes API
- Design and develop Kubernetes Operators to automate custom workload management
- Build and manage containerized services using Docker and industry best practices

What you bring:
- Proven experience developing production systems in Golang
- Proven ability to improve software quality through unit testing, integration testing, code reviews, and adherence to clean code principles
- Deep knowledge of Kubernetes and cloud-native architectures
- Hands-on experience with containers, Helm, and microservice design patterns
- Strong understanding of modern DevOps workflows and CI/CD practices
- Ability to work autonomously with a proactive, solution-oriented mindset
- Experience collaborating effectively with front-end developers and cross-functional teams

Preferred Skills (Nice to Have)
- Experience deploying or serving LLMs or other GPU workloads (e.g., using vLLM, KServe)
- Proficiency in Python, especially AI/ML libraries such as transformers, vLLM, or similar
- Experience tuning and scaling machine learning inference pipelines

Salary Range Information
Based on market data and other factors, the salary range for this position is £50,000-95,000 and will vary depending on the candidate's experience.

Qualities we look for:
- Set the standard: Every single day, you spot opportunities to constructively shake things up.
- Inspire the change: There's no blueprint for the future. You'll embrace challenges and change.
- You're real and you're true to yourself: We cherish and celebrate diversity, so you'll feel right at home whoever you are, and whoever you're talking to, you treat everyone the same.

Why should you join us?
What sets us apart is our blend of modern technology, competitive benefits, and an open, welcoming work culture that enables our people to thrive. Here are just some of the great things you can expect from us:
- 30 days of annual leave: we value your peace of mind. With 30 days off (excluding public holidays) and access to mental health resources, we make sure you're as strong mentally as you are professionally.
- A culture that emphasises results over hierarchy, process & ego: we place great emphasis on the quality, ingenuity and creativity of work.
- Open communication, regular feedback: we value smooth collaboration, direct and actionable feedback, and believe that leading with empathy and a growth mindset makes us better together.
- Learning Time: we all have dedicated learning time to focus on new skills, projects or interests that lie outside of our day-to-day jobs.
- Health & Wellbeing: we want everyone to feel healthy and happy, so we offer private medical insurance via Bupa.
- Cycle to Work Scheme: we're committed to building a sustainable business, so we encourage cycling to work.
- Gympass subscription to a variety of gyms and wellbeing apps
- Participation in the company shares program
- Enhanced parental pay & leave

Diversity, Equality, Inclusion and Belonging
We are an equal opportunity employer and we strive to reduce unconscious bias throughout our hiring process. All applicants will be considered for employment without regard to ethnicity, religion, sexual orientation, gender identity, family or parental status, national origin, veteran status, neurodiversity or disability status. To ensure our recruitment processes provide an equal opportunity for all applicants to succeed, we encourage you to let us know if there are any adjustments that we can make.