Critical Cloud Limited
Cardiff, South Glamorgan
The delivery function at Critical Cloud needs to be built, not inherited. You'd be the first Graduate DM, working directly with the COO to build the processes, documentation, and customer relationships that hold the whole operation together. Real ownership from week one. Not a training programme.

About the Role

Critical Support is our flagship managed service. This role owns the delivery of it: keeping customers informed, SLAs on track, and the operational engine running smoothly. It's the connective tissue between our SRE team and the customers they serve.

We are the world's first "Powered by Datadog" accredited MSP, a Datadog-native cloud managed service provider built for European tech-led SMBs. Our founders have scaled and exited multiple technology businesses. We operate lean, move fast, and take observability seriously.

This is a graduate-entry delivery management role working directly under Andrew Phillips, COO. You'll own the customer-facing side of service delivery: onboarding, reporting, service reviews, escalation management, and the processes that hold everything together. You won't be writing Terraform, but you'll need to understand what our engineers are doing well enough to represent it to customers and flag when something's off.

What You're Delivering

- 24/7 observability across customer infrastructure. Alerts, dashboards, and SLOs configured and maintained by our SRE team.
- Cloud Operations: managed AWS and Azure environments. Incident response, change management, and infrastructure operations on behalf of the customer.
- Structured SEV-based incident response with postmortem discipline, customer communications, and SLA accountability under our IMS framework.

What You'll Do

- Own the end-to-end delivery experience for a portfolio of Critical Support customers; you're the face of the service
- Lead customer onboarding: coordinate across SRE, commercial, and the customer's technical team to get new accounts live and well-instrumented
- Run monthly and quarterly service reviews: preparing reporting packs, presenting SLA/SLO performance, and identifying improvement actions
- Track and manage service delivery against contracted SLAs, escalating to the SRE team and COO when at risk
- Act as the first escalation point for customer concerns: triaging, communicating, and coordinating resolution without technical hand-holding
- Maintain the change management calendar: coordinating planned changes, customer approvals, and CAB participation in line with our ISO 27001 IMS
- Own service delivery documentation (runbooks, onboarding packs, RACI matrices, and meeting records), kept accurate and audit-ready
- Work with the COO on service improvement initiatives: identifying patterns in incidents, customer feedback, and operational metrics to sharpen delivery
- Support the commercial team with renewal and upsell context; you'll know your customers better than anyone
- Contribute to developing and refining Critical Cloud's delivery processes as we scale the customer base

Service Lifecycle Ownership

Highlighted stages are where you have primary ownership.
- Customer Onboarding
- Service Governance
- Reporting & Reviews
- Renewal & Growth

Metrics You'll Own

- SLA Adherence
- Customer Health: CSAT & NPS
- Onboarding: Time to Live (days from contract to monitored)
- Retention: Net Revenue Retention (renewals & expansions)

Requirements

- A degree in any discipline; Business, Management, Engineering, or Computer Science all fit well
- Exceptional written and verbal communication; you'll be in front of customer CTOs and technical teams
- Natural organisational instinct: you track things, follow up without being chased, and hate loose ends
- Comfortable working with data: producing reports, spotting trends, and presenting findings clearly
- Tech-literate enough to understand what our SREs are doing; curiosity matters more than deep knowledge
- Right to work in the UK without sponsorship

Nice to Have

- Placement year or internship in a technical or IT services environment
- Awareness of service management or structured delivery practice
- Experience with project or service management tooling (Jira, Linear, Notion, or similar)
- Familiarity with cloud concepts: AWS, Azure, or basic infrastructure principles
- Exposure to ISO 27001 or similar compliance/governance frameworks
- Any customer-facing work experience, even outside tech

Who Thrives Here

The best delivery managers we've worked with share one trait: they make complexity invisible to customers. When something goes wrong, the customer hears from you before they notice. When something's at risk, you've already escalated internally. You're not a gatekeeper; you're the person who makes sure both sides of the relationship get what they need.

You'll be working directly with Andrew Phillips, COO, someone who has built and scaled a managed service before. You'll get real exposure to how a cloud MSP operates commercially and operationally, not a training programme or a shadow role. Expect to be given genuine ownership fast.

This isn't a purely administrative role.
You'll be expected to develop a real working knowledge of cloud infrastructure, Datadog, and SRE practice over time: not to become an engineer, but to be a credible partner to the customers and technical teams you're working with every day.

- Start: Graduate Delivery Manager
- Year 1-2: Delivery Manager, Critical Support
- Year 2-3: Senior DM or Head of Delivery
- Year 3+: VP Delivery or Customer Success

Base salary DOE. Remote-first. UK-based, async-friendly. Certs funded.

- 25 days holiday + bank holidays, plus a paid day off in your birthday month, taken in the month it falls
- Holiday grows with tenure: +1 day per year after your second work anniversary, up to 28 days total
- Enhanced maternity pay: 26 weeks at your full basic salary
- Enhanced paternity pay: 2 weeks at your full basic salary
- Datadog awareness, AWS, and Azure certifications funded by the company. You need platform literacy to do this role credibly, and the company pays for it. Contractual, not discretionary.
- Flexible working requests from your first day of employment: statutory right, supported in full
- Company-provided laptop and peripherals, set up before you start

How We Work

Customer First
Every decision about how you run a service review or handle an escalation should start with: what does this look like from the customer's side? If the customer doesn't know we're working on their problem, we're not communicating enough.

Own the Problem
When a customer raises an issue, you don't pass it to the SRE team and wait. You track it, update the customer proactively, and elevate if it's not moving. Problems don't get passed along; they get owned until they're resolved.

Earn Trust by Delivering
Every SLA report and service review is a moment where the customer decides whether to trust us with more. Consistency builds that trust. Deliver on the basics, every time, and the relationship compounds.

Move with Urgency
A slow response to a customer question creates anxiety that compounds.
Even "we're looking at it" is better than silence. Speed of communication is itself a service quality metric, and it's one you own directly.

Join the pipeline

We're not actively hiring right now, but we keep applications on file. The cover letter matters most: tell us what draws you to delivery management in a cloud MSP context, and how you think about keeping customers and technical teams aligned. No templates.

Pipeline open. Cover letter required. Direct to founders.
Critical Cloud Limited
Cardiff, South Glamorgan
This role is an entry point into the SRE team. You'll work directly alongside senior engineers, learning how we operate production environments, instrument systems with Datadog, and respond to incidents. From the start you'll contribute to real work (monitoring customer environments, writing runbooks, supporting infrastructure changes), with progressively more ownership as your confidence and knowledge grow.

Responsibilities

- Monitor customer AWS and Azure environments using Datadog, learning to triage alerts, separate signal from noise, and escalate with context.
- Support incident response workflows alongside senior engineers, contributing to postmortem documentation and remediation tracking.
- Assist with Datadog onboarding and instrumentation for new customers: agents, integrations, dashboards, monitors, and log pipelines.
- Support infrastructure-as-code work (Terraform) for provisioning and configuration changes across customer accounts, under senior review.
- Write and maintain runbooks and operational documentation: clear, accurate, and usable by anyone on the team at 3 am.
- Participate in proactive reliability reviews (alert tuning, capacity checks, dependency mapping) with guidance from senior engineers.
- Contribute to internal tooling and AI-assisted automation initiatives as part of the wider engineering team.
- Communicate directly with customers on day-to-day operational queries with a professional, calm, and clear style.

Qualifications - Must Have

- A degree in Computer Science, Software Engineering, or a related technical discipline, or equivalent demonstrable self-taught fundamentals.
- Comfort with scripting in Bash, Python, or similar; you've automated something, even if small.
- Understanding of core observability concepts: what metrics, logs, and traces are and what they tell you.
- Awareness of cloud fundamentals; you know what EC2, S3, VPCs, and load balancers do, even without production experience.
- Clear written and verbal communication; you'll be in customer-facing situations from early on.
- Right to work in the UK without sponsorship.

Qualifications - Nice to Have

- Any hands-on Datadog experience: trial, personal project, or university lab.
- Terraform or any infrastructure-as-code exposure.
- Docker or Kubernetes; even containerising a personal project counts.
- A cloud certification (AWS Cloud Practitioner, Azure Fundamentals, or equivalent).
- Experience in a customer-facing environment, even outside tech.
- Any personal projects involving monitoring, automation, or infrastructure.

Benefits

- 25 days holiday + bank holidays, plus a paid day off in your birthday month, taken in the month it falls.
- Holiday grows with tenure: +1 day per year after your second work anniversary, up to 28 days total.
- Enhanced maternity pay: 26 weeks at your full basic salary.
- Enhanced paternity pay: 2 weeks at your full basic salary.
- Datadog, AWS, and Azure certifications paid by the company: contractual, not discretionary. AI tooling certifications also funded; staying current is part of the role.
- Flexible working requests from your first day of employment: statutory right, supported in full.
- Company-provided laptop and peripherals, set up before you start.
- On-call allowance (in addition to base salary): SREs join a shared rota, typically one week in five or six, reducing as the team grows. Paid £500 per on-call week, which works out at roughly £5-6k a year on top of salary, varying with the rota size.

Base salary DOE. Remote-first. UK-based, async-friendly. Certs funded.

Tech Stack

- Datadog: core observability platform
- AWS: primary cloud, multi-account
- Azure: secondary cloud workloads
- Terraform: infrastructure as code
- GitHub Actions: CI/CD pipelines
- Python / Bash: automation & tooling