Swisstech Recruitment

3 job(s) at Swisstech Recruitment

Swisstech Recruitment
12/06/2026
Contractor
Key Responsibilities:

Observability Platform Implementation:
Deliver the implementation of the observability platform based on Grafana Mimir, Loki, Tempo, Grafana Alloy and Grafana Enterprise tooling.
Design and implement highly available observability services across multiple co-location and production sites.
Configure telemetry ingestion pipelines for metrics, logs, and future distributed tracing workloads.
Develop and maintain observability architecture documentation, high-level designs, low-level designs, and operational runbooks.
Define platform standards for telemetry collection, labelling, metadata enrichment, retention policies, and data governance.
Implement multi-tenant observability controls and tenant isolation strategies.
Configure and maintain object-storage-backed telemetry platforms for long-term retention and scalability.

Telemetry Collection & Integration:
Deploy and manage Grafana Alloy collectors across Kubernetes clusters, Linux hosts, network infrastructure, storage platforms, and hardware management systems.
Integrate telemetry from Kubernetes, GPU infrastructure, HPE hardware, storage platforms, network devices, and cloud-native services.
Develop and maintain observability integrations using OpenTelemetry standards and protocols.
Establish onboarding processes for new platforms, applications, and infrastructure services.
Collaborate with application teams to define observability requirements and future tracing adoption strategies.

Alerting & Operational Insights:
Design and implement alerting frameworks using recording rules, Alertmanager, and operational best practices.
Develop operational dashboards and service health views for infrastructure, platform, and application services.
Support integration of observability events with ITSM and incident-management platforms.
Define SLIs, SLOs, alert thresholds, and operational KPIs.
Continuously improve platform observability, incident detection, and root-cause analysis capabilities.

Reliability & Automation:
Implement Infrastructure-as-Code and GitOps practices for observability platform deployment and configuration management.
Develop automation for dashboard provisioning, alert deployment, tenant onboarding, and telemetry configuration.
Design and validate disaster recovery, resilience, and failover capabilities across observability services.
Contribute to platform security, compliance, and operational governance initiatives.
Work with operational teams to ensure observability services remain reliable, scalable, and maintainable.

Required Experience & Skills:
Significant experience implementing and operating enterprise observability or monitoring platforms.
Strong understanding of metrics, logs, traces, OpenTelemetry, and modern observability principles.
Experience with Grafana ecosystem technologies including Grafana, Prometheus, Grafana Mimir, Grafana Loki, Grafana Tempo, and Grafana Alloy.
Experience designing Kubernetes-native solutions and operating distributed platforms at scale.
Knowledge of Linux systems administration and cloud-native infrastructure.
Experience implementing Infrastructure-as-Code and GitOps approaches (preferably including Ansible).
Skilled in developing automation and operational tooling using Python and/or Go.
Previous exposure to creating technical architecture, operational documentation, and deployment designs.
Experience with object storage technologies and distributed data platforms.
Strong understanding of monitoring, alerting, and operational event management.

Swisstech Recruitment
12/06/2026
Contractor
Role Summary:
Platform Developer required to contribute to the technical architecture, design, and implementation of our client's cloud infrastructure and applications. In this role you will devise and realise the technical components, including software, of new production environments and deliver technical change into existing environments. The successful candidate will combine professional experience of automation, schema modelling, and software development with knowledge of cloud infrastructure, working within our collaborative culture.

Key Responsibilities:

Technical Components:
Contribute to the technical architecture and design of significant aspects of our Kubernetes-based AI infrastructure, across networking, compute, GPU, and storage.
Produce high-level and low-level design documentation.
Specify acceptance and integration tests.
Work with Project Managers, other developers, and operational teams to successfully deliver technical components into production environments.

Software Development:
Prototype, define, and realise scalable and maintainable software solutions for automation and services.
Participate actively in software development processes.

Essential Experience:
Significant contribution to the technical architecture and engineering of infrastructure, cloud, and platform.
Designing Kubernetes and cloud-native solutions for operation at scale.
Delivery of technical documentation to a high standard.
Database schema modelling (relational database and/or graph database).
Strong understanding of automation, Infrastructure-as-Code, and GitOps (preferably including Ansible).
Software development for production services (preferably including Python, Go).
Version control using git.

One or more would be an advantage:
Experience of building and consuming significant cloud services.
Experience of working with industry partners.
Contribution to relevant open-source projects.
Understanding or awareness of AI/ML system design.

Swisstech Recruitment
12/06/2026
Contractor
The Identity & Platform Engineer is responsible for designing, implementing and operating the core platform services that provide:
Kubernetes platform services
Sovereign identity management
Federation and authentication services
Privileged access management
Secrets management
Customer identity integration
Platform security and governance

The successful candidate will play a key role in delivering a Zero Trust, sovereign cloud platform built around FreeIPA, Teleport, authentik, Bitwarden and Kubernetes.

Key Responsibilities:

Identity & Access Management Engineering:
Design, implement and operate the sovereign identity platform supporting workforce, administrative and customer identity domains.
Implement and maintain FreeIPA as the authoritative administrative identity platform.
Deploy, configure and operate authentik for customer federation, SAML and OIDC integration.
Implement and maintain Teleport as the privileged access management platform.
Design and maintain RBAC models across Kubernetes, Rafay and supporting platform services.
Integrate phishing-resistant MFA technologies including WebAuthn and FIDO2 security keys.
Implement identity life cycle management processes including onboarding, access reviews and deprovisioning.
Support customer identity federation onboarding and integration activities.
Contribute to the ongoing evolution of the platform's Zero Trust architecture.

Security, Governance & Zero Trust:
Implement Zero Trust security controls across platform services.
Design and maintain Kubernetes RBAC and tenant isolation controls.
Implement privileged access governance using Teleport.
Maintain audit logging, compliance evidence collection and security monitoring capabilities.
Support security reviews, threat modelling and risk assessments.
Implement security hardening standards across Kubernetes, Linux and supporting infrastructure.
Participate in security incident response and root cause analysis activities.
Maintain compliance with security and governance requirements.

Secrets & Certificate Management:
Operate Bitwarden and Bitwarden Secrets Manager platforms.
Manage operational credentials, API keys and automation secrets.
Implement secure secret distribution patterns for platform and application workloads.
Support certificate life cycle management and PKI integration.
Maintain operational processes for break-glass credential governance and recovery.

Required Experience & Skills:
Hands-on experience operating production Kubernetes environments.
Solid Linux systems administration and troubleshooting experience.
Knowledge of designing and operating Identity and Access Management (IAM) solutions.
Experience with LDAP, Kerberos, SAML and OpenID Connect (OIDC).
Previous experience implementing authentication, federation and RBAC solutions.
Skilled in operating infrastructure and platform security services.
Experience with Infrastructure as Code and automation tooling.
Knowledge of implementing monitoring, logging and observability solutions.
Solid understanding of Zero Trust security principles.
Experience with GitOps practices and cloud-native operational models.
Proven incident management and root cause analysis experience.

One or more would be an advantage:
Prior experience with FreeIPA or enterprise directory services.
Experience with authentik, Keycloak or similar federation platforms.
Experience with Teleport, CyberArk or other privileged access management technologies.
Experience with Bitwarden, Vault or other secrets management platforms.
Knowledge of operating GPU-enabled Kubernetes environments.
Previous experience supporting AI, HPC or large-scale compute platforms.
Experience implementing PKI and certificate management solutions.
Kubernetes multi-tenancy and platform security experience.
Exposure to sovereign, regulated or highly secure environments.
Familiarity with SOC2, ISO27001, NCSC or equivalent security frameworks.
Background in Platform Engineering, DevOps or Site Reliability Engineering.