Senior Operations Reliability Engineer - Enterprise Platforms and Tools

Genesys
25/05/2026

Full time, Information Technology, Telecommunications

Job Description

Location: United Kingdom, Northern Ireland
Time type: Full time
Posted on: Posted Today
Job requisition ID: JR111086

Genesys empowers organizations of all sizes to improve loyalty and business outcomes by creating the best experiences for their customers and employees. Through Genesys Cloud, the AI-powered Experience Orchestration platform, organizations can accelerate growth by delivering empathetic, personalized experiences at scale to drive customer loyalty, workforce engagement, efficiency and operational improvements.

We employ more than 6,000 people across the globe who embrace empathy and cultivate collaboration to succeed. And, while we offer great benefits and perks like larger tech companies, our employees have the independence to make a larger impact on the company and take ownership of their work. Join the team and create the future of customer experience together.

Senior Operations Reliability Engineer - Enterprise Platforms & Tools
Level: P3 - Career (Professional Track)

Overview

As a Senior Operations Reliability Engineer specializing in Enterprise Platforms and Tools, you will own the operational reliability, health, and lifecycle management of enterprise productivity and collaboration platforms.
This role combines hands-on platform administration with day-to-day operational ownership and governance of enterprise SaaS tools such as Jira, Confluence, Figma, Lucid, and other SaaS platforms. In addition to serving as a senior escalation point, you will improve monitoring accuracy, reduce alert noise, validate automation workflows, and contribute to AIOps tuning and observability standards. You will help transition enterprise tool operations from reactive issue handling toward proactive, automation-driven reliability practices that improve uptime, user communication, and service maturity.

Responsibilities

General Reliability Operations

- Monitor observability and AIOps platforms to detect anomalies, performance degradation, and emerging issues across enterprise systems.
- Perform advanced incident triage and event correlation to identify root cause and reduce duplicate or misrouted incidents.
- Lead or contribute to post-incident reviews, identifying systemic fixes and automation opportunities.
- Validate automated remediation workflows prior to production adoption.
- Identify recurring manual tasks and translate them into automation requirements or scripted improvements.
- Improve alert signal quality by refining thresholds, suppression logic, and event correlation rules.
- Ensure platform telemetry, SaaS health signals, and configuration data align with monitoring and CMDB standards.
- Collaborate with Cloud, IAM, Network, Security, and ServiceNow teams to improve enterprise service reliability.

Enterprise Tools Ownership & Operational Management

- Own day-to-day operational health and administration of enterprise SaaS platforms (e.g., Jira, Confluence, Figma, Lucid, monitoring tools, and similar productivity platforms).
- Monitor vendor service health dashboards and integrate SaaS outage signals into internal observability and AIOps workflows.
- Lead user-impact communications during enterprise tool outages or service degradations in partnership with IT Communications and ServiceNow teams.
- Review vendor release notes and roadmap updates; assess feature changes, security updates, and deprecations.
- Plan and coordinate controlled feature rollouts, configuration updates, and tenant-level optimizations.
- Provide guidance and education to end users on new features, configuration changes, and best practices.
- Manage licensing, usage monitoring, and cost optimization for enterprise tools.
- Partner with Security and IAM teams to ensure access governance and compliance standards are maintained.
- Improve monitoring coverage for enterprise tools by integrating telemetry and health signals into AIOps platforms.
- Document operational standards, support models, and escalation paths for each owned platform.

Enterprise Platform Responsibilities

- Diagnose and remediate integration issues between enterprise platforms and supporting systems.
- Validate patching and upgrade activities to ensure minimal service disruption.
- Participate in resilience validation exercises, including failover and recovery testing.
- Provide mentorship and knowledge-sharing to junior reliability engineers.
- Support operational reliability of Microsoft Power Platform components (Power Apps, Power Automate, Power BI), including:
  - Monitoring flow failures
  - Troubleshooting environment-level issues
  - Supporting connector configuration
  - Assisting with environment governance and data loss prevention policies

Automation & AIOps Contributions

- Develop and maintain automation scripts (PowerShell, Python) to reduce repetitive operational effort.
- Contribute to ServiceNow and Power Automate workflow improvements tied to enterprise tool incidents.
- Partner with teams to refine automated remediation logic.
- Improve enterprise tool signal quality by integrating vendor health data and usage telemetry into AIOps systems.
- Support tuning of alert correlation and anomaly detection models for enterprise services.
- Track improvements in MTTR, alert noise reduction, automation coverage, and platform uptime.

Requirements

- Bachelor's degree in Computer Science, Information Technology, or related field; equivalent experience considered.
- 5+ years of experience in enterprise platform operations, SaaS administration, or infrastructure support roles.
- Hands-on experience administering enterprise tools such as Jira, Confluence, Figma, Lucid, or similar SaaS platforms. This includes setting up monitoring and event management capabilities to alert on outages or service degradation.
- Experience with SQL Server and IIS/Apache administration is an asset.
- Experience managing SaaS service health, vendor communications, and feature rollouts.
- Proficiency in PowerShell or equivalent scripting for automation tasks.
- Solid understanding of monitoring, observability, and event management practices.
- Familiarity with ITIL principles and ServiceNow workflows.
- Strong troubleshooting and analytical skills.
- Effective communication skills, including experience communicating user-facing outages or changes.
- Motivation to deepen expertise in automation, AIOps, and reliability engineering.

Preferred Qualifications

- Experience integrating SaaS platforms with identity providers (Okta, Entra ID).
- Familiarity with CI/CD pipelines or automation-driven configuration management.
- Exposure to cloud platforms (AWS or Azure).

Additional Information

On-Call Support: Participation in a shared, rotational on-call schedule is required.
