The Role
We're looking for an AI Engineer to help build and run customer-facing AI agent capabilities in production. You'll work in a team of highly skilled engineers delivering reliable, secure, and observable agent-driven workflows. Our orchestration backbone is Akka (Java), deployed on AWS, with a leading observability/evaluation platform and Postgres + pgvector for retrieval.
This is an engineering-first role: you'll focus on distributed systems, orchestration, robustness, and production operations, bringing AI agent components into a dependable runtime.
What You'll Do
- Build and operate customer-facing services that incorporate AI agents and tool-using workflows
- Integrate retrieval components using Postgres + pgvector (query patterns, latency considerations, embedding/retrieval flows)
- Instrument, monitor, and improve agent quality using observability tools (quality signals, evaluation/regression tracking, incident triage)
- Engineer for reliability and safety: timeouts, retries, idempotency, graceful degradation, auditability, and secure service integration
- Collaborate with product/engineering stakeholders to translate requirements into robust designs, delivery plans, and runbooks
Must-Have Requirements
- Eligible and willing to pass relevant background checks
- PhD or Master's degree in Computer Science / AI / Data Science / Machine Learning / Engineering (or equivalent practical experience)
- At least 3 years of post-graduation experience delivering production software systems, including building and operating scalable backend/distributed systems (high-concurrency services, reliability patterns, production operations)
- Strong Java engineering skills and experience with concurrent/distributed systems
- Experience building customer-facing services with high reliability expectations (timeouts, retries, observability, performance)
- Familiarity with AWS fundamentals for deploying and operating services (including security/IAM awareness)
- Strong engineering practices: testing, code review, CI/CD, documentation, and operational readiness
Nice to Have
- Hands-on experience with Akka in production (actors/streams; reliability patterns such as supervision)
- Python experience (useful for integrating AI/agent tooling and adjacent components)
- Experience with LLM/agentic systems (tool calling, structured outputs, evaluation practices)
- Experience with Arize (or similar ML/LLM observability tooling)
- Experience with Postgres + pgvector and retrieval optimisation
- Security-by-design experience for AI systems (PII handling, prompt injection mitigation, least privilege, audit trails)
What We Offer
- Work on a state-of-the-art, high-profile production AI system with modern runtime and observability practices
- A collaborative Northern Ireland-based engineering team with hybrid flexibility
- Competitive package and professional growth opportunities