AI Research Engineer (Kernel & Inference Optimization)

  • Jobgether
  • 25/05/2026
  • Full time
  • Information Technology, Telecommunications

Job Description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Research Engineer (Kernel & Inference Optimization) in the United Kingdom.

This is an exciting opportunity for a highly technical AI engineer to contribute to the next generation of scalable and high-performance inference systems powering real-world AI applications. In this role, you will work on optimizing model serving architectures, improving latency and throughput, and enhancing deployment efficiency across cloud, edge, and resource-constrained environments. You will collaborate with globally distributed engineering and research teams focused on advanced AI systems, multi-modal architectures, and infrastructure innovation. The position offers a research-driven environment where experimentation, benchmarking, and performance optimization are central to daily work. Ideal candidates are passionate about low-level optimization, inference scalability, and building robust AI systems that deliver measurable production impact at scale.

Accountabilities
  • Design, develop, and optimize advanced model serving architectures focused on high throughput, low latency, and efficient memory utilization.
  • Build scalable inference pipelines capable of running across cloud, edge, and resource-constrained environments.
  • Conduct controlled inference experiments in simulated and production environments to evaluate system performance and reliability.
  • Monitor and analyze key performance metrics such as latency, throughput, memory consumption, token response time, and error rates.
  • Develop and maintain benchmarking methodologies and performance validation frameworks for AI inference systems.
  • Identify bottlenecks in serving pipelines, including batch processing inefficiencies, network overhead, and excessive memory usage.
  • Optimize inference frameworks and deployment strategies for scalability, resilience, and operational efficiency.
  • Collaborate with cross-functional engineering and research teams to integrate optimized inference solutions into production environments.
  • Create high-quality testing datasets and deployment scenarios that reflect real-world operational challenges.
  • Continuously improve inference infrastructure through experimentation, iteration, and adoption of cutting-edge AI serving techniques.
Requirements
  • Strong experience in AI/ML engineering with a focus on inference optimization, model serving, or AI systems performance.
  • Deep understanding of model deployment architectures and inference frameworks for large-scale AI applications.
  • Expertise in optimizing latency, throughput, scalability, and memory footprint in production AI systems.
  • Hands-on experience with performance monitoring, benchmarking, profiling, and bottleneck analysis.
  • Strong knowledge of advanced AI model architectures, including multi-modal systems and resource-efficient models.
  • Experience building and deploying AI systems across cloud, edge, or low-resource hardware environments.
  • Proficiency in programming languages commonly used in AI infrastructure and optimization workflows.
  • Strong analytical and problem-solving abilities with a research-oriented mindset.
  • Ability to work independently in a highly distributed and fast-moving global environment.
  • Excellent English communication skills and ability to collaborate across technical and non-technical teams.
  • Passion for innovation, experimentation, and scalable AI infrastructure development.
Benefits
  • Fully remote global work environment with flexible location options.
  • Opportunity to work on cutting-edge AI, blockchain, and fintech technologies.
  • Collaborative international team of highly skilled engineers and researchers.
  • Exposure to innovative projects involving AI infrastructure, digital finance, and decentralized technologies.
  • High-impact role with significant technical ownership and influence on product direction.
  • Fast-paced, innovation-driven culture focused on experimentation and growth.
  • Opportunities for continuous learning and professional development.
  • Work environment that values autonomy, creativity, and technical excellence.
  • Participation in projects with global reach and real-world scalability challenges.