AI Research Engineer (Kernel & Inference Optimization)

Jobgether
25/05/2026

Full time | Information Technology | Telecommunications

Job Description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Research Engineer (Kernel & Inference Optimization) in the United Kingdom.

This is an exciting opportunity for a highly technical AI engineer to contribute to the next generation of scalable and high-performance inference systems powering real-world AI applications. In this role, you will work on optimizing model serving architectures, improving latency and throughput, and enhancing deployment efficiency across cloud, edge, and resource-constrained environments. You will collaborate with globally distributed engineering and research teams focused on advanced AI systems, multi-modal architectures, and infrastructure innovation. The position offers a research-driven environment where experimentation, benchmarking, and performance optimization are central to daily work. Ideal candidates are passionate about low-level optimization, inference scalability, and building robust AI systems that deliver measurable production impact at scale.

Accountabilities

Design, develop, and optimize advanced model serving architectures focused on high throughput, low latency, and efficient memory utilization.
Build scalable inference pipelines capable of running across cloud, edge, and resource-constrained environments.
Conduct controlled inference experiments in simulated and production environments to evaluate system performance and reliability.
Monitor and analyze key performance metrics such as latency, throughput, memory consumption, token response time, and error rates.
Develop and maintain benchmarking methodologies and performance validation frameworks for AI inference systems.
Identify bottlenecks in serving pipelines, including batch processing inefficiencies, network overhead, and excessive memory usage.
Optimize inference frameworks and deployment strategies for scalability, resilience, and operational efficiency.
Collaborate with cross-functional engineering and research teams to integrate optimized inference solutions into production environments.
Create high-quality testing datasets and deployment scenarios that reflect real-world operational challenges.
Continuously improve inference infrastructure through experimentation, iteration, and adoption of cutting-edge AI serving techniques.

Requirements

Strong experience in AI/ML engineering with a focus on inference optimization, model serving, or AI systems performance.
Deep understanding of model deployment architectures and inference frameworks for large-scale AI applications.
Expertise in optimizing latency, throughput, scalability, and memory footprint in production AI systems.
Hands-on experience with performance monitoring, benchmarking, profiling, and bottleneck analysis.
Strong knowledge of advanced AI model architectures, including multi-modal systems and resource-efficient models.
Experience building and deploying AI systems across cloud, edge, or low-resource hardware environments.
Proficiency in programming languages commonly used in AI infrastructure and optimization workflows.
Strong analytical and problem-solving abilities with a research-oriented mindset.
Ability to work independently in a highly distributed and fast-moving global environment.
Excellent English communication skills and the ability to collaborate across technical and non-technical teams.
Passion for innovation, experimentation, and scalable AI infrastructure development.

Benefits

Fully remote global work environment with flexible location options.
Opportunity to work on cutting-edge AI, blockchain, and fintech technologies.
Collaborative international team of highly skilled engineers and researchers.
Exposure to innovative projects involving AI infrastructure, digital finance, and decentralized technologies.
High-impact role with significant technical ownership and influence on product direction.
Fast-paced and innovation-driven culture focused on experimentation and growth.
Opportunities for continuous learning and professional development.
Work environment that values autonomy, creativity, and technical excellence.
Participation in projects with global reach and real-world scalability challenges.
