SoftInWay Inc
Bristol, Gloucestershire
02/02/2026
Full time
Senior/Principal ML Systems Architect (TensorFlow + Python)

Overview

We are seeking a highly experienced ML Systems Architect to design and implement a scalable, production-grade architecture for our machine learning solver. This role bridges research prototypes and commercial deployment, ensuring reliability, maintainability, and performance in a mixed technology stack.

Responsibilities

Architect the ML Solver Platform: Define modular architecture for data preprocessing, model execution, and post-processing. Establish clear API contracts between Python/TensorFlow and C# services.

Convert research code into robust, testable, and observable services. Implement CI/CD pipelines, automated testing, and reproducibility standards.

Design REST/gRPC endpoints for cross-language communication. Ensure compatibility with C#/.NET services.

Performance & Scalability: Optimize GPU/CPU utilization, batching strategies, and memory management. Plan for multi-model and multi-tenant scenarios.

MLOps & Lifecycle Management: Implement model versioning, artifact registries, and deployment workflows. Set up monitoring, logging, and alerting for solver performance.

Security & Compliance: Apply best practices for secrets management, dependency scanning, and secure artifact storage.

Required Skills & Experience

ML Frameworks: Expert in TensorFlow (TF2/Keras), experience with ONNX Runtime for inference.
Programming: Advanced Python for ML; strong understanding of packaging, type checking, and performance profiling.
APIs: Proficiency in gRPC/Protobuf and REST for cross-language integration.
Performance Optimization: GPU acceleration (CUDA/cuDNN), mixed precision, XLA, profiling.
Observability: Metrics, tracing, structured logging, dashboards.
Security: SBOM, image signing, role-based access, vulnerability scanning.

Preferred Qualifications

Experience with ONNX Runtime Training, PyTorch, or hybrid ML architectures.
Familiarity with distributed training strategies and multi-GPU setups.
Knowledge of feature stores and data validation frameworks.
Exposure to regulated environments and compliance frameworks.

Tools & Technologies

ML: TensorFlow, ONNX Runtime, tf2onnx.
APIs: FastAPI, gRPC.

Why Join Us?

Work on cutting-edge ML solutions integrated into commercial engineering software.
Define architecture that scales across global deployments.
Collaborate with a team of experts in ML, software engineering, and UI development.