Main duties and responsibilities
- Programming and build - You can design, write and iterate code from prototype to production-ready. You understand security, accessibility and version control. You can use a range of coding tools and languages. You can develop code that self-generates documentation that supports Data Scientists and Data Analysts.
- Technical understanding - You know about the specific technologies that underpin your ability to deliver the responsibilities and tasks of the role. You can apply the required breadth and depth of technical knowledge.
- Testing - You can plan, design, manage, execute and report tests, using appropriate tools and techniques, and work within regulations. You know how to ensure that risks associated with deployment are adequately understood and documented.
- Problem resolution - You know how to log, analyse and manage problems in order to identify and implement the appropriate solution. You can ensure that the problem is fixed.
You will work in the Data Engineering and Enablement Division, which reports into the Data Operations Directorate. It's an exciting time to join as we seek to create leading Health Protection data capabilities, underpinned by resilient platforms and curated data assets, that enable data scientists and data analysts to develop insights that inform public health decisions.
The role involves ingesting a wide range of data assets and building acquisition, orchestration and data pipeline capabilities, curated data marts and data egress routes. We are seeking to mature our advanced capabilities with the standardised practices, machine learning and agility needed to proactively detect and respond to new public health issues.
You will be involved across the delivery lifecycle: engaging with stakeholders on new initiatives, analysing use cases, developing optimal designs that re-use and extend existing capabilities where possible, and implementing and operating those designs. Our teams work collaboratively in an agile, multi-disciplinary mode, and we are looking for engineers who can operate in feature-team, data-operations and DevSecOps models. We are developing our engineering community of practice to share knowledge and enhance our standards and processes, providing a strong foundation for developing individuals, teams and innovation.
- Strong Python skills, including unit testing (pytest) and PEP 8 standards
- Writing robust data pipeline code that can run unattended
- Pandas data validation, manipulation, merging, joining and at times visualisation
- Unix environments, server health and the management of long-running processes
- GitHub, git, pull requests, CI and code review
- Logging and reporting pragmatically
- Ability to troubleshoot and solve numerical and technical problems
- High attention to detail
- Excellent communication and facilitation skills evidenced through verbal and written means to a wide range of stakeholders
- Experience with Agile delivery
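As an illustrative, non-definitive sketch of the pandas validation, merging and pytest-style work described above (all dataset and column names here are hypothetical, not part of the role):

```python
import pandas as pd

def validate_and_merge(cases: pd.DataFrame, labs: pd.DataFrame) -> pd.DataFrame:
    """Validate two hypothetical health-data frames, then left-join lab results onto cases."""
    # Basic validation: required columns present, no duplicate join keys.
    for df, required in ((cases, {"case_id", "region"}), (labs, {"case_id", "result"})):
        missing = required - set(df.columns)
        if missing:
            raise ValueError(f"missing columns: {missing}")
    if cases["case_id"].duplicated().any():
        raise ValueError("duplicate case_id values in cases")
    # Left join: every case is kept; cases without a lab result get NaN.
    return cases.merge(labs, on="case_id", how="left")

# Example usage with toy data
cases = pd.DataFrame({"case_id": [1, 2, 3], "region": ["NE", "SW", "LDN"]})
labs = pd.DataFrame({"case_id": [1, 3], "result": ["positive", "negative"]})
merged = validate_and_merge(cases, labs)
```

In a pytest suite, the same toy frames would typically live in fixtures, with assertions on row counts and null handling.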
- Data engineering experience using Python, SQL, Spark and AWS
- Hands-on ETL development experience using the Microsoft enterprise stack / Azure and AWS Glue
- Knowledge of data management platforms and development with SQL Server
- Experience with publishing data sets for visualisation and analysis
- Experience with supporting design of data models / data flows
- Ability to work as part of a team to develop and deliver end-to-end data warehouse solutions
- Analytical skill set with an ability to understand data requirements and support the development of data solutions
- Machine learning for engineering practices, such as metadata-driven intelligent ETL and pipeline processes
- Experience of working with JIRA (or Azure DevOps or similar tools) within an Agile/Scrum environment
- Experience/understanding of software and data lifecycle management
- Educated to degree level (not essential; experience is key). A relevant numerate, technical or computer science discipline would be an advantage
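To illustrate what metadata-driven pipeline processes with pragmatic logging can look like, here is a minimal, hypothetical sketch in plain Python (the table names, transform registry and metadata schema are all assumptions for illustration, not part of the role):

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("etl")

# Hypothetical metadata: each entry declares a source table, a transform, and a target.
PIPELINE_METADATA = [
    {"source": "cases_raw", "transform": "uppercase_region", "target": "cases_curated"},
    {"source": "labs_raw", "transform": "none", "target": "labs_curated"},
]

# Registry of named transforms; new steps are added here, not in the driver code.
TRANSFORMS = {
    "uppercase_region": lambda rows: [{**r, "region": r["region"].upper()} for r in rows],
    "none": lambda rows: rows,
}

def run_pipeline(metadata, tables):
    """Drive ETL steps from metadata rather than hard-coded logic."""
    for step in metadata:
        log.info("running %s -> %s", step["source"], step["target"])
        tables[step["target"]] = TRANSFORMS[step["transform"]](tables[step["source"]])
    return tables

tables = {
    "cases_raw": [{"case_id": 1, "region": "ne"}],
    "labs_raw": [{"case_id": 1, "result": "positive"}],
}
tables = run_pipeline(PIPELINE_METADATA, tables)
```

The design choice here is that the pipeline's shape lives in data, so adding a new feed means adding metadata rather than editing the driver.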
For more information, click the 'Apply here' button.