Project description
We are looking for a Mid-level Data Engineer to support a short-term, time-sensitive data project. The role centers on building and running data processing workflows on AWS using Python and SQL. This is a focused, one-time effort, so the ideal candidate can onboard quickly and deliver with minimal ramp-up.
Responsibilities
- Develop and run data processing and transformation workflows on the AWS data stack
- Write and optimize Python scripts for data manipulation and AWS automation (boto3)
- Query and process datasets using SQL and AWS Athena
- Run distributed data processing jobs on AWS EMR; manage data in S3
- Perform basic statistical analysis to support project goals
- Work within the existing CI/CD and Atlassian environment (Bamboo, JIRA, Confluence)
Skills
Must have
- Python for data engineering - pandas, scipy
- boto3 (AWS SDK for Python)
- SQL
- Hands-on with AWS data services: Athena, EMR, S3 (basic working knowledge acceptable)
Nice to have
- Basic statistical analysis
- C++ (a plus, not required)