Data Scientist (Cloud Data)

About the Data Scientist role

We are seeking a skilled Data Scientist to design and implement data-driven solutions on AWS cloud data platforms. The role involves performing statistical analysis, developing predictive models, and building interactive dashboards in Amazon QuickSight that deliver actionable business insights. You will also support data pipelines, CI/CD workflows, and infrastructure automation while upholding data quality and governance. The ideal candidate combines strong technical expertise in Python, SQL, and AWS data tools with a deep understanding of analytics, visualization, and operational excellence in cloud environments.

Responsibilities:

1. Data Analysis & Insights
  • Perform data analysis and statistical modelling on data held in Amazon Redshift
  • Develop predictive models and machine learning algorithms
  • Generate actionable insights from large datasets
  • Conduct data quality assessments and validation
2. Dashboard & Visualization Development
  • Create and maintain interactive dashboards in Amazon QuickSight
  • Design data visualizations to support business decision-making
  • Optimize dashboard performance and user experience
  • Ensure data accuracy in reporting and visualizations
3. Data Pipeline & Engineering Support
  • Monitor and troubleshoot AWS Glue jobs and data ingestion processes
  • Support CI/CD pipelines with data-focused monitoring and validation
  • Assist with GitLab pipeline configurations for data workflows
  • Support AWS Lambda functions related to data processing
  • Collaborate on Infrastructure as Code (IaC) for data infrastructure
4. Data Science Operations
  • Monitor data pipelines and flag data quality issues
  • Collaborate with technical teams on data requirements
  • Support data governance and best practices implementation
  • Assist in data model validation and testing
5. Documentation & Reporting
  • Document analytical methodologies and findings
  • Prepare regular reports on data insights and model performance
  • Lead a one-hour monthly progress meeting to present findings
  • Maintain project documentation on SHIP-HATS Confluence
  • Track analytical tasks through SHIP-HATS Jira

Requirements:

  • Strong background in data science, statistics, and machine learning
  • Proficiency in data analysis tools (Python, R, SQL)
  • Experience with AWS data services (Redshift, QuickSight, S3, Glue, Lambda)
  • Experience developing and troubleshooting data pipelines
  • Basic knowledge of CI/CD pipelines (GitLab preferred)
  • Familiarity with Infrastructure as Code (IaC) for data environments
  • Data visualization and dashboard development skills
  • Strong analytical thinking and problem-solving abilities
  • Excellent documentation and presentation skills