
Consultant Data Engineering - Connected Medicine
- Bangalore, Karnataka
- Permanent
- Full-time
- Create and maintain an optimal data pipeline architecture for ETL/ELT into structured data.
- Assemble large, complex data sets that meet business requirements; create and maintain multi-dimensional models such as star and snowflake schemas, covering normalization, de-normalization, and joining of datasets.
- Bring expert-level experience in creating a scalable data warehouse, including fact tables and dimension tables, and in ingesting datasets into cloud-based tools.
- Identify, design, and implement internal process improvements including automating manual processes, optimizing data delivery and re-designing infrastructure for greater scalability.
- Collaborate with stakeholders to ensure seamless integration of data with internal data marts, enhancing advanced reporting.
- Set up and maintain data ingestion, streaming, scheduling, and job-monitoring automation using AWS services; maintain Lambda, CodePipeline (CI/CD), Glue, S3, Redshift, and Power BI for uninterrupted automation (a minimal sketch of such a flow follows this list).
- Build analytics tools that utilize the data pipeline to provide actionable insight into customer acquisition, operational efficiency, and other key business performance metrics.
- Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs.
- Utilize GitHub for version control, code collaboration, and repository management. Implement best practices for code reviews, branching strategies, and continuous integration.
- Create data tools for analytics and data science team members that assist them in building and optimizing our product into an innovative industry leader.
- Ensure data privacy and compliance with relevant regulations (e.g., GDPR) when handling customer data.
- Maintain data quality and consistency within the application, addressing data-related issues as they arise.
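
As a hedged illustration of the ingestion automation described above, here is a minimal sketch of an S3-triggered Lambda that starts a Glue job and issues a Redshift COPY through the Redshift Data API. This is a sketch under assumed names only: the job, cluster, schema, and IAM role identifiers are placeholders, not part of this role's actual stack.

```python
# Hypothetical S3-triggered Lambda: start a Glue ETL job for a newly
# landed object, then COPY the cleaned output into Redshift.
# All resource names below are placeholders.
import boto3

glue = boto3.client("glue")
redshift = boto3.client("redshift-data")

def handler(event, context):
    # Bucket and key of the object that triggered this invocation.
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    key = s3_info["object"]["key"]

    # Kick off a Glue job, passing the raw object location as an argument.
    glue.start_job_run(
        JobName="clean-and-partition",
        Arguments={"--source_path": f"s3://{bucket}/{key}"},
    )

    # Load cleaned output into a staging table. In practice the COPY would
    # run only after the Glue job completes (e.g. via a Glue trigger or
    # Step Functions), not in the same invocation.
    redshift.execute_statement(
        ClusterIdentifier="analytics-cluster",
        Database="dw",
        DbUser="etl_user",
        Sql=f"COPY staging.events FROM 's3://{bucket}/clean/' "
            "IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy' "
            "FORMAT AS PARQUET;",
    )
```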
- 7-10 years of relevant experience
- Advanced working knowledge of SQL, experience with relational databases and query authoring, and working familiarity with a variety of databases and cloud data warehouses such as AWS Redshift.
- Experience in creating scalable, efficient schema designs to support diverse business needs.
- Experience with database normalization, schema evolution, and maintaining data integrity
- Proactively share best practices, contributing to team knowledge and improving schema design transitions.
- Develop data models, create dimensions and facts, and establish views and procedures to enable automation and programmability (illustrated in the star-schema sketch after this list).
- Collaborate effectively with cross-functional teams to gather requirements, incorporate feedback, and align analytical work with business objectives
- Prior experience with data modelling and OLAP cube modelling.
- Experience with data compression into Parquet to improve processing (see the Parquet sketch after this list), and fine-tuned SQL programming skills.
- Experience building and optimizing “big data” data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Experience with manipulating, processing, and extracting value from large, disconnected, and unrelated datasets.
- Strong analytic skills related to working with structured and unstructured datasets.
- Working knowledge of message queuing, stream processing, and highly scalable “big data” stores.
- Experience supporting and working with cross-functional teams and Global IT.
- Familiarity with working in agile-based models.
- Experience with relational SQL and NoSQL databases, especially AWS Redshift.
- Experience with AWS cloud services highly preferred, especially S3, EC2, Lambda, Glue, EMR, and CodePipeline; experience with similar services on another platform would also be considered.
- Bachelor’s or master’s degree with a Technology or Computer Science background.
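
For the dimensional-modelling and fact/dimension work mentioned above, the following is a minimal star-schema sketch executed through the Redshift Data API, assuming an illustrative orders subject area; every table, column, and cluster name is hypothetical.

```python
# Hypothetical star schema for an "orders" subject area, executed via the
# Redshift Data API. Redshift treats PRIMARY KEY / REFERENCES as
# informational constraints: they document the model but are not enforced.
import boto3

client = boto3.client("redshift-data")

DDL = """
CREATE TABLE dim_customer (
    customer_key BIGINT IDENTITY(1,1) PRIMARY KEY,
    customer_id  VARCHAR(32) NOT NULL,  -- natural key from the source system
    region       VARCHAR(64)
);
CREATE TABLE dim_date (
    date_key  INT PRIMARY KEY,          -- surrogate key, e.g. 20240131
    full_date DATE NOT NULL
);
CREATE TABLE fact_orders (
    order_key    BIGINT IDENTITY(1,1),
    customer_key BIGINT REFERENCES dim_customer (customer_key),
    date_key     INT    REFERENCES dim_date (date_key),
    order_amount DECIMAL(12, 2)
) DISTKEY (customer_key) SORTKEY (date_key);
"""

# The Data API takes one statement per call, so split the script.
for statement in DDL.split(";"):
    if statement.strip():
        client.execute_statement(
            ClusterIdentifier="analytics-cluster",
            Database="dw",
            DbUser="etl_user",
            Sql=statement,
        )
```

Distributing the fact table on the join key and sorting on the date key is one common Redshift layout choice; the right keys depend on actual query patterns.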
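Likewise, as a minimal sketch of the Parquet compression point, assuming a pandas/pyarrow toolchain and placeholder S3 paths:

```python
# Convert a raw CSV extract to snappy-compressed Parquet. Paths are
# placeholders; reading/writing s3:// URLs with pandas requires s3fs.
import pandas as pd

df = pd.read_csv("s3://raw-bucket/events/2024-01-31.csv")

# Columnar Parquet with snappy compression shrinks storage and speeds up
# downstream scans in Redshift Spectrum, Glue, and Athena.
df.to_parquet(
    "s3://clean-bucket/events/2024-01-31.parquet",
    engine="pyarrow",
    compression="snappy",
    index=False,
)
```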