
Data Engineering Consultant
- Bangalore, Karnataka
- Permanent
- Full-time
- Support the full data engineering lifecycle including research, proof of concepts, design, development, testing, deployment, and maintenance of data management solutions
- Utilize knowledge of various data management technologies to drive data engineering projects
- Lead data acquisition efforts to gather data from various structured or semi-structured source systems of record to hydrate the client data warehouse and power analytics across numerous healthcare domains
- Leverage a combination of ETL/ELT methodologies to pull complex relational and dimensional data to support loading data marts and reporting aggregates
- Eliminate unwarranted complexity and unneeded interdependence
- Detect data quality issues, identify root causes, implement fixes, and manage data audits to mitigate data challenges
- Implement, modify, and maintain data integration efforts that improve data efficiency, reliability, and value
- Leverage and facilitate the evolution of best practices for data acquisition, transformation, storage, and aggregation that solve current challenges and reduce the risk of future challenges
- Effectively create data transformations that address business requirements and other constraints
- Partner with the broader analytics organization to make recommendations for changes to data systems and the architecture of data platforms
- Support the implementation of a modern data framework that facilitates business intelligence reporting and advanced analytics
- Prepare high-level design documents and detailed technical design documents with best practices to enable efficient data ingestion, transformation, and data movement
- Leverage DevOps tools to enable code versioning and code deployment
- Leverage data pipeline monitoring tools to detect data integrity issues before they result in user-visible outages or data quality issues
- Leverage processes and diagnostic tools to troubleshoot, maintain, and optimize solutions and respond to customer and production issues
- Continuously support technical debt reduction, process transformation, and overall optimization
- Leverage and contribute to the evolution of standards for high-quality documentation of data definitions, transformations, and processes to ensure data transparency, governance, and security
- Ensure that all solutions meet the business needs and requirements for security, scalability, and reliability
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies with regard to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary, or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
To apply to an internal job, employees must meet the following criteria:
- The candidate should have completed 12 months in the current role
- The candidate should not be on any active CAP/PIP
- The performance review of the candidate must be ME & Above in the last common review
- Graduate degree or equivalent experience
- Bachelor's degree (preferably in information technology, engineering, math, computer science, analytics, or other related field)
- 5+ years of combined experience in data engineering, ingestion, normalization, transformation, aggregation, structuring, and storage
- 5+ years of combined experience working with industry-standard relational, dimensional, or non-relational data storage systems
- 5+ years of experience in designing ETL/ELT solutions using tools like Informatica, DataStage, SSIS, PL/SQL, T-SQL, etc.
- 5+ years of experience in managing data assets using SQL, Python, Scala, or another similar query or programming language
- 3+ years of experience working with healthcare data or data to support healthcare organizations
- 3+ years of experience in Microsoft Azure Cloud, Azure Data Factory, Databricks, Spark, Scala/Python, ADO
- Certification in Azure Cloud, preferably DP-203
- Experience in Machine Learning Pipelines and AI in healthcare
- Experience in Data Visualization and BI Tools (Power BI, Tableau) for healthcare reporting
- Exposure to containerization and orchestration (Docker, Kubernetes, Airflow)
- Familiarity with data tokenization, anonymization, and synthetic data generation for research