
Data Engineer
- Bangalore, Karnataka
- Permanent
- Full-time
- Design, build, test, and deploy new data pipelines within on-premises or cloud data ecosystems.
- Improve existing data pipelines by simplifying them and increasing their performance.
- Follow best practices and apply architected techniques and solutions for data collection, management, and usage in support of the company-wide data governance and management framework.
- Work closely with data analysts, data scientists, and database and systems administrators to create data solutions.
- Evaluate new data sources for quality and attribution to support product requirements.
- Document new and existing pipelines, datasets, and lineage.
- Process complex, disparate data sets with the appropriate technologies, and identify the correlations and patterns that exist between them.
- Experience in designing and building production data pipelines from ingestion to consumption within a hybrid data architecture, using languages such as Java, Python, or C#.
- Experience in designing and implementing scalable, secure data processing pipelines using Databricks (on AWS), AWS Glue, AWS Lambda and related AWS services, Azure Data Factory, and Azure Databricks.
- Experience in developing ETL/ELT workflows leveraging Apache Spark on Databricks, orchestrated with AWS Glue Workflows and Step Functions.
- Experience in managing and optimizing data storage using Amazon S3, Amazon Redshift, and ADLS Gen2.
- Knowledge of, and hands-on experience with, data lake, lakehouse, and Delta Lake architectures.
- Proficient in building scalable data pipelines using PySpark, Databricks notebooks, Workflows, and Delta Live Tables.
- Hands-on experience with real-time data ingestion and real-time data analytics.
- Strong grasp of data governance and access control, and of tools such as Unity Catalog and Alation.
- Strong experience with common data warehouse modelling principles, including Kimball and Inmon.
- Ensuring data quality and consistency through data cleaning, transformation, and integration processes.
- Knowledge of DevOps processes (including CI/CD) and Infrastructure as Code is essential.
- Experience in managing, monitoring, and troubleshooting data-related issues within Azure/AWS/Databricks environments to maintain high availability and performance.
- Collaborating with data scientists, business analysts, and other stakeholders to understand data requirements and implement appropriate data solutions.
- Implementing data security measures, including encryption, access controls, and auditing, to protect sensitive information.
- Automating data pipelines and workflows to streamline data ingestion, processing, and distribution tasks.
- Knowledge of the Microsoft BI stack (SSRS, SSIS, and SSAS, both Tabular with DAX and OLAP with MDX) is desirable.
- Knowledge of Microsoft D365, Dataverse, Salesforce, SAP Data Services, or KNIME is desirable.
- Keeping abreast of the latest Databricks features and technologies to enhance data engineering processes and capabilities.
- Documenting data procedures, systems, and architectures to maintain clarity and ensure compliance with regulatory standards.
- Providing guidance and support for data governance, including metadata management, data lineage, and data cataloging.
- BA/BS/BTech/BE in Computer Science or a related field, or equivalent experience.
- 5+ years of experience in the area of data management and/or data curation.
- 5+ years of development experience with Oracle and/or SQL Server.
- 5+ years of experience with Python/PySpark frameworks and libraries.
- Expert-level understanding of Databricks Workspaces, Delta Lake, Databricks SQL, Unity Catalog, and integration with cloud-native services (e.g., AWS S3, Azure Data Lake Storage).
- Expert-level, hands-on experience with at least one major public cloud, AWS or Azure.
- Deep knowledge of Delta Lake architecture and Databricks SQL for managing structured and semi-structured data.
- Ability to develop, implement, and optimize code using procedural languages such as PL/SQL and T-SQL.
- Experienced with SSIS and C# development.
- Experienced with data normalization and denormalization techniques.
- Experienced in implementing large-scale, event-based streaming architectures.
- Experienced in data transformation and data processing techniques.
- Knowledge of API and microservice development.
- Experienced in Agile methodology and/or pair programming.
- Preferred knowledge of AI/ML concepts and technologies.
- Preferred experience with stream-processing systems.
- Strong communication skills.
- Strong writing and documentation skills.
- Experienced in working with cross-functional teams, building alignment and collaboration.
- Preferred certification: Databricks Certified Data Engineer Associate/Professional.