
Data Engineer
- Pune, Maharashtra
- Permanent
- Full-time
- Develop and support scalable, extensible and highly available data solutions
- Deliver on critical business priorities while ensuring alignment with the wider architectural vision
- Identify and help address potential risks in the data supply chain
- Follow and contribute to technical standards
- Design and develop analytical data models
- First Class Degree in Engineering/Technology (4-year graduate course)
- 5 to 8 years’ experience implementing data-intensive solutions using agile methodologies
- Experience of relational databases and using SQL for data querying, transformation and manipulation (see the first sketch at the end of this posting)
- Experience of modelling data for analytical consumers
- Ability to automate and streamline the build, test and deployment of data pipelines
- Experience with cloud-native technologies and patterns
- A passion for learning new technologies and a desire for personal growth through self-study, formal classes or on-the-job training
- Excellent communication and problem-solving skills
- ETL: Hands-on experience building data pipelines; proficiency in two or more data integration platforms such as Ab Initio, Apache Spark, Talend and Informatica (see the pipeline sketch at the end of this posting)
- Big Data: Experience of ‘big data’ platforms such as Hadoop, Hive or Snowflake for data storage and processing
- Data Warehousing & Database Management: Understanding of Data Warehousing concepts, Relational (Oracle, MSSQL, MySQL) and NoSQL (MongoDB, DynamoDB) database design
- Data Modeling & Design: Good exposure to data modeling techniques; design, optimization and maintenance of data models and data structures
- Languages: Proficient in one or more programming languages commonly used in data engineering such as Python, Java or Scala
- DevOps: Exposure to concepts and enablers such as CI/CD platforms, version control and automated quality control management
- Ab Initio: Experience developing Co>It, Data Profiler and Conduct>It, Control>Center, Continuous>Flows
- Cloud: Good exposure to public cloud data platforms such as S3, Snowflake, Redshift, Databricks, BigQuery, etc.; demonstrable understanding of the underlying architectures and trade-offs
- Data Quality & Controls: Exposure to data validation, cleansing, enrichment and data controls
- Containerization: Fair understanding of containerization platforms such as Docker and Kubernetes
- File Formats: Exposure to event, file and table formats such as Avro, Parquet, Protobuf, Iceberg and Delta
- Others: Basics of job schedulers such as Autosys; basics of entitlement management
- Certification in any of the above topics would be an advantage.
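
The first sketch below is a minimal, self-contained illustration of the kind of SQL querying and transformation work described above: aggregating raw rows into a simple analytical shape. The `orders` table, its columns and the in-memory SQLite database are all hypothetical, chosen only to keep the example runnable anywhere.

```python
import sqlite3

# Hypothetical raw table, held in an in-memory SQLite database purely
# so the example runs without external dependencies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, '2024-01-01', 120.0),
        (1, '2024-01-01', 30.0),
        (2, '2024-01-02', 75.5);
""")

# Transform raw order rows into a simple analytical model:
# daily revenue and order count per customer.
rows = conn.execute("""
    SELECT customer_id,
           order_date,
           SUM(amount) AS daily_revenue,
           COUNT(*)    AS order_count
    FROM orders
    GROUP BY customer_id, order_date
    ORDER BY customer_id, order_date
""").fetchall()

for row in rows:
    print(row)  # e.g. (1, '2024-01-01', 150.0, 2)
```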
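Similarly, the pipeline sketch below illustrates the extract-transform-load pattern referenced in the ETL, Cloud, Data Quality and File Formats items, using PySpark. The bucket paths, column names and validation rule are hypothetical assumptions, not part of the role description.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

# Extract: read raw CSV files landed by a hypothetical upstream feed.
raw = spark.read.csv("s3://example-bucket/landing/orders/",
                     header=True, inferSchema=True)

# Transform: basic validation/cleansing (a simple data control),
# then an analytical aggregation.
clean = raw.filter(F.col("amount").isNotNull() & (F.col("amount") > 0))
daily = (clean.groupBy("customer_id", "order_date")
              .agg(F.sum("amount").alias("daily_revenue"),
                   F.count("*").alias("order_count")))

# Load: write a partitioned Parquet table for analytical consumers.
(daily.write.mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-bucket/curated/daily_revenue/"))
```

Partitioning the output by `order_date` is one common design choice here; it lets downstream consumers prune partitions instead of scanning the full table.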