EY - GDS Consulting - AI and Data - AWS Data Engineer - Manager
- Pune, Maharashtra
- Permanent
- Full-time
- Architect, design, and oversee scalable ETL/ELT pipelines using PySpark, SQL, Python, and AWS data services.
- Lead the implementation of data lakehouse solutions using AWS S3, Glue, Iceberg, and other cloud-native components.
- Drive migration of on-premises data workloads to AWS, ensuring performance, reliability, scalability, and cost optimization.
- Define and standardize metadata-driven ingestion frameworks and medallion (Bronze/Silver/Gold) architecture patterns.
- Provide direction on Spark job optimization, distributed processing, and performance tuning.
- Lead teams in building and operationalizing data pipelines using orchestration tools such as Astronomer (Airflow), AWS Step Functions, and managed workflows.
- Ensure adherence to data quality frameworks, best practices, and coding standards.
- Review architecture, design, and code artifacts; troubleshoot complex technical issues.
- Collaborate with cross-functional teams, including BI, data science, and product, to ensure seamless data delivery.
- Mentor and guide data engineers and senior engineers in technical delivery.
- Work directly with business and technical stakeholders, translating requirements into scalable solutions.
- Facilitate Agile/Scrum delivery across multi-functional teams.
- Optional (Good to Have)
- Experience with Databricks (Delta Lake, PySpark notebooks, Unity Catalog).
- Familiarity with modern governance frameworks and MLOps/DevOps integrations.
- 9+ years of overall IT experience, with 5+ years in AWS-based data engineering and 2+ years in a leadership/managerial capacity.
- Advanced hands-on expertise in:
- PySpark, SQL, Python
- AWS S3, Glue, Lambda, Step Functions, CloudWatch
- ETL/ELT design and data lake/lakehouse architectures
- Apache Iceberg or similar table formats
- Airflow/Astronomer or equivalent orchestration tools
- Strong understanding of structured/semi-structured data formats (Parquet, JSON, CSV, XML).
- In-depth knowledge of data warehousing concepts, dimensional modeling, and performance optimization.
- Practical experience with CI/CD tools (GitHub Actions, Azure DevOps, Jenkins).
- Proven analytical, problem-solving, and troubleshooting capabilities.
- Excellent communication, leadership, and stakeholder management skills.
- Bachelor's or Master's degree in Computer Science, IT, or related field.
- 9+ years of industry experience with significant hands-on exposure to cloud data engineering.
- Experience designing and managing production-grade AWS data platforms.
- Demonstrated success leading Agile/Scrum delivery teams.
- Ability to own deliverables end-to-end with a proactive, self-driven approach.
- Prior client-facing experience and ability to influence senior stakeholders.
- Experience delivering in multi-environment, large-scale enterprise data landscapes.
- Exposure to Databricks, Delta Lake, or governance frameworks such as Unity Catalog.
- We seek technically strong, innovative, and adaptable leaders who enjoy mentoring teams, solving complex data challenges, and driving continuous improvement in a fast-paced environment.
- Opportunities to work on diverse, industry-leading, and high-impact data programs.
- Access to continuous learning, coaching, and tailored career development.
- A collaborative, inclusive, and global work environment.
- Flexibility to manage work in a way that suits you best.
- A culture that supports innovation, knowledge-sharing, and growth.