
Manager
- Noida, Uttar Pradesh
- Permanent
- Full-time
Location: Noida, UP
Role Type: Individual Contributor (IC)

About the Role
We are seeking a highly skilled Data Engineer with strong Python development expertise and proven experience in building and scaling cloud-based data management platforms. This role requires hands-on expertise in data pipelines, data lakehouse architectures, and Apache Spark, with a strong foundation in metadata and master data management. The ideal candidate will also bring experience in AI-driven analytics and demonstrate the ability to design, optimize, and manage data solutions with a focus on cost efficiency.

Key Responsibilities
- Design, build, and maintain scalable data pipelines and ETL/ELT processes across Azure-based ecosystems.
- Develop and optimize data lakehouse solutions leveraging Azure Synapse Analytics, Microsoft Fabric, and Databricks.
- Estimate and manage cloud resource utilization and costs for data pipelines, ensuring efficiency and cost-effectiveness.
- Monitor and fine-tune pipelines to balance performance and cost optimization across compute, storage, and data movement.
- Collaborate with analytics teams to deliver business-ready datasets for reporting and AI-driven use cases.
- Implement best practices for metadata and master data management, ensuring data lineage, quality, and governance.
- Develop and support real-time and batch processing frameworks using Apache Spark.
- Integrate and support visualization solutions using Power BI for business stakeholders.
- Partner with data science and AI teams to enable AI/ML-powered analytics solutions.
- Ensure adherence to data security, compliance, and governance standards.

Qualifications
- 5–8 years of hands-on experience as a Data Engineer or in a similar role.
- Strong Python coding expertise for data processing and automation.
- Proven experience with Azure Synapse, Microsoft Fabric, and Databricks in enterprise environments.
- Hands-on expertise with Apache Spark, distributed data processing, and performance optimization.
- Experience in cost estimation, monitoring, and optimization of cloud-based pipelines.
- Proficiency in Power BI and data visualization best practices.
- Strong knowledge of metadata management, master data management, and data governance frameworks.
- Exposure to AI-driven analytics and integration with ML/GenAI workflows.
- Solid understanding of data modeling, data quality, and data integration principles.
- Familiarity with CI/CD pipelines for data engineering (Azure DevOps, GitHub Actions, etc.).
- Experience with API-driven data ingestion and workflow orchestration tools.
- Knowledge of responsible AI practices and explainability frameworks.
- Strong problem-solving, communication, and stakeholder management skills.