
Lead Data Engineer
- Bangalore, Karnataka
- Permanent
- Full-time
- Design and build robust, scalable data ingestion pipelines using Microsoft Fabric (Pipelines, Dataflows, Notebooks) to integrate data from Business Applications and external APIs.
- Perform deep source system analysis to define ingestion strategies that ensure data reliability, consistency, and observability, while applying metadata-driven design for automation.
- Develop and maintain Delta Tables using the medallion architecture (bronze/silver/gold) to systematically cleanse, enrich, and standardize data for downstream consumption.
- Implement comprehensive data quality checks (nulls, duplicates, schema drift, outliers, SCD types) and ensure data integrity across all transformation layers in the Lakehouse.
- Apply governance practices including schema versioning, data lineage tracking, role-based access control (RBAC), and audit trails to ensure compliance, traceability, and secure data access.
- Build semantic models and define business-aligned KPIs to support self-service analytics and dashboarding in Power BI and other BI platforms.
- Structure the gold layer and semantic model to support AI/ML use cases, ensuring datasets are enriched, contextualized, and optimized for AI agent consumption.
- Develop and maintain AI-ready data flows and access patterns to enable seamless integration between the Lakehouse and AI agents for tasks such as prediction, summarization, and decision automation.
- Implement DevOps best practices for pipeline versioning, testing, deployment, and monitoring; proactively detect and resolve data integration and processing issues.
- Deep expertise in data engineering with hands-on experience in designing and implementing large-scale data platforms, including data warehouses, lakehouses, and modern ETL/ELT pipelines.
- Proven ability to build, deploy, and troubleshoot highly reliable, distributed data pipelines integrating structured and unstructured data from various internal systems and external sources.
- Strong technical foundation in data modeling, database architecture, and data transformation techniques using medallion architecture (bronze/silver/gold layers) within Microsoft Fabric or similar platforms.
- Solid understanding of data lakehouse patterns and Delta Lake / OneLake concepts, with the ability to structure data models that are AI/ML-ready and support semantic modeling.
- Experience implementing data quality frameworks including checks for nulls, duplicates, schema drift, outliers, and slowly changing dimensions (SCD types).
- Familiarity with data governance, including schema versioning, data lineage, access controls (RBAC), and audit logging to ensure secure and compliant data practices.
- Working knowledge of data visualization tools such as Power BI with the ability to support and optimize semantic layers and KPI definitions.
- Strong communication and collaboration skills, with the ability to articulate complex data engineering solutions to both technical and non-technical stakeholders, and to lead cross-functional initiatives.
- Experience with DevOps practices, including version control, CI/CD pipelines, environment management, and performance monitoring in a data engineering context.
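To illustrate the bronze-to-silver cleansing step in the medallion architecture described above, here is a minimal, framework-agnostic sketch. The field names (`order_id`, `customer`, `amount`) and cleansing rules are assumptions for illustration; in Microsoft Fabric this logic would typically live in a Notebook writing Delta Tables rather than operating on in-memory lists:

```python
def bronze_to_silver(bronze: list[dict]) -> list[dict]:
    """Cleanse and standardize raw (bronze) records into a silver layer:
    trim strings, normalize casing, coerce amounts, drop unkeyed rows."""
    silver = []
    for rec in bronze:
        if rec.get("order_id") is None:
            continue  # silver requires a business key
        silver.append({
            "order_id": str(rec["order_id"]).strip(),
            "customer": str(rec.get("customer", "")).strip().title(),
            # Coerce amount to float; default to 0.0 for missing/bad values.
            "amount": _to_float(rec.get("amount")),
        })
    return silver

def _to_float(value) -> float:
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0
```

The same pattern scales to PySpark DataFrames, where each rule becomes a column expression rather than a per-record loop.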
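The data quality checks called for above (nulls, duplicates, schema drift) can be sketched with a small stdlib-only sweep. The record layout and report shape are assumptions for illustration; a production pipeline would run equivalent checks between transformation layers in the Lakehouse:

```python
from typing import Any

def run_quality_checks(records: list[dict[str, Any]],
                       expected_schema: set[str],
                       key_field: str) -> dict[str, Any]:
    """Minimal data-quality sweep: null fields, duplicate business keys,
    and schema drift against an expected field contract."""
    report = {"null_fields": [], "duplicate_keys": [], "schema_drift": []}
    seen_keys = set()
    for i, rec in enumerate(records):
        # Schema drift: fields added or dropped relative to the contract.
        drift = set(rec) ^ expected_schema
        if drift:
            report["schema_drift"].append((i, sorted(drift)))
        # Null checks on every expected field present in the record.
        for field in expected_schema & set(rec):
            if rec[field] is None:
                report["null_fields"].append((i, field))
        # Duplicate business keys.
        key = rec.get(key_field)
        if key in seen_keys:
            report["duplicate_keys"].append(key)
        seen_keys.add(key)
    return report
```

A non-empty report would typically quarantine the offending rows rather than fail the whole load.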
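The slowly changing dimension (SCD) handling mentioned above can be sketched as a Type 2 upsert: close the current row on change and append a new version. The column names (`valid_from`, `valid_to`, `is_current`) are illustrative assumptions; a Fabric implementation would more likely use a Delta Lake `MERGE` on the dimension table:

```python
from datetime import date

def scd2_upsert(dim: list[dict], incoming: dict, key: str,
                as_of: date) -> list[dict]:
    """Type 2 slowly changing dimension: expire the current row on any
    attribute change and append a new current version, keeping history."""
    tracked = [c for c in incoming if c != key]
    out, matched = [], False
    for row in dim:
        if row[key] == incoming[key] and row["is_current"]:
            matched = True
            if any(row.get(c) != incoming[c] for c in tracked):
                # Expire the old version...
                out.append({**row, "valid_to": as_of, "is_current": False})
                # ...and open a new current version.
                out.append({**incoming, "valid_from": as_of,
                            "valid_to": None, "is_current": True})
            else:
                out.append(row)  # no attribute change: keep as-is
        else:
            out.append(row)
    if not matched:
        # Brand-new key: insert as the first current version.
        out.append({**incoming, "valid_from": as_of,
                    "valid_to": None, "is_current": True})
    return out
```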
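Finally, the gold-layer rollup behind business-aligned KPIs can be sketched as a simple aggregation; the KPI names here are illustrative assumptions, and in practice these measures would be defined in a Power BI semantic model over gold Delta Tables:

```python
from collections import defaultdict

def gold_kpis(silver: list[dict]) -> dict[str, float]:
    """Gold-layer rollup: aggregate silver records into KPI measures.
    KPI names are illustrative, not taken from the posting."""
    revenue_by_customer = defaultdict(float)
    for rec in silver:
        revenue_by_customer[rec["customer"]] += rec["amount"]
    total = sum(revenue_by_customer.values())
    return {
        "total_revenue": total,
        "avg_revenue_per_customer":
            total / len(revenue_by_customer) if revenue_by_customer else 0.0,
    }
```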