
Azure Data Architect
- Bangalore, Karnataka
- Permanent
- Full-time
- Architect, design, and implement large-scale data pipelines using Spark (batch and streaming).
- Optimize Spark jobs for performance, cost-efficiency, and scalability.
- Define and implement enterprise data architecture standards and best practices.
- Guide the transition from traditional ETL platforms to Spark-based solutions.
- Lead the integration of Spark-based pipelines into cloud platforms (Azure Fabric/Spark pools).
- Establish and enforce data architecture standards, including governance, lineage, and quality.
- Mentor data engineering teams on best practices with Spark (e.g., partitioning, caching, join strategies).
- Implement and manage CI/CD pipelines for Spark workloads using tools like Git or Azure DevOps.
- Ensure robust monitoring, alerting, and logging for Spark applications.
- 10+ years of experience in data engineering, with 7+ years of hands-on experience with Apache Spark (PySpark/Scala).
- Proficiency in Spark optimization techniques, monitoring, caching, advanced SQL, and distributed data design.
- Experience with Spark on Databricks and Azure Fabric.
- Solid understanding of Delta Lake, Spark Structured Streaming, and data pipelines.
- Strong experience with cloud platforms (Azure).
- Proven ability to handle large-scale datasets (terabytes to petabytes).
- Familiarity with data lakehouse architectures, schema evolution, and data governance.
- 3+ years of experience with Power BI.
- Experience implementing real-time analytics using Spark Streaming or Structured Streaming.
- Certifications in Databricks, Fabric, or Spark are a plus.