
Senior Data Engineer (Python, Microsoft Fabric, PySpark)
- Hyderabad, Telangana
- Permanent
- Full-time

Responsibilities
- Lead the design and implementation of scalable data infrastructure and pipelines in collaboration with IT, enabling secure, high-performance access to Microsoft Fabric lakehouses in support of reporting, advanced analytics, and machine learning use cases.
- Collaborate with analysts and stakeholders to translate business questions and model requirements into structured data workflows and repeatable, automated processes.
- Implement and manage data ingestion and transformation workflows across diverse sources using tools such as Python, PySpark, SQL, Azure Data Factory, and Microsoft Fabric.
- Enable batch model scoring pipelines, manage model artifacts and outputs, and ensure timely delivery of results to reporting environments (e.g., Power BI datasets).
- Ensure data quality, lineage, and governance, implementing validation rules and monitoring to support trusted analytics and reproducibility.
- Act as a technical advisor and partner to the analytics team, helping define data requirements and optimize model performance through better data design and availability.
- Continuously evaluate emerging data technologies and practices, recommending improvements to infrastructure, tooling, and processes that enhance analytical agility and scale.

Qualifications
- 5-8 years of experience in data engineering, data architecture, or analytics infrastructure roles.
- Proven track record of designing and deploying scalable data pipelines and structured data assets in modern cloud environments.
- Hands-on experience managing data pipelines for machine learning, including support for model deployment and scoring workflows.
- Experience working cross-functionally with business, analytics, and product teams to align data capabilities with strategic needs.
- Familiarity with customer analytics concepts, including segmentation, churn, and lifecycle metrics.
- Proficient in Python for building and automating data workflows, including data cleansing and writing outputs to cloud storage.
- Expertise in SQL for data extraction, transformation, and performance tuning.
- Experience with semantic modeling (e.g., using DAX, Power BI datasets) to support self-service analytics.
- Understanding of data warehouse design principles, including star/snowflake schemas and slowly changing dimensions.
- Exposure to CRM systems (e.g., Salesforce) and version control tools (e.g., Git).
- Familiarity with MLOps workflows, including model versioning, batch scoring, and result storage.
- Bachelor's degree in a quantitative or technical field (e.g., Computer Science, Data Engineering, Information Systems, Data Science, or related disciplines).
- Master's degree in Data Engineering, Computer Science, or a related field preferred.