Senior Data Engineer
Stolt-Nielsen Digital Innovation Centre
- Hyderabad, Telangana
- Permanent
- Full-time
You'll work closely with analysts, data scientists, and business stakeholders to turn raw data into trusted, accessible, and actionable insights.

At Stolt-Nielsen, we take Digital seriously. We invest in our teams through training and mentoring, and we enable them with the tools required to work in a modern way (engineering laptops, Cursor licenses). As we continue to build up our engineering practice, there is ample room for you to contribute, take initiative, and shape the future of our ways of working and our technology landscape.

What You'll Do:
- Design, develop, and maintain scalable data pipelines in Databricks using PySpark and Delta Lake.
- Build robust ETL/ELT processes to ingest data from multiple sources (APIs, databases, files, cloud systems).
- Optimize data models and storage for analytics, machine learning, and BI reporting.
- Write and maintain SQL queries, views, and stored procedures for data transformation and analysis.
- Collaborate with cross-functional teams to understand business needs and translate them into technical solutions.
- Implement and enforce data quality, governance, and security standards across pipelines and platforms.
- Monitor and troubleshoot data workflows for performance, reliability, and cost efficiency.
- Stay current with best practices in data engineering, Databricks, and cloud-native data architectures.
What You'll Bring:
- Bachelor's or Master's degree in Computer Science or a related field.
- 5+ years of hands-on experience as a Data Engineer, ETL Developer, or in a similar role.
- Strong proficiency in Python (especially PySpark and pandas).
- Excellent command of SQL for data manipulation, performance tuning, and debugging.
- Hands-on experience with Databricks (workspace, notebooks, jobs, clusters, and Unity Catalog).
- Solid understanding of data warehousing concepts, data modeling (e.g., Kimball dimensional modelling, medallion architecture), and distributed data processing.
- Experience with the Azure PaaS ecosystem (data and storage services).
- Strong analytical mindset and attention to detail.
- Effective communication skills and ability to work cross-functionally.
- Experience in CI/CD pipelines, automated testing, and Agile delivery.
Key Technologies:
- Databricks (PySpark, Delta Lake, Jobs, Workflows)
- Python/PySpark (data processing and automation)
- SQL (advanced querying and performance tuning)
- Data pipeline orchestration (e.g., Databricks Workflows or similar tools)
- Version control (Git) and CI/CD familiarity
- Data modeling and transformation design
- Unity Catalog
Nice to Have:
- Familiarity with dbt or other modern data stack tools.
- Knowledge of data governance, cataloging, and access control (e.g., Unity Catalog, Purview).
- Exposure to machine learning workflows and ML model data preparation.
- Strong background in performance tuning and cost optimization.