Databricks Data Engineer
Hitachi Solutions
- Pune, Maharashtra
- Permanent
- Full-time
- We recognize that our profitability and project success come from our team: great people doing great things. As such, we pursue profitable growth and expanded opportunities for our team.
- We offer challenging and diverse work across multiple industries and reward creativity and entrepreneurial innovation.
- We respect, encourage, and support each individual's need to continually learn and grow personally and professionally. We are committed to fostering our people.
- We listen. Every employee has something important to say that can contribute to enriching our environment.
- We compensate fairly. And while employees might come for the paycheck, they stay for the people. Our people are the reason we are exceptional. This is something we never forget.
- Collaborate: Work collaboratively with other engineers to architect and implement complex systems with minimal oversight, while partnering with team leadership to identify how best to improve and expand platform capabilities.
- Build & Design: Design, develop, and maintain complex data pipeline products that support business-critical operations and large-scale analytics applications.
- Support the Team: Partner with analytics, data science, and engineering teams to understand and solve their unique data needs and challenges.
- Continuous Learning: Dedicate time to staying current with the latest developments in the space and embrace new concepts to keep up with fast-moving data engineering technology.
- Autonomy: Enjoy a role that offers strong independence and autonomy while contributing to the technical maturity of the organization.
- Experience: 5+ years of Data Engineering experience, including 4+ years designing and building Databricks data pipelines is essential. While Azure cloud experience is preferred, we are happy to consider experience with AWS, GCP, or other cloud platforms.
- Technical Stack:
- 4+ years of hands-on experience with Python, PySpark, or Spark SQL is key. Experience with Scala is a plus.
- 4+ years of experience with big data pipeline or DAG tools (such as Airflow, dbt, Data Factory, or similar).
- 4+ years of Spark experience, especially with Databricks Spark and Delta Lake.
- 4+ years of hands-on experience implementing Big Data solutions in a cloud ecosystem, including Data/Delta Lakes.
- 5+ years of relevant software development experience, including Python 3.x, the Django/Flask frameworks, FastAPI (or another industry-standard API framework), and relational databases (SQL/ORM).
- Additional Skills (Great to have):
- Experience with Microsoft Fabric, specifically PySpark on Fabric and Fabric Pipelines.
- Experience with conceptual, logical, and/or physical database designs.
- Strong SQL experience, specifically writing complex, highly optimized queries across large volumes of data.
- Strong data modeling/profiling capabilities using Kimball/star schema methodology as well as medallion architecture.
- Professional experience with Kafka, Event Hubs, or other live streaming technologies.
- Familiarity with database deployment pipelines (e.g., DACPAC or similar).
- Experience with unit testing or data quality frameworks.