
PySpark Scala Developer - Consultant / Senior Consultant
- Bangalore, Karnataka
- Permanent
- Full-time
Responsibilities:
- Design and develop scalable data pipelines using Apache Spark and Scala
- Optimize and troubleshoot Spark jobs for performance, e.g. memory management, shuffles, and data skew (see the sketch after this list)
- Work with massive datasets in on-prem Hadoop clusters or on cloud platforms such as AWS, GCP, or Azure
- Write clean, modular Scala code using functional programming principles
- Collaborate with data teams to integrate with platforms like Snowflake, Databricks, or data lakes
- Ensure code quality, documentation, and CI/CD practices are followed
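A minimal sketch of the kind of shuffle and skew tuning this role involves: salting a skewed join key so that hot keys spread across partitions. The table paths, the `userId` column, and the bucket count are illustrative assumptions, not part of this posting.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object SkewedJoinSalting {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("skewed-join-salting")
      .getOrCreate()

    val saltBuckets = 16 // hypothetical; tuned to the observed skew

    // Hypothetical inputs: a large fact table skewed on userId and a
    // smaller dimension table keyed by userId.
    val events = spark.read.parquet("/data/events")
    val users  = spark.read.parquet("/data/users")

    // Salt the skewed side: append a random bucket to the join key so a
    // single hot userId is split across saltBuckets shuffle partitions.
    val saltedEvents = events
      .withColumn("salt", (rand() * saltBuckets).cast("int"))

    // Replicate the dimension side so every salt value finds a match.
    val saltedUsers = users
      .withColumn("salt", explode(array((0 until saltBuckets).map(i => lit(i)): _*)))

    // Joining on (userId, salt) distributes the hot keys evenly.
    val joined = saltedEvents
      .join(saltedUsers, Seq("userId", "salt"))
      .drop("salt")

    joined.write.mode("overwrite").parquet("/data/events_joined")
    spark.stop()
  }
}
```

Salting trades a larger dimension side (replicated saltBuckets times) for an even shuffle, so the bucket count is tuned against the observed key distribution.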
Requirements:
- 3+ years of experience with Apache Spark in Scala
- Deep understanding of Spark internals: the DAG, stages, tasks, caching, joins, partitioning
- Hands-on experience with performance tuning in production Spark jobs
- Proficiency in Scala functional programming, e.g. immutability, higher-order functions, Option/Either (see the sketch after this list)
- Proficiency in SQL
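To make the functional-programming bullet concrete, here is a small, self-contained Scala sketch touching the named concepts: immutability, higher-order functions, and Option/Either. The parsing domain is purely illustrative.

```scala
object FpBasics {
  // Immutability: case classes are immutable values; "updates" return copies.
  final case class User(name: String, age: Int)

  // Option models a possibly-missing value without nulls.
  // (toIntOption requires Scala 2.13+.)
  def parseAge(raw: String): Option[Int] =
    raw.toIntOption.filter(_ >= 0)

  // Either keeps an error description on the Left instead of throwing.
  def parseUser(name: String, rawAge: String): Either[String, User] =
    parseAge(rawAge)
      .toRight(s"invalid age: '$rawAge'")
      .map(age => User(name, age))

  // A higher-order function: the validation logic is passed in as a value.
  def parseAll(
      rows: List[(String, String)],
      parse: (String, String) => Either[String, User]
  ): List[Either[String, User]] =
    rows.map { case (name, age) => parse(name, age) }

  def main(args: Array[String]): Unit = {
    val rows = List(("Asha", "34"), ("Ravi", "not-a-number"))
    parseAll(rows, parseUser).foreach(println)
    // Right(User(Asha,34))
    // Left(invalid age: 'not-a-number')
  }
}
```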
Nice to have:
- Experience with any major cloud platform: AWS, Azure, or GCP
- Experience working with Databricks, Snowflake, or Delta Lake
- Exposure to data pipeline tools like Airflow, Kafka, Glue, or BigQuery
- Familiarity with CI/CD pipelines and Git-based workflows
- Comfortable with SQL optimization and schema design in distributed environments (see the sketch below)
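As one concrete instance of SQL optimization and schema design in a distributed setting, the sketch below shows date-partitioned storage enabling partition pruning in Spark SQL. All paths and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object PartitionPruning {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-pruning")
      .getOrCreate()

    // Schema design choice: write the table partitioned by event_date so
    // the engine can skip whole directories when queries filter on it.
    spark.read.parquet("/data/raw_events")
      .write
      .partitionBy("event_date")
      .mode("overwrite")
      .parquet("/data/events_by_date")

    spark.read.parquet("/data/events_by_date")
      .createOrReplaceTempView("events")

    // The equality filter on the partition column becomes partition
    // pruning: only the matching event_date directory is scanned.
    val daily = spark.sql(
      """SELECT user_id, COUNT(*) AS event_count
        |FROM events
        |WHERE event_date = '2024-01-15'
        |GROUP BY user_id""".stripMargin)

    daily.explain() // the scan's PartitionFilters should show event_date
    spark.stop()
  }
}
```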