
Associate Software Engineer II, Python, Big Data, Hadoop, PySpark
- Noida, Uttar Pradesh
- Permanent
- Full-time
- Work under the supervision of Senior Data Engineers to gather requirements and create data models for Data Science & Business Intelligence projects
- Engage in client communications for all important functions, including data understanding/exploration and strategizing solutions
- Document metadata about the data sources used in the project and present it to team members during team meetings
- Develop Data Marts, De-normalized views & Data Models for projects
- Develop Data Quality control processes around the data sets used for analysis
- Create, analyze, and optimize complex SQL queries
- Lead and drive knowledge-sharing sessions within the team
- Work with Senior team members to develop new capabilities for the team
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so.
- Bachelor's or 4-year university degree
- 3+ years of experience
- Good understanding of the Python programming language
- Understanding of Big Data, Hadoop, PySpark, Distributed or Parallel Processing, and MapReduce
- Good Knowledge of Databricks and Snowflake
- Knowledge of or experience with Cloud Technologies - Azure, AWS, or GCP
- Understanding of the Relational Database Model and Entity-Relationship diagrams
- Good knowledge of Relational Databases - SQL Server, Oracle, Teradata
- Knowledge of Orchestration tools - Airflow, Data Factory, Databricks Workflows or Jobs
- Configuration Management - GitHub
- Relevant Databricks Certifications
- Knowledge of or experience with message queues - Kafka, ActiveMQ, or RabbitMQ
- Knowledge of or experience with CI or CD tools - GitHub Actions
- Knowledge of or experience in Unix shell scripting for automating and scheduling batch jobs
- Knowledge of or experience using Microsoft Excel and PowerPoint
- Knowledge of Agile or Scrum