PySpark, Big Data with GCP

Diverse Lynx

  • Chennai, Tamil Nadu; Gurgaon, Haryana
  • Permanent
  • Full-time
  • 2 months ago
Position - PySpark, Big Data with GCP
  • Ability to design and develop a high-performance data pipeline framework from scratch
  • Data ingestion across systems
  • Data quality and curation
  • Data transformation and efficient data storage
  • Data reconciliation, monitoring and controls
  • Support reporting model and other downstream application needs
  • Skill in technical design documentation, data modeling and application performance tuning
  • Lead and manage a team of data engineers, contribute to code reviews, and guide the team in designing and developing complex data pipelines that adhere to the defined standards.
  • Be hands-on: perform POCs on open-source and licensed tools in the market and share recommendations.
  • Provide technical leadership and contribute to definition, development, integration, testing, documentation and support across multiple platforms (GCP, Python, HANA)
  • Establish a consistent project management framework and develop processes to deliver high-quality software, in rapid iterations, for business partners in multiple geographies
  • Participate in a team that designs, develops, troubleshoots, and debugs software programs for databases, applications, tools, etc.
  • Experience in balancing production platform stability, feature delivery and reduction of technical debt across a broad landscape of technologies.
  • Skill in the following platforms, tools and technologies:
      • GCP: GCS, BigQuery, streaming (Pub/Sub), Dataproc and Dataflow
      • Python, PySpark, Kafka, SQL, shell scripting and stored procedures
      • Data warehouses, distributed data platforms and data lakes
      • Database definition, schema design, Looker views and models
      • CI/CD pipelines
  • Proven track record in scripting code in Python, PySpark and SQL
  • Excellent structured thinking skills, with the ability to break down multi-dimensional problems
  • Ability to navigate ambiguity and work in a fast-moving environment with multiple stakeholders
  • Good communication skills and the ability to coordinate and work with cross-functional teams.
