Data Engineer

Virtusa

  • Chennai, Tamil Nadu
  • Permanent
  • Full-time
  • 6 days ago
Responsibilities

  • Design, develop, and optimize data pipelines using Databricks, PySpark, and Spark SQL.
  • Build and deploy machine learning models using AWS SageMaker.
  • Collaborate with data scientists, analysts, and business stakeholders to understand requirements and deliver solutions.
  • Perform data wrangling, feature engineering, and model evaluation.
  • Monitor and maintain production-grade ML workflows and pipelines.
  • Ensure data quality, security, and compliance across all stages of the pipeline.
  • Document processes and contribute to best practices in data engineering and ML operations.

Required Skills

  • Strong proficiency in PySpark, Spark SQL, and Databricks notebooks.
  • Experience with AWS SageMaker, including model training, tuning, and deployment.
  • Solid understanding of data lake architectures, ETL processes, and cloud computing (AWS).
  • Familiarity with ML algorithms, model evaluation techniques, and MLOps practices.
  • Proficiency in Python and SQL.
  • Experience with Git, CI/CD pipelines, and workflow orchestration tools (e.g., Airflow).
  • Databricks (PySpark + Spark SQL) and/or AWS SageMaker, at a minimum.
