
Data Engineer
- Bangalore, Karnataka
- Permanent
- Full-time
- Design, build, and maintain scalable data pipelines that facilitate the flow of data into machine learning models.
- Collaborate with data scientists to ensure the availability of high-quality, preprocessed data for model training.
- Deploy machine learning models to production using tools such as TensorFlow Serving, MLflow, or Seldon.
- Monitor and maintain the performance of machine learning models in production, identifying and addressing issues like data drift or concept drift.
- Work with large datasets, both batch and real-time, using frameworks like Apache Spark, Apache Kafka, and AWS Glue.
- Implement and manage data versioning and experiment tracking using tools like DVC and MLflow.
- Ensure data integrity and quality through validation and profiling techniques.
- Proficiency in Python and SQL; experience with Scala or Java is a plus.
- Strong experience with Apache Spark, Apache Kafka, and other data processing frameworks.
- Experience deploying machine learning models using TensorFlow Serving, MLflow, or similar tools.
- Familiarity with data lakes and data warehouses (e.g., AWS S3, Google BigQuery, Snowflake).
- Experience with cloud platforms (AWS, GCP, Azure) and containerization tools like Docker and Kubernetes.
- Understanding of machine learning workflows and experience collaborating with data scientists and ML engineers.
- Strong knowledge of ETL processes, batch and real-time data processing, and orchestration tools (e.g., Apache Airflow).
- 1+ years of relevant industry experience (Healthcare, Pharmaceutical Consulting, Enterprise level data-analytical solutio