
MLOps Engineer
- Pune, Maharashtra
- Permanent
- Full-time
- CI/CD Pipeline Setup: Design, implement, and maintain CI/CD pipelines for deploying ML systems using tools like Jenkins, GitHub Actions, or GitLab CI.
- Performance & Reliability Monitoring: Monitor and optimize the performance, scalability, and reliability of ML systems.
- Infrastructure Scaling: Scale infrastructure to support AI workflows efficiently across multiple environments.
- Observability & Monitoring: Implement and manage observability tools (Prometheus, Grafana, ELK Stack) for real-time monitoring and alerting.
- Vector Database Infrastructure: Set up and manage infrastructure for vector databases to support AI-driven applications.
- Focuses on the machine learning infrastructure, model deployment, and MLOps practices for AI systems.
- Responsible for designing and implementing end-to-end machine learning pipelines and for automating model deployment, monitoring, and versioning across development, staging, and production environments, using tools like Kubernetes, Docker, MLflow, n8n, and Azure cloud platforms.
- CI/CD Expertise: Strong experience with Jenkins, GitHub Actions, GitLab CI.
- Scripting: Proficiency in Python and Bash for automation of deployment, scaling, and maintenance tasks.
- Containerization & Orchestration: Hands-on experience with Docker, Kubernetes, and Helm charts.
- Infrastructure as Code (IaC): Experience with Terraform, Ansible, or CloudFormation for automated infrastructure provisioning and version control.
- Monitoring & Observability: Familiarity with Prometheus, Grafana, ELK Stack for system health and performance tracking.
- Cloud Platforms: Proficient in AWS, GCP, or Azure for provisioning and scaling compute, storage, and networking resources.
Preferred Qualifications
- Experience with ML systems or AI/ML infrastructure.
- Knowledge of vector databases (e.g., Pinecone, Weaviate, Milvus).
- Strong problem-solving and troubleshooting skills in distributed systems.
- Programming: Python, Shell scripting, YAML
- Containerization: Docker, Kubernetes, Helm
- Cloud Platforms: Azure ML
- CI/CD: GitHub Actions, Azure DevOps
- ML Tools: MLflow, Kubeflow
- Monitoring: Grafana, Azure Monitor
- Orchestration: Apache Airflow, n8n, Azure Data Factory
- Experience: 3+ years in DevOps/Infrastructure, 2+ years in ML systems