
ML Ops Engineer
- Bangalore, Karnataka
- Permanent
- Full-time
AI/ML: AWS Bedrock, SageMaker, Athena, Python, LangGraph, LangChain, Streamlit, Claude, Kubernetes
ML Infra: FAISS, Neo4j, MLflow, Docker, CI/CD, Airflow, Vector DBs, Graph DBs
What You'll Do
- Productionize ML Models - Deploy models (ML & GenAI) into robust production environments using modern ML infrastructure and MLOps practices.
- Optimize for Scale & Performance - Build scalable, low-latency ML services and APIs with observability, testing, and failover mechanisms.
- Pipeline Automation - Design and implement automated training, testing, and deployment pipelines using tools like SageMaker Pipelines, Airflow, and MLflow.
- Model Monitoring & Maintenance - Implement monitoring for model drift, data quality, and performance metrics. Own retraining and rollback strategies.
- Partner with Scientists & Engineers - Collaborate with data scientists to take notebooks to production, and with software engineers to integrate ML into customer-facing systems.
- Champion Best Practices - Define best practices for the ML development lifecycle, including CI/CD for models, reproducibility, and secure deployment.
- 5+ years of experience in machine learning engineering or applied ML development
- Proven experience deploying ML models to production, maintaining APIs, and building CI/CD pipelines for ML
- Strong foundations in data engineering: ETL, batch/stream processing, and data quality practices
- Hands-on experience with MLOps tools like MLflow, SageMaker, Airflow, or similar
- Proficiency in Python and SQL; familiarity with Java or Spark is a plus
- Experience with infrastructure-as-code (e.g., Terraform) and container orchestration (Kubernetes)
- Familiarity with model monitoring, experimentation, and continuous training workflows
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical discipline
- Experience with GenAI deployment (Bedrock, LangChain, Claude, etc.)
- Familiarity with vector databases (FAISS, Pinecone) and graph databases (Neo4j)
- Exposure to A/B testing or online experimentation platforms
- Understanding of privacy, security, and governance in ML deployments
- Competitive compensation, corporate bonus program and performance rewards, company equity and retirement programs
- Medical insurance
- Generous, flexible time off
- Paid holidays, “wellness” days, and a company-wide end-of-year break
- 6 months of fully paid parental leave
- Learning & Development stipend
- Opportunities to volunteer and give back, including charitable donation match
- Free resources and support for your mental wellbeing