Scientist 5, Data Science (12+ years experience; Generative AI, interpretable ML, Statistics)
Western Digital
- Bangalore, Karnataka
- Permanent
- Full-time
We’re hiring a hands-on technologist to lead the design, delivery and operational scaling of mission-critical AI/ML systems. This is a senior technical leadership role, combining end-to-end system design, deep ML engineering, research translation and team enablement. You will set technical direction, unblock engineering teams, and take direct ownership of measurable business outcomes.
Core responsibilities
- Own end-to-end system design and architecture for production ML/AI solutions — from problem framing, data design, model selection and infra to monitoring, runbooks and cost control.
- Lead hands-on technical delivery: prototype, validate, harden and ship models and agentic components into live systems; ensure reliability, observability, and automated CI/CD for models and data pipelines.
- Act as the single technical escalation point — remove blockers, resolve cross-team technical tradeoffs, and make final architecture decisions that balance performance, scalability, cost and vendor dependencies.
- Mentor and grow engineering teams (ML engineers, data engineers, OR engineers), set engineering standards, run code/architecture reviews and champion best practices (MLOps, testing, data contracts).
- Translate research into product: evaluate papers, run experiments, lead IP efforts (patents, trade secrets) and supervise research-to-production pipelines.
- Drive multiple projects in parallel with clear prioritization, milestones and delivery SLAs; align technical plans to business KPIs.
- Define, track and report success metrics and ROI for all ML initiatives; continuously tune models and design experiments for measurable impact.
- Collaborate with product, platform, security, legal and operations teams to ensure compliance, data privacy and safe, explainable model behavior.
- Work closely with Product, Platform, Security, Legal and Business stakeholders.
- Become a visible technical leader within the company — represent technical strategy externally when needed.
- 12+ years of experience in Machine Learning/AI engineering and solution delivery, across classical ML, generative models and agentic/LLM-based systems.
- Proven ability to design production ML platforms (data ingestion → training → serving → monitoring → retraining) with scalability, reliability and cost awareness.
- Deep expertise in system & distributed design: data architectures, feature stores, model serving, streaming/batch pipelines, autoscaling, retries/poison-pill handling and disaster recovery.
- Strong MLOps and DevOps experience: CI/CD for models, monitoring (data + model drift), A/B testing, canary deployment and rollback strategies.
- Strong practical experience in classical ML, deep learning and modern LLM/agentic systems (RAG, fine-tuning, evaluation, guardrails) using Python and modern ML frameworks (PyTorch/TensorFlow).
- Experience with CI/CD for models, containerization (Docker/Kubernetes), model serving, monitoring, drift detection and automated retraining pipelines.
- Strong coding and service design (APIs/microservices), testing practices, high-availability design, observability and incident handling for live AI systems.
- Experience mentoring/leading senior engineers and small cross-functional teams; comfortable as the technical owner across several concurrent initiatives.
- Prior experience publishing research and participating in IP creation (patent filings, trade secrets) is required.
- Excellent communication skills — able to present technical tradeoffs to both engineering and executive stakeholders.
- Background in reinforcement learning, foundation models, LLMs and agent orchestration frameworks.
- Hands-on experience with cloud platforms (AWS/GCP) and on-prem hybrid deployments.
- Strong software engineering fundamentals: scalable microservices, API design, security best practices, and cost optimization.
- Familiarity with optimization/OR techniques and integrating them with ML pipelines.
- Lead architecture reviews and design sessions; produce clear system diagrams, component ownership, latency/capacity budgets, cost estimations and failure-mode analyses.
- Define data contracts, SLAs, service-level objectives and monitoring thresholds for every deliverable.
- Ensure designs are modular, testable and observable — with clear automation for deployment, rollback and incident response.
- Make pragmatic architecture choices: prefer simpler solutions that meet business needs and constrain cost/dependencies; justify when heavy engineering is necessary.