
Sr. Data Scientist
- Chennai, Tamil Nadu
- Permanent
- Full-time
- Machine Learning Solution Development:
- Design, develop, and implement advanced machine learning models (supervised and unsupervised) to solve complex IT Operations problems, including Event Correlation, Anomaly Detection, Root Cause Analysis, Predictive Analytics, and Auto-Remediation.
- Leverage structured and unstructured datasets, performing extensive feature engineering and data preprocessing to optimize model performance.
- Apply strong statistical modeling, hypothesis testing, and experimental design principles to ensure rigorous model validation and reliable insights.
- AI/ML Product & Platform Development:
- Lead the end-to-end development of Data Science products, from conceptualization and prototyping to deployment and maintenance.
- Develop and deploy AI Agents for automating workflows in IT operations, particularly within the Networks and Cybersecurity domains.
- Implement Retrieval-Augmented Generation (RAG) frameworks for state-of-the-art models to enhance contextual understanding and response generation.
- Use AI to detect and redact sensitive data in logs, and implement centralized data tagging for all logs to improve AI model performance and governance.
- MLOps & Deployment:
- Drive the operationalization of machine learning models through robust MLOps/LLMOps practices, ensuring scalability, reliability, and maintainability.
- Implement models as a service via APIs, utilizing containerization technologies (Docker, Kubernetes) for efficient deployment and management.
- Design, build, and automate resilient Data Pipelines in cloud environments (GCP/Azure) using AI Agents and relevant cloud services.
- Cloud & DevOps Integration:
- Integrate data science solutions with existing IT infrastructure and AIOps platforms (e.g., IBM Cloud Paks, Moogsoft, BigPanda, Dynatrace).
- Enable and optimize AIOps features within Data Analytics tools, Monitoring tools, or dedicated AIOps platforms.
- Champion DevOps practices, including CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions), infrastructure-as-code (Terraform, Ansible, CloudFormation), and automation to streamline development and deployment workflows.
- Performance & Reliability:
- Monitor and optimize platform performance, ensuring systems are running efficiently and meeting defined Service Level Agreements (SLAs).
- Lead incident management efforts related to data science systems and implement continuous improvements to enhance reliability and resilience.
- Leadership & Collaboration:
- Translate complex business problems into data science solutions, understanding their strategic implications and potential business value.
- Collaborate effectively with cross-functional teams including engineering, product management, and operations to define project scope, requirements, and success metrics.
- Mentor junior data scientists and engineers, fostering a culture of technical excellence, continuous learning, and innovation.
- Clearly articulate complex technical concepts, findings, and recommendations to both technical and non-technical audiences, influencing decision-making and driving actionable outcomes.
- Best Practices:
- Uphold best engineering practices, including rigorous code reviews, comprehensive testing, and thorough documentation.
- Maintain a strong focus on building maintainable, scalable, and secure systems.
- Education:
- Bachelor's or Master's degree in Computer Science, Data Science, Artificial Intelligence, Machine Learning, Statistics, or a related quantitative field.
- Experience:
- 8+ years of experience in IT and 5+ years of progressive experience as a Data Scientist, with a significant focus on applying ML/AI in IT Operations, AIOps, or a related domain.
- Proven track record of building and deploying machine learning models into production environments.
- Demonstrated experience with MLOps/LLMOps principles and tools.
- Experience with designing and implementing microservices and serverless architectures.
- Hands-on experience with containerization technologies (Docker, Kubernetes).
- Technical Skills:
- Programming: Proficiency in at least one major programming language, preferably Python, sufficient to effectively communicate with and guide engineering teams (Java is also a plus).
- Machine Learning: Strong theoretical and practical understanding of various ML algorithms (e.g., classification, regression, clustering, time-series analysis, deep learning) and their application to IT operational data.
- Cloud Platforms:
- Expertise with Google Cloud Platform (GCP) services is highly preferred, including Dataflow, Pub/Sub, Cloud Logging, Compute Engine, Kubernetes Engine, Cloud Functions, BigQuery, Cloud Storage, and Vertex AI.
- Experience with other major cloud providers (AWS, Azure) is also valuable.
- DevOps & Tools:
- Experience with CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions).
- Familiarity with infrastructure-as-code tools (e.g., Terraform, Ansible, CloudFormation).
- AIOps/Observability:
- Knowledge of AIOps platforms such as IBM Cloud Paks, Moogsoft, BigPanda, and Dynatrace.
- Experience with log analytics platforms and data tagging strategies.
- Soft Skills:
- Exceptional analytical and problem-solving skills, with a track record of tackling ambiguous and complex challenges independently.
- Strong communication and presentation skills, with the ability to articulate complex technical concepts and findings to diverse audiences and influence stakeholders.
- Ability to take end-to-end ownership of data science projects.
- Commitment to best engineering practices, including code reviews, testing, and documentation.
- A strong desire to stay current with the latest advancements in AI, ML, and cloud technologies.