EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multinational teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are seeking an experienced DevOps/AIOps Architect to design, architect, and implement an AI-driven operations solution that integrates various cloud-native services across AWS, Azure, and cloud-agnostic environments. The AIOps platform will be used for end-to-end machine learning lifecycle management, automated incident detection, and root cause analysis (RCA). The architect will lead efforts in developing a scalable solution utilizing data lakes, event streaming pipelines, ChatOps integration, and model deployment services. This platform will enable real-time intelligent operations in hybrid cloud and multi-cloud setups.

Responsibilities
Assist in the implementation and maintenance of cloud infrastructure and services
Contribute to the development and deployment of automation tools for cloud operations
Participate in monitoring and optimizing cloud resources using AIOps and MLOps techniques
Collaborate with cross-functional teams to troubleshoot and resolve cloud infrastructure issues
Support the design and implementation of scalable and reliable cloud architectures
Conduct research and evaluation of new cloud technologies and tools
Work on continuous improvement initiatives to enhance cloud operations efficiency and performance
Document cloud infrastructure configurations, processes, and procedures
Adhere to security best practices and compliance requirements in cloud operations

Requirements
Bachelor's degree in Computer Science, Engineering, or a related field
12+ years of experience in DevOps, AIOps, or cloud architecture roles
Hands-on experience with AWS services such as SageMaker, S3, Glue, Kinesis, ECS, and EKS
Strong experience with Azure services such as Azure Machine Learning, Blob Storage, Azure Event Hubs, and Azure Kubernetes Service (AKS)
Strong experience with Infrastructure as Code (IaC) tools such as Terraform and CloudFormation
Proficiency in container orchestration (e.g., Kubernetes) and experience with multi-cloud environments
Experience with machine learning model training, deployment, and data management across cloud-native and cloud-agnostic environments
Expertise in implementing ChatOps solutions using platforms like Microsoft Teams and Slack, and integrating them with AIOps automation
Familiarity with data lake architectures, data pipelines, and inference pipelines using event-driven architectures
Strong programming skills in Python for rule management, automation, and integration with cloud services

Nice to have
Certifications in the AI/ML/Gen AI space

We offer/Benefits
Opportunity to work on technical challenges that may have an impact across geographies
Vast opportunities for self-development: online university, global knowledge-sharing opportunities, and learning opportunities through external certifications
Opportunity to share your ideas on international platforms
Sponsored Tech Talks & Hackathons
Unlimited access to LinkedIn Learning solutions
Possibility to relocate to any EPAM office for short and long-term projects
Focused individual development
Benefit package:
Health benefits
Retirement benefits
Paid time off
Flexible benefits
Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)