
Cloud Platform Operations Engineer
- Hyderabad, Telangana
- Permanent
- Full-time
- Hands-on knowledge of cloud operations and architecture in a multi-cloud environment, with advanced skills in one or more cloud platforms (AWS preferred, or Azure or GCP).
- Be able to use professional knowledge and problem determination / source identification skills to resolve problems involving cloud APIs, application services, IaaS, PaaS, micro-services, containers, middleware components, network, security, and infrastructure. If unable to resolve an issue, triage and route the incident to the appropriate level of support.
- Establish processes and methods to efficiently operate and manage a multi-cloud ecosystem with agility and velocity
- Work directly with engineering teams to ensure that customer issues are resolved as expeditiously as possible and that root causes are addressed using continuous improvement methodologies
- Collaborate with the Site Reliability Engineering (SRE) team to improve reliability of the environment through practical application and feedback
- Identify use cases for AIOps and automation, deploying scripts and tools for monitoring and proactive service delivery.
- Understand cloud usage costs and FinOps practices and deliver cloud cost optimization solutions.
- Performance Monitoring and Optimization: Monitor system performance, identify bottlenecks, and implement improvements to ensure optimal performance and scalability.
- Automation and Scripting: Develop and maintain automation scripts to streamline operations, improve efficiency, and reduce manual intervention.
- Security and Compliance: Ensure cloud infrastructure complies with security standards and regulatory requirements. Implement best practices for data protection and access control.
- Cost Optimization: Identify opportunities for cloud cost optimization and work with various teams to close those gaps.
- Continue to evolve the operations team as it grows and the cloud space matures.
- Demonstrated ability to think tactically and strategically about solutions to business, product, and technical challenges.
- Lead a team in supporting our business teams worldwide by providing critical product support
- Technical Support: Provide advanced technical support to resolve complex issues and ensure minimal disruption to services.
- Training and Mentorship: Train and mentor junior engineers, sharing knowledge and best practices to build a strong, capable team.
- Drive the team to improve operational efficiency for all services through the identification and development of SLAs, metrics, monitors, procedures, tools, and documentation.
- Experienced with the ITIL processes of Incident Management (including Critical Incident Management), Problem Management, Change Management, and Integrated Service Level Management.
- Incident, Change, and Service Request Management: Lead incident response efforts, perform root cause analysis, and implement preventive measures to avoid future incidents. Oversee and ensure timely completion of changes and service requests.
- Refine metrics so that they are SMART, drive action from those metrics, and report outcomes to senior leadership.
- Complete analysis and present periodic reviews of operational performance and KPIs.
- Collaboration: Work with development, network, security, and other teams to ensure seamless integration and operation of cloud services. Collaborate with customers to understand their needs and provide tailored solutions.
- On-Call Coverage: Participate in an on-call schedule and collaborate with remote teams to ensure 24/7 coverage.
- Documentation: Maintain comprehensive documentation of cloud infrastructure, configurations, procedures, and troubleshooting guides pertaining to operations teams.
- Continuous Improvement: Identify areas for improvement in existing processes and toolsets. Implement enhancements to increase efficiency, reliability, and customer satisfaction.
- Bachelor’s Degree from an accredited institution required.
- A minimum of 8 years of professional experience within an IT function and/or consulting team
- Minimum 5 years of hands-on cloud computing/technology experience
- Working knowledge of pharmaceutical regulatory requirements, including qualification and validation of applications.
- Experience in sophisticated, global, matrixed organizations within a world class IT function
- Strong background in automation, and familiarity with Terraform
- Experience leading large-scale IT projects, preferably in the cloud migration space
- Familiarity with AWS, Azure, GCP, AVS, and/or Oracle Cloud Infrastructure (OCI)
- Experience managing budgets and financials associated with large scale projects.
- Solving problems at their root, stepping back to understand the broader context
- Defining metrics for larger, more complex projects and productivity
- Experience understanding the needs of both business and end customers and translating them into the right solutions.
- Role located in Hyderabad (relocation required)
- Availability to work flexible hours may be required. This team supports continuous operations across two shifts; therefore, this role will require non-standard work hours and some work on weekends and holidays. Appropriate adjustments in benefits will be provided for employees working non-standard hours, where applicable.