
Sr Cloud Operations Specialist
- Noida, Uttar Pradesh
- Permanent
- Full-time
About The Team:
The Incident Management Team, part of IT Service Management (ITSM), works cross-functionally with Global Services, Engineering, Cloud Hosting and Management on the effective delivery of UKG's Cloud SaaS offerings.
About The Role:
The IT Service Operations Specialist provides day-to-day support for all ongoing customer-facing and internal cloud infrastructure incidents. In addition, they will work closely with the leads on operational improvement initiatives.
Responsibilities:
- Acknowledge incoming incidents via PagerDuty and spin up a bridge
- Gather the initial information and document it in ServiceNow
- Adopt/Learn the internal automation tools for incident logging and tracking
- Learn various internal product & engineering team structures to effectively lead the bridges/war rooms
- Lead incident bridges effectively by taking charge of the room and guiding the response teams (engineers, support specialists) to diagnose, troubleshoot, and resolve application issues so that customer-impacting incidents are mitigated promptly
- Engage with global communications teams for status page and external customer communications throughout the lifecycle of the incident
- Maintain the quality of the data captured in all the tools used in ITSM (PagerDuty, ServiceNow, JIRA, etc.)
- Learn the new product features for effective management of incident bridges
- Complete all organizational training on time
- Thrive under pressure with the ability to stay calm, handle conflict, and partner with other UKG teams to drive resolution
- Develop and monitor key metrics to understand incident trends, as well as operational resilience and readiness
Qualifications:
- 3+ years of experience supporting a global 24x7x365 incident management team in a SaaS environment
- 3+ years of technical experience (Support, Services, IT, Engineering) at a tech company, with exposure to a complex customer base
- 1+ years of experience working in a cloud (AWS, GCP, or Azure; GCP preferred) environment
- 2+ years of working in a scrum/agile/SRE environment (hands-on experience will be a PLUS)
- 2+ years of experience working in an on-call support rotation model, with PagerDuty experience
- 2+ years of working experience with Teams (integrations with PagerDuty and ServiceNow), Confluence, and SharePoint
- Subject matter expertise in incident management frameworks; awareness of industry standards and best practices
- Experience working with the following tools: JIRA, ServiceNow, Salesforce, and Aha
- Experience working in an Agile technical environment
- Experience working in a Cloud environment
- Excellent problem-solving and decision-making skills to identify root causes and implement corrective actions
- Demonstrated ability to collaborate, build credibility, and establish good working relationships with leaders across UKG to ensure solid partnership and alignment
- Willingness/Ability to work in a shift-based rotation model in a larger enterprise incident management team