SRE with AIPO and Dynatrace

Virtusa

  • Chennai, Tamil Nadu
  • Permanent
  • Full-time
  • 2 months ago
Knowledge & Experience:

  • Minimum of 6 years of relevant work experience in critical production environments
  • Experience enabling observability within applications to extract appropriate telemetry into suitable back ends like Dynatrace
  • Hands-on experience curating Service Level Objectives, defining Error Budgets and refining the change management lifecycle to accommodate them
  • Knowledge of and experience with CI/CD pipelines and deployment patterns like Canary
  • Analytics of application telemetry and AIOps enablement using Dynatrace Davis or an alternative product, in combination with any other tools for orchestration
  • Experience defining an SRE capability charter and roadmap for all dependent teams
  • Experience successfully running and providing leadership to DevOps or SRE teams (preferred)
  • Working knowledge of SQL, including troubleshooting by writing queries, is key
  • Knowledge of containerized solutions and orchestration tools like Kubernetes

Core Capabilities:

  • Understand and demonstrate application of SRE principles, particularly toil reduction, blameless postmortems, monitoring distributed systems and release engineering
  • In-depth knowledge of any observability product like Dynatrace, Splunk or the ELK stack, covering synthetic monitoring, RUM and APM
  • Ability to instrument microservices applications via OpenTelemetry to extract traces is beneficial
  • Experience administering applications and infrastructure services in hyperscaler environments such as AWS, Azure or GCP is key
  • Hands-on experience writing Python scripts and Ansible templates for application deployment automation or other automation is important
  • Ability to diagnose and debug systems at the code level (Java preferred) is beneficial

Qualification:

  • ITIL4 certification is mandatory
  • Achieving Practitioner or Intermediate level certification is preferred
  • SRE Foundation certification via PeopleSoft or DevOps Institute is beneficial
  • AWS Solutions Architect Associate qualification or an alternative from another Cloud Service Provider is preferred

Role & Responsibilities:

  • Formulate the detailed SRE rollout plan and execute a transformation roadmap
  • Continuously seek to uplift the maturity of the SRE implementation and improve SLO, MTTR, MTTD and any other relevant KPIs identified
  • Engage in on-call and critical operations support activities while leading blameless postmortems
  • Liaise directly with customers, remotely and face to face, for stakeholder management
  • Formulate a plan to eliminate toil by lowering incident volume, eliminating noise from alerts, automating manual processes, and converting workarounds into system features
  • Work with Development, QA and other squads to design, build and roll out reliability features into the applications being delivered
  • Lead a team of SREs deployed on the ground while remaining hands-on
