
Manager - Site Reliability (SRE)
- Hyderabad, Telangana
- Permanent
- Full-time
- 12+ years of demonstrated experience in a Site Reliability Engineering, DevOps, or infrastructure-focused role
- 3+ years of experience leading and managing high-performance SRE teams
- Proven track record of leading sophisticated SRE projects and enterprise services at large scale
- Strong analytical, troubleshooting and problem-solving skills
- Good knowledge of at least one object-oriented programming language (preferably Java or Python)
- Unix Performance Monitoring & Tuning
- Good understanding of database concepts, PL/SQL and NoSQL technologies
- Hands-on experience with monitoring and data analysis tools (e.g., Prometheus, Splunk, Grafana, CloudWatch)
- Experience building and operating container orchestration systems such as Kubernetes or EKS
- Deep understanding of security concepts and protocols: authentication, authorization, signing, encryption, SSL/TLS, SSH/SFTP, PKI, X.509 certificates and PGP
- Good fundamentals in release management and continuous integration
- Familiarity with modern web services architectures, cloud platforms such as AWS, GCP or Azure, and distributed storage systems (ScaleIO, Amazon S3)
- Ability to communicate with large cross-functional teams about various engineering topics such as system architecture, detailed design, APIs, project schedules etc.
- Ability to make the right trade-offs when dealing with functional complexity, conflicting priorities and aggressive schedules
- Ability to represent the team and remove hurdles so that each team member can operate at the highest level of efficiency and productivity
- Ability to hire, mentor and manage the performance of a large team
- Ability to connect with senior executives and business stakeholders
- A learning attitude to continuously improve self, team and the organisation
- Ability to work under pressure and manage difficult situations in a fast-paced work environment
- Bachelor's or Master's degree, or equivalent experience, in Computer Science or a related field
- Experience with runtime configuration and troubleshooting of Java and JVM technologies is preferred
- Good fundamentals in data modelling and machine learning algorithms
- Strong knowledge of securing applications, including a thorough understanding of the OWASP Top 10 risks and their solutions