Cloud Operations / SRE Engineer
SRKay Consulting Group
- India
- Permanent
- Full-time
- Reliability & Scalability: Ensuring system reliability and scalability to meet strict regulatory requirements.
- Incident Response: Managing incident response protocols.
- Monitoring: Implementing adequate monitoring to provide comprehensive insights into the infrastructure.
- Maintenance Optimization: Building dedicated solutions to avoid downtime currently induced by regular maintenance work.
- Complexity Reduction: Reducing the complexity of architecture components to lower recurring maintenance overhead.
- AWS account management and IAM
- Terraform state and drift management
- Monitoring, patching, and backups
- Infrastructure-level incident response
- Documentation and operational maturity
- This is infrastructure ownership, not application development.
- 3 to 6 years of experience
- 2+ years of hands-on AWS experience
- Practical Terraform experience (implementing changes under review)
- Exposure to monitoring, patching, and incident response