
Site Reliability Engineer III
- India
- Permanent
- Full-time
- Develop and manage GitHub Actions/Workflows to automate routine operations (e.g., password rotation, environment administration).
- Optimize CI/CD pipelines to streamline deployments and system updates.
- Implement and manage alerting, anomaly detection, and metrics collection.
- Build and maintain dashboards and reports using CloudWatch and other tools to monitor health and usage.
- Support and enhance container-based environments (Docker or equivalent).
- Maintain infrastructure using tools like Terraform and Ansible.
- Assist in migrating Windows-based services to Linux and support them during the transition, including basic scripting in PowerShell and .NET (F# or C#).
- Identify and act on opportunities to reduce cloud costs through monitoring and analysis.
- Experience in DevOps, Site Reliability Engineering, or a related field. Equivalent practical experience and transferable skills are welcome.
- Experience with cloud services (AWS or similar) preferred.
- Familiarity with PowerShell and Windows Server for legacy support.
- Hands-on experience with GitHub and GitHub Workflows.
- Experience with infrastructure-as-code tools such as Terraform and/or Ansible.
- Comfortable with Bash or comparable scripting languages, Linux, and containers (Docker or similar).
- Understanding of observability tools and practices (metrics, alerting, dashboards).
- Ability to perform basic coding or debugging in .NET (F# or C#).