DevOps Engineer - Senior (SRE)
Cognyte
- Pune, Maharashtra
- Permanent
- Full-time
- Maintain the reliability, performance, and availability of systems and infrastructure.
- Work closely with R&D, system administrators, and architects.
- Design and implement scalable and robust systems.
- Resolve operational and production issues.
- Ensure systems meet internal and external SLOs.
- Design, build, and support production infrastructure and monitoring.
- Implement alerting and incident response processes.
- Monitor system health and improve performance.
- Build dashboards and share reliability metrics.
- Respond to incidents and take part in on‑call support.
- Perform root cause analysis and drive corrective actions.
- Automate operational tasks and deployments.
- Optimize system performance and resource usage.
- Plan capacity and support scaling, failover, and disaster recovery.
- Document processes and incident resolutions.
- Communicate system status and improvements clearly.
- Collaborate with cross‑functional teams to achieve shared goals.
- Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree preferred.
- 5–7 years of experience in Site Reliability Engineering or a related role.
- Strong knowledge of distributed systems, cloud platforms, and modern infrastructure.
- Hands‑on experience with monitoring and observability tools (Prometheus, Grafana, ELK).
- Proficiency in Python, Go, Bash, or similar programming and scripting languages.
- Experience with incident management and root cause analysis.
- Exposure to on‑prem environments.
- Familiarity with CI/CD pipelines and tools.
- Understanding of DevOps practices and methodologies.
- Strong analytical and problem‑solving skills.
- Excellent written and verbal communication skills.