Site Reliability Engineer
Qualys
- Pune, Maharashtra
- Permanent
- Full-time
- Monitor and maintain system reliability, availability, and performance.
- Support day-to-day operations of non-production environments.
- Troubleshoot Linux-based systems, applications, and infrastructure issues.
- Assist in incident management, root cause analysis (RCA), and problem resolution.
- Work on system monitoring tools and alerting mechanisms.
- Automate repetitive operational tasks using scripting (Shell/Python preferred).
- Collaborate with development and DevOps teams to improve system stability.
- Ensure adherence to best practices in system security and compliance.
- RHCE (Red Hat Certified Engineer) or RHCA certification is mandatory.
- Strong knowledge of Linux/Unix systems administration.
- Basic understanding of SRE concepts and familiarity with monitoring tools (e.g., Prometheus, Grafana, Nagios).
- Basic scripting knowledge (Shell, Bash, or Python).
- Understanding of networking fundamentals (TCP/IP, DNS, HTTP/HTTPS).
- Exposure to cloud platforms (AWS/Azure/GCP) is a plus.
- Good troubleshooting and analytical skills.
- Knowledge of containerization tools like Docker and Kubernetes (basic level).
- Familiarity with CI/CD tools (Jenkins, GitLab CI, etc.).
- Understanding of version control systems like Git.
- Strong willingness to learn and adapt to new technologies.
- Ability to work in a fast-paced environment.
- Eagerness to learn and grow in the SRE/DevOps domain.
- Bachelor’s degree in Computer Science, IT, or a related field.
- Manage and maintain Linux-based lab systems and environments.
- Perform installation, configuration, and troubleshooting of Linux servers.
- Monitor system performance and ensure high availability of lab resources.
- Assist in deployment, setup, and validation of lab environments.
- Troubleshoot system, network, and application-related issues.
- Strong expertise in troubleshooting complex Linux system issues, including kernel, memory, CPU, disk I/O, and network performance analysis.
- Good understanding of Linux system architecture, including CPU, memory management, file systems, networking, and system services.
- Basic knowledge of the Hadoop ecosystem (HDFS, YARN, MapReduce).
- Exposure to big data environments and distributed systems.