
Software Engineer (Site Reliability)
- Hyderabad, Telangana
- Permanent
- Full-time
- Design, implement, and maintain scalable and reliable systems and services.
- Monitor system performance, availability, and reliability, proactively identifying and resolving issues.
- Use Databricks observability tools to develop and maintain dashboards, alerts, and reporting mechanisms that provide insights into system performance and usage.
- Establish and improve observability frameworks to track key performance indicators (KPIs) and service-level objectives (SLOs).
- Respond to and fix production incidents, performing root cause analysis and implementing corrective actions to prevent future occurrences.
- Collaborate with cross-functional teams to ensure effective incident response processes and documentation.
- Develop automation scripts and tools to streamline operational tasks, improve deployment processes, and enhance system reliability.
- Contribute to the continuous improvement of deployment pipelines and infrastructure as code (IaC) practices.
- Work closely with development teams to understand application architectures and contribute to system design discussions.
- Document processes, best practices, and system architecture to facilitate knowledge sharing and onboarding.
- Analyze system performance and application usage patterns to recommend and implement optimizations that improve efficiency and reduce costs.
- Education:
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- Experience:
- 3 to 6 years of experience in Platform Engineering, DevOps, or a related field.
- Experience with Databricks, including its observability and monitoring features.
- Experience with the Grafana observability platform
- Familiarity with cloud platforms (Azure)
- Technical Skills:
- Programming language skills: Python, Scala.
- SQL knowledge for data extraction and transformation
- Experience in Power BI development, covering both semantic models and visualizations
- Experience in Grafana visualizations
- Soft Skills:
- Problem-solving skills and the ability to work in a fast-paced, collaborative environment.
- Good communication skills, with the ability to convey complex technical concepts to non-technical collaborators.
- A proactive attitude with a focus on continuous improvement and learning.
- Open to exploring and experimenting with new SRE processes and tools to support the technical requirements of the D&A platform.
- Willing to proactively seek new opportunities to learn and to put new knowledge into practice.
All the available job opportunities are posted either on our website - pgcareers.com, or on our official social media pages, for the convenience of prospective candidates, and do not require them to pay any kind of fees towards their application.
Job Schedule: Full time
Job Number: R000134576
Job Segmentation: Experienced Professionals