Lead Site Reliability Engineer (SRE), Global Operations Centre

Qvantel

  • Hyderabad, Telangana
  • Secunderabad, Telangana
  • Permanent
  • Full-time
  • 22 days ago
Greetings from Qvantel!

As the Lead Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and production systems. You will be responsible for supporting a team of SREs, developing and implementing SRE best practices, and driving continuous improvement across our infrastructure. You'll work closely with cross-functional teams to enhance system resilience, automate processes, and maintain smooth operations. Your expertise will contribute to the success of our organization by driving excellence in system reliability.

We have an opening for a Lead Site Reliability Engineer (SRE), Global Operations Centre.

About Us:
Qvantel is a global product-based company headquartered in Helsinki, Finland, with a presence in over 7 countries. We are a SaaS-based Telecom BSS (Business Support Systems) company.

Job Title: Lead Site Reliability Engineer (SRE), Global Operations Centre
Location: Hyderabad, India

Roles & Responsibilities:

Must have:

1. System Monitoring and Automation:
  • Monitor system health, performance, and availability.
  • Implement automation to proactively address issues and improve system reliability.
  • Create effective monitoring systems to detect anomalies and respond promptly.

2. Incident Management and Response:
  • Lead incident response efforts during critical incidents.
  • Collaborate with teams to resolve incidents swiftly and minimize impact.
  • Participate in on-call rotations to ensure 24/7 support.
  • Continuously analyze recurring incidents and system performance degradation patterns, and follow the Problem Management process.

3. Infrastructure Operations Management:
  • Manage infrastructure components, including servers, databases, and networking.
  • Optimize resource utilization and scalability.
  • Ensure proper configuration management and version control.

4. Performance Optimization, Capacity Planning and Scaling:
  • Identify bottlenecks and optimize system performance.
  • Work on load testing, caching, and latency reduction.
  • Analyze system capacity requirements and plan for scaling.
  • Work with engineering and architecture teams to share feedback and accommodate growth and changing demands.

5. Collaboration and Work Etiquette:
  • Support a team of SREs, providing mentorship and guidance.
  • Foster a culture of openness, honesty, and inclusivity.
  • Collaborate with cross-functional teams, including developers, QA, and Release Engineers.

6. Documentation and Knowledge Sharing:
  • Document processes, procedures, and best practices.
  • Share knowledge within the team and across the organization.

7. SLO Exposure & Tracking:
  • Strong technical exposure to, and demonstrated experience with, SLO definition and tracking and the four golden signals of monitoring: latency (request service time), traffic (user demand), errors (rate of failed requests), and saturation (overall capacity of the system).

Good to have:
  • Define or support enhancing the technology strategy for infrastructure and tooling.
  • Stay informed about industry trends and emerging technologies.
  • Drive continuous improvement in reliability practices.

Qualifications:
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proven experience as an SRE or in a similar role (6+ years).
  • Strong expertise in Linux, cloud platforms (AWS, GCP, Azure preferred), and scripting languages (Python, Bash).
  • Experience with open-source monitoring stacks (e.g., Prometheus, Grafana, ELK, Nagios, Zabbix) and configuration management tools.
  • Excellent problem-solving skills and attention to detail.
  • Strong communication and collaboration skills.
  • Ability to work independently and as part of a team.
  • A passion for building and maintaining reliable, scalable systems.

What We Offer:
Competitive salary, excellent company benefits, and a global and diverse team of colleagues. At Qvantel, you get to work with major international accounts as they work towards the true digital transformation of their services and offer value and satisfaction to their customers.

Qvantel is a fast-growing, customer-oriented technology company with a dynamic and international culture. At Qvantel, people are encouraged to learn and develop themselves. They are used to working independently and in teams and have a hands-on working style and a can-do attitude.

To know more about us, please visit https://www.qvantel.com/company

For additional details, please get in touch with the Recruiter by phone at +91 995 994 7774 or [HIDDEN TEXT].

foundit

Similar Jobs

  • Operations & Site Reliability Engineer

    Apple

    • Hyderabad, Telangana
    People at Apple don't just build products - they craft the kind of experience that has revolutionised entire industries. The diverse collection of our people and their ideas encour…
    • 11 days ago
  • Site Reliability Engineer, Lead I

    myGwork

    • Hyderabad, Telangana
    • Secunderabad, Telangana
    This inclusive employer is a member of myGwork - the largest global platform for the LGBTQ+ business community. About the Role: Grade Level (for internal use): 11 The Team: This te…
    • 23 days ago
  • Operations & Site Reliability Engineer

    Apple

    • Hyderabad, Telangana
    • Secunderabad, Telangana
    Key Qualifications Strong sense of ownership, customer service, and integrity demonstrated through clear communication Experience in leading and driving operations teams for large …
    • 9 days ago