Lead Site Reliability Engineer (Expert LAN & Wireless plus Automation)

SITA

  • Delhi
  • Permanent
  • Full-time
  • 1 day ago
Job Description:
Overview
WELCOME TO SITA
We're the team that keeps airports moving, airlines flying smoothly, and borders open. Our tech and communication innovations are the secret behind the success of the world's air travel industry. You'll find us at 95% of international hubs. We partner closely with over 2,500 transportation and government clients, each with their own unique needs and challenges. Our goal is to find fresh solutions and cutting-edge tech to make their operations run like clockwork. Want to be a part of something big? Are you ready to love your job? The adventure begins right here, with you, at SITA.
PURPOSE
KEY RESPONSIBILITIES
  • Define, build, and maintain support systems to ensure high availability and performance.
  • Work closely with Product, Engineering, and Service support architects on the productization of new products as the Operations technical expert, and review non-standard bids to check operational feasibility.
  • Ensure Operations teams are ready and trained to support new products effectively.
  • SREs are responsible for making sure that the systems and services they support meet the non-functional requirements defined by the business, the users, and the organization. They are the guardians of reliability and availability, ensuring that systems perform as expected, scale appropriately, and are resilient to failures
  • Defining and Understanding NFRs: SREs collaborate with stakeholders to understand and define NFRs, such as performance targets (response time, throughput), scalability limits, security requirements (encryption, authentication), and maintainability goals (ease of updates, error handling). They help translate these abstract requirements into concrete, measurable metrics: Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
  • Designing and Implementing Reliable Systems: SREs should design and implement systems that are resilient to failures and can meet the defined NFRs even under stress. This includes implementing fault-tolerant, zero-downtime architectures, using techniques like redundancy, load balancing, and automated failover mechanisms. SREs also focus on building systems that are scalable, meaning they analyse network performance data and capacity requirements proactively to ensure the network can handle current and future demands without performance degradation.
  • Monitoring and Incident Response: SREs implement robust monitoring systems to track key metrics related to NFRs. They set up alerts and notifications to proactively identify and address potential issues before they impact users.
  • SREs must define and maintain an event catalog specifying active event thresholds, propose and implement relevant remediation, and optimize it for efficiency.
  • Develop event response protocols, provide training to teams, and ensure quick and efficient handling of incidents. In the event of incidents, SREs are responsible for quickly diagnosing and resolving complex network incidents, including troubleshooting, conducting root cause analysis, and implementing preventative measures.
  • Perform root cause analysis for critical incidents (e.g. P1, P2) and critical system failures to ensure high availability at all times and prevent future occurrences.
  • Act as the highest technical escalation contact, handling complex cases for Portfolio service operations as the technical expert.
  • Accountable within SGS for the in-scope product, ensuring high availability and performance of the product/solution.
  • Technical expert/guru in the domain and point of contact for engineering, management, operations & product.
  • Optimize network performance by analysing traffic patterns, identifying bottlenecks, and implementing solutions.
  • Coordinate with incident management teams, operations experts, the various application & platform Portfolio service operations teams, and Engineering teams to develop and implement permanent solutions.
  • Conduct thorough problem investigations via trend analysis to diagnose recurring incidents and find permanent solutions.
  • Conduct the weekly problem review board, monitor the effectiveness of problem resolution activities, and provide regular reports on problem management activities to ensure continuous improvement.
  • Deployment and Release Management: SREs work closely with Engineering teams to facilitate smooth and reliable deployments of new software releases. They implement strategies, clear processes, SOPs, and rollback plans to minimize risks and reduce downtime during releases. They ensure that new features and changes are deployed in a controlled manner, minimizing the impact on existing services.
  • Track deployment progress, conduct operational readiness assessments for the successful execution of deployments, and mitigate risk or improve the deployment plan to ensure service stability.
  • DevOps/NetOps Management: Manage continuous integration and deployment (CI/CD) pipelines, ensuring smooth integration between development and operational teams.
  • Build scripts to automate network tasks, reducing manual effort, toil, and human error.
  • Implement automation for system provisioning, self-healing (auto-recovery), deployment, system health checks, etc., and for monitoring, from event to incident with proper correlation.
  • Implement and manage infrastructure as code, provide ongoing support for automation tools, and continuously improve DevOps/NetOps practices.
  • Create and maintain documentation related to network configurations, SOPs, and troubleshooting guides.
Qualifications
EXPERIENCE
  • 8+ years of experience in IT operations, service management, or infrastructure management, including roles such as Site Reliability Engineer or NetOps Engineer/Expert.
  • Airline experience and/or ATI know-how is good to have.
  • Expertise in troubleshooting data center and cloud technology issues.
  • Expertise in technologies like Cisco routing & switching, Cisco ACI, Cisco Nexus, Aruba, ClearPass, Juniper Mist, or any other wireless technology.
  • Hands-on experience with CI/CD pipeline automation systems, performance monitoring, and the implementation of infrastructure as code.
  • Experience in a NetOps working environment.
  • Experience in automation and scripting.
  • Proven experience in managing high-availability systems and ensuring operational reliability.
  • Extensive experience in root cause analysis (RCA), incident management, and developing permanent solutions for recurring service disruptions.
KNOWLEDGE & SKILLS
  • Any one of Terraform, Python, or another language is a must for automation & scripting.
  • Knowledge of Git processes is good to have.
  • CI/CD pipeline tools such as GitHub are good to have.
  • Experience implementing architectural standards into pipelines
Other Networking technologies:
  • Cisco routing & switching is a must-have.
  • Cisco ACI
  • Load balancers
  • Any wireless technology: expertise in Juniper Mist, Aruba AP, or Cisco
  • Cisco data center switches such as Nexus are a must-have.
  • Aruba ClearPass knowledge is good to have.
  • Palo Alto firewalls are good to have.
  • Knowledge or experience with cloud platforms (AWS, Azure, Google Cloud) and their networking services is good to have.
  • Familiarity with operating systems (Linux, Windows) and system-level troubleshooting is good to have.
  • Understanding of ITIL or other incident management frameworks. Ability to effectively communicate technical information and collaborate with diverse teams.
PROFESSION COMPETENCIES
CORE COMPETENCIES
  • Adhering to SITA Principles & Values
  • Good Communication
  • Creating & Innovating
  • Customer Focus
  • Impact & Influence
  • Leading Execution
  • Results Orientation
  • Teamwork
EDUCATION & QUALIFICATIONS
  • Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
  • Relevant certifications such as CCIE in data center or routing & switching, or expert-level certification in Juniper Mist or Aruba, and Palo Alto Firewall.
  • Certifications in cloud platforms (AWS, Azure, Google Cloud) or DevOps methodologies (e.g. Certified DevOps Professional) are good to have.
  • ITIL certification.
WHAT WE OFFER
We're all about diversity. We operate in 200 countries and speak 60 different languages and cultures. We're really proud of our inclusive environment. Our offices are comfortable and fun places to work, and we make sure you get to work from home too. Find out what it's like to join our team and take a step closer to your best life ever.
  • 🏡 Flex Week: Work from home up to 2 days/week (depending on your team's needs)
  • ⏰ Flex Day: Make your workday suit your life and plans.
  • 🌎 Flex-Location: Take up to 30 days a year to work from any location in the world.
  • 🌿 Employee Wellbeing: We have got you covered with our Employee Assistance Program (EAP), for you and your dependents 24/7, 365 days/year. We also offer Champion Health - a personalized platform that supports a range of wellbeing needs.
  • 🚀 Professional Development: Level up your skills with our training platforms, including LinkedIn Learning!
  • 🙌 Competitive Benefits: Competitive benefits that make sense with both your local market and employment status.
SITA is an Equal Opportunity Employer. We value a diverse workforce. In support of our Employment Equity Program, we encourage women, aboriginal people, members of visible minorities, and/or persons with disabilities to apply and self-identify in the application process.
