
Senior Manager, Reliability Engineering
- Bangalore, Karnataka
- Permanent
- Full-time
- Lead and mentor a team of reliability engineers, fostering a strong culture of collaboration and continuous improvement.
- Conduct regular one-on-one meetings with team members, providing guidance, feedback, and support for their career development.
- Manage performance evaluations, provide constructive feedback, and actively participate in all phases of growing the engineering organization, including recruiting and team building.
- Reliability Engineering, Operations & Governance
- Lead and coordinate engineering activities to plan, communicate, and deliver product features on time while designing for quality, observability, and scalability.
- Ensure full software lifecycle instrumentation from requirement ideation to software development to deployment.
- Drive the adoption of cloud-native technologies and standard processes, such as containerization, service mesh, and microservices.
- Collaborate with internal partners and team members:
- Reliability engineering and operations teams, product, and PMO on engineering resource allocation and project schedules in accordance with our strategic organizational priorities.
- SRE team to champion automation to enhance efficiency and reliability.
- Operations teams on maintaining a highly available telemetry and command/control infrastructure to ensure eBay’s products and services are available to our customers.
- Fleet management team on capacity planning, resource allocation, and cost optimization for the telemetry control plane.
- Information security teams to ensure integrity and compliance of the telemetry infrastructure by implementing appropriate security controls and monitoring.
- 12-15 years of proven experience working in infrastructure, software development, and engineering organizations, with 5 years of experience managing and leading both reliability engineering teams and software development teams.
- Excellent at communicating critical updates, including AI-driven reliability trends and insights, to organizational leaders and executives.
- Experience supporting medium or large tech organizations with many different internal customers and partners.
- Experience working collaboratively in large distributed global teams.
- Demonstrated ability to adopt and operationalize emerging AI tools, ensuring the team remains at the forefront of reliability engineering practices.
- Knowledge of software development, networking, security, and storage technologies in a cloud environment, and a proven understanding of cloud-native architectures, microservices, DevOps, and SRE principles.
- Passion for staying ahead of the curve in AI/ML innovation applied to observability, monitoring, and system reliability.