Senior Site Reliability Engineer, Core
Sumo Logic
- Bangalore, Karnataka
- Permanent
- Full-time
- Remote from India.
- Continually improve the lifecycle of microservices and architectural components from inception and design, through deployment, operation, and refinement.
- Participate in defining, evolving, and managing SLOs
- Write code and automation to reduce operational workload, increase efficiency, improve security posture, eliminate toil, and enable Sumo's developers to deliver features more rapidly.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Facilitate blame-free root cause analysis meetings for incidents to learn and drive improvement
- Participate in and continually improve our global IRC (incident response coordination) for all products.
- Drive root cause identification and issue resolution with the various teams.
- Work in a fast-paced, iterative environment.
- Cloud-native application development experience leveraging best practices and design patterns
- Strong debugging and troubleshooting skills across the entire technology stack
- Deep understanding of AWS Networking, Compute, Storage, and managed services.
- Competency with modern infrastructure and CI/CD tooling like Kubernetes, Terraform, Ansible, and Jenkins
- Experience with full-lifecycle support of services, from creation to production support
- Versed in Infrastructure as Code practices using technologies like Terraform or CloudFormation
- Ability to author production-ready code in at least one of the following: Java, Scala, or Go.
- Experience with Linux systems and comfortable on the command line
- Understand and apply modern approaches to cloud-native software security
- Experienced with agile frameworks, such as Scrum and Kanban, and how to operate within these frameworks to continually deliver value.
- Flexible and willing to step into new roles and responsibilities
- Willingness to learn and use Sumo Logic products for solving reliability and security issues
- Bachelor's or Master's Degree in Computer Science, Electrical Engineering, or another scientific or technical discipline
- 6+ years of industry experience.
- Experience using Sumo Logic products or other observability products for reliability and security
- Experienced with planet-scale product development
- Expert-level proficiency running and operating SaaS products on AWS
- Experience with streaming technologies like Kafka, Kafka Streams, or KSQL
- Expert-level experience in one or more of: Java, Go, Scala, or Python
- Expert-level experience in one or more of: Terraform, Jenkins, or Kubernetes
- Extensive experience running and tuning JVM workloads at scale