
Site Reliability Engineer
- Bangalore, Karnataka
- Permanent
- Full-time
- Provide expert operational support to our nodes running in the cloud (AWS/Azure/GCP), using technologies such as Linux (CoreOS, Ubuntu) and Docker, and languages including Java, Python and bash.
- Respond to customer queries and incidents, diagnosing and solving complex technical issues by liaising with customers' engineers on Apache Cassandra, Apache Kafka, Elasticsearch, Redis and other supported technologies, while maintaining a high standard of customer communication.
- Investigate issues and apply standard change and maintenance procedures to optimize the performance and stability of production systems
- Undertake complex cluster operations such as migrations, upgrades and maintenance on our fleet.
- Develop and continually improve our suite of internal automation tools, applications, documentation, processes and procedures
- Be a proactive, reliable and supportive member of the TechOps team, and participate in a 24/7 rotating shift roster
- Strong knowledge of and experience with Unix/Linux, and comfort working from the command line. This is essential; there are no GUIs here.
- Good fundamental computer science / software engineering skills and knowledge, particularly operating system internals, resource management, and networking.
- Previous experience working with databases and/or open-source technologies (managing standard tasks, handling production issues, performance tuning) in a support role.
- Programming skills in Python, bash scripting and SQL, and source code control using Git.
- Exceptional ability to communicate clearly and professionally in written and verbal English (essential).
- Demonstrated ability to multitask.
- Passion for all things IT, and especially open source.
- Any customer service experience is an advantage.