
Lead Site Reliability Engineer
- Hyderabad, Telangana; Ahmedabad, Gujarat
- Permanent
- Full-time
- Design, implement, and manage infrastructure using Terraform or other Infrastructure-as-Code (IaC) tools.
- Leverage AWS or equivalent cloud platforms to build and maintain scalable, high-performance infrastructure that supports data-heavy applications and JavaScript-based visualizations.
- Understand component-based architecture and cloud-native applications.
- Implement and maintain site reliability practices, including monitoring and alerting using tools like DataDog, ensuring the platform's availability and responsiveness across all environments.
- Design and deploy high-availability architecture to support continuous access to alerting engines.
- Support and maintain Configuration Management systems like ServiceNow CMDB.
- Manage and optimize CI/CD workflows using GitHub Actions or similar automation tools.
- Work with OIDC (OpenID Connect) integrations across Microsoft, AWS, GitHub, and Okta to ensure secure access and authentication.
- Contribute to QA testing (both manual and automated) to ensure high-quality releases and stable operation of our data visualization tools and alerting systems.
- Participate in light JavaScript programming tasks, including HTML and CSS fixes for our charting library.
- Assist with deploying and maintaining mobile applications on the Apple App Store and Google Play Store.
- Troubleshoot and manage network issues, ensuring smooth data flow and secure access to all necessary environments.
- Collaborate with developers and other engineers to troubleshoot production issues and optimize performance.
- Help with the deployment pipeline, working with various teams to ensure smooth software releases and updates for our library and related services.
- Proficiency with Terraform or other Infrastructure-as-Code tools.
- Experience with AWS or other cloud services (Azure, Google Cloud, etc.).
- Solid understanding of component-based architecture and cloud-native applications.
- 10 to 20 years' experience with site reliability tools like DataDog for monitoring and alerting.
- Experience designing and deploying high-availability architecture for web-based applications.
- Familiarity with ServiceNow CMDB and other configuration management tools.
- Experience with GitHub Actions or other CI/CD platforms to manage automation pipelines.
- Strong understanding and practical experience with OIDC integrations across platforms like Microsoft, AWS, GitHub, and Okta.
- Solid QA testing experience at a beginner-to-intermediate level, including manual and automated testing techniques.
- JavaScript, HTML, and CSS skills to assist with troubleshooting and web app development.
- Experience with deploying and maintaining mobile apps on the Apple App Store and Google Play Store that utilize web-based charting libraries.
- Basic network management skills, including troubleshooting and ensuring smooth network operations for data-heavy applications.
- Knowledge of package publishing tools such as Maven, Node, and CocoaPods to ensure seamless dependency management and distribution across platforms.
- Ability to wear multiple hats and adapt to the ever-changing needs of a startup environment within a global organization.
- Self-starter with a proactive attitude, able to work independently and manage your time effectively.
- Strong communication skills to work with cross-functional teams, including engineering, QA, and product teams.
- Ability to work in a fast-paced, high-energy environment.
- Familiarity with agile methodologies and working in small teams with a flexible approach to meeting deadlines.
- Basic troubleshooting skills to resolve infrastructure or code-related issues quickly.
- Knowledge of containerization tools, such as container platforms and Amazon ECS, is a plus.
- Understanding of DevSecOps and basic security practices is a plus.
- Experience with CI/CD pipeline management, automation, and deployment strategies.
- Familiarity with serverless architectures and AWS Lambda.
- Experience with monitoring and logging frameworks, such as Prometheus, Grafana, or similar.
- Experience with Git, version control workflows, and source code management.
- Security-focused mindset, experience with vulnerability scanning, and managing secure application environments.
- Competitive salary and benefits package.
- Flexible work schedule with remote work options.
- The opportunity to work in a collaborative, creative, and innovative environment.
- Hands-on experience with cutting-edge technologies and tools that power sophisticated financial data visualizations and charting solutions.
- Professional growth and career advancement opportunities.
- A dynamic startup culture within a global organization, where your contributions directly impact the product and the financial industry.
Posted On: 2025-08-21
Location: Hyderabad, Telangana, India