Lead Infrastructure Engineer – SRE/DevOps, Python, Cloud & BI Platforms
JPMorgan Chase
- Bangalore, Karnataka
- Permanent
- Full-time
- Design, develop, and maintain automation scripts, pipelines, and microservices to support platform operations, deployments, failovers, and incident remediation.
- Build self-healing and auto-remediation workflows to reduce manual toil and improve system uptime.
- Develop and maintain Infrastructure as Code (IaC) using Terraform, Ansible, or equivalent tools.
- Automate routine operational tasks such as certificate rotation, password resets, service restarts, health checks, and content migrations.
- Create and maintain CI/CD pipelines using tools like Jenkins, Spinnaker, Jules, or equivalent.
- Monitor, troubleshoot, and optimize platform performance using observability tools (Dynatrace, Splunk, Grafana, OPS Hub).
- Define and track SLIs, SLOs, and error budgets; drive continuous improvement in platform reliability.
- Participate in on-call rotations and incident response; conduct blameless post-incident reviews and implement preventive measures.
- Develop and maintain runbooks, playbooks, and disaster recovery procedures.
- Support SR/DR testing, failover automation, and resiliency validation.
- Leverage AI/ML capabilities (e.g., GitHub Copilot, LLM Suite, predictive analytics) to enhance automation development, code quality, and operational efficiency.
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
- 6+ years of experience in SRE, DevOps, or platform automation roles
- Strong proficiency in scripting and programming languages such as Python, Bash, PowerShell, or Go
- Hands-on experience with CI/CD tools (Jenkins, Spinnaker, GitLab CI, or equivalent)
- Experience with Infrastructure as Code (Terraform, Ansible, CloudFormation)
- Strong experience with monitoring and observability tools (Dynatrace, Splunk, Grafana, Datadog)
- Experience with cloud platforms (AWS, Azure, or GCP)
- Hands-on experience supporting infrastructure platforms and SRE for BI tools such as SAP BusinessObjects, ThoughtSpot, Tableau, Qlik Sense, or IBM Cognos
- Experience with AI/ML tools and frameworks (GitHub Copilot, OpenAI APIs, TensorFlow, or equivalent)
- Hands-on experience with the ServiceNow and JIRA ticketing tools
- Experience with Snowflake, Databricks, or other cloud data platforms
- Knowledge of SDLC processes, Agile/Scrum methodologies, and change management
- Familiarity with secrets management, least-privilege access, and security best practices
- Experience with Autosys, Control-M, or similar job scheduling tools