Infrastructure Lead (DevOps & Cloud)

Mumbai, Maharashtra
Permanent
Full-time

3 days ago

This role is for one of Weekday's clients.

Min Experience: 8 years
Location: Mumbai
Job Type: Full-time

We are looking for an experienced Infrastructure Lead to drive the design, implementation, and optimization of scalable, secure, and highly available cloud infrastructure. This role will lead DevOps/SRE initiatives, establish best practices, and ensure the reliability and performance of mission-critical systems.

Requirements

Key Responsibilities

1. Cloud Infrastructure & Architecture

Design, develop, and maintain scalable cloud infrastructure on AWS and Azure platforms.
Lead architectural decisions to ensure high availability, fault tolerance, and optimal performance.
Promote infrastructure automation through Infrastructure as Code (Terraform).

2. DevOps & CI/CD Enablement

Develop and enhance CI/CD pipelines using tools such as Jenkins, GitLab CI, CircleCI, and ArgoCD.
Adopt GitOps methodologies for consistent and dependable deployments.
Increase deployment frequency, shorten lead times, and reduce failure rates.

3. Kubernetes & Containerization

Oversee and scale Kubernetes clusters across EKS, AKS, and on-premises environments.
Implement container orchestration, service mesh solutions, and cluster optimization techniques.
Ensure platform reliability and conduct performance tuning.

4. Monitoring, Reliability & Incident Management

Establish and uphold SLOs, SLAs, and reliability benchmarks.
Deploy observability tools such as Prometheus, Grafana, Datadog, and the ELK stack.
Lead incident management processes, including root cause analysis, and reduce mean time to recovery (MTTR).

5. Automation & Operational Excellence

Promote automation across infrastructure provisioning, monitoring, and recovery workflows.
Create reusable infrastructure modules and accelerators.
Minimize manual tasks through scripting using Python and Bash, along with supporting tools.

6. Security & Compliance

Apply cloud security best practices involving IAM, network security, and policy enforcement.
Maintain compliance via Kubernetes policies and governance frameworks.
Champion secure-by-design principles in infrastructure development.

7. Cost Optimization

Monitor cloud resource consumption and implement cost-saving strategies.
Apply right-sizing, auto-scaling, and other resource-efficiency techniques.

8. Leadership & Stakeholder Management

Lead and mentor DevOps and SRE teams.
Collaborate effectively with engineering, product, and architecture teams.
Promote infrastructure best practices across various projects and teams.

9. Innovation & AI-driven Operations (Preferred)

Explore AI and machine learning-driven infrastructure enhancements and AIOps capabilities.
Implement intelligent monitoring, anomaly detection, and automated root cause analysis.

Required Skills & Experience

At least 8 years of experience in Infrastructure, DevOps, or SRE roles.
Strong expertise in AWS (preferred).
Hands-on experience with Terraform (Infrastructure as Code).
Comprehensive knowledge of Kubernetes and containerization (Docker).
Experience working with CI/CD tools such as Jenkins, GitLab CI, CircleCI, and ArgoCD.
Strong understanding of monitoring and observability tools.
Proficient in scripting languages including Python and Bash.
Experience managing high-availability, large-scale systems.

Skills: Infrastructure as Code, Lead Infrastructure, DevOps, SRE, Terraform, Kubernetes, Docker, CI/CD

Weekday AI