Senior Software Engineer - Cloud Infrastructure Reliability

Under Armour India

Bangalore, Karnataka
Permanent
Full-time

9 days ago

About Under Armour India

We have expanded our global footprint with Under Armour India, a strategic capability hub in Bengaluru, designed to strengthen our global operations and accelerate innovation.

Here, we enable athletes’ digital journeys with tech solutions they never knew they needed and can’t imagine living without. Our teams drive products from conceptualization to production, with a strong focus on consumer engagement and retail store experience.

We combine art and technology with the athletes’ world of performance, sport, and fitness, working with the latest technologies across platforms.

Job Description

PURPOSE OF ROLE:

The Site Reliability Engineering (SRE) team at Under Armour drives continuous improvements in performance, resiliency, and operational excellence across our technology platforms. We take a consultative, engineering-first approach to reliability, partnering closely with cross-functional teams to deliver guidance, automation, and best practices that improve the scalability, stability, and reliability of the services that power our products and digital experiences.

We are seeking a Site Reliability Engineer to help strengthen the reliability and scalability of critical systems. In this role, you will build automation, enhance observability, improve operational workflows, and participate in incident response and problem management. The ideal candidate brings a strong foundation in distributed systems, cloud-native platforms, and performance optimization, along with a collaborative mindset and a passion for applying SRE principles across the organization.

Innovation is a core part of how we work at Under Armour. Success in this role requires adaptability, continuous learning, and the ability to pivot as technologies, priorities, and business needs evolve.

YOUR IMPACT (Job Responsibilities):

Engineer and improve reliable, scalable, and high-performing systems supporting critical business services.
Build automation across deployments, monitoring, alerting, and operational workflows to reduce toil and improve resiliency.
Partner with engineering and platform teams to apply SRE principles, including SLIs, SLOs, error budgets, and automated remediation.
Enhance CI/CD pipelines and software delivery processes to improve reliability and efficiency.
Develop observability solutions across metrics, logs, and distributed tracing to improve system visibility.
Participate in incident response, root cause analysis, and corrective actions to prevent recurrence.
Support capacity planning, performance tuning, and scaling strategies for cloud-native and distributed systems.
Maintain Infrastructure as Code, cloud configurations, and operational documentation, including runbooks and standards.
Collaborate with teams to identify reliability risks and drive continuous improvement.

QUALIFICATIONS:

Bachelor's degree in Computer Science, Engineering, or a related field with typically 3-5 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or a related discipline; or a Master's degree with typically 3 years of relevant experience; or typically 9 years of relevant work experience without a degree.
Proficiency in one or more programming or scripting languages such as Python, Go, JavaScript, or Bash.
Solid working knowledge of Linux/Unix-based systems.
Experience building or supporting CI/CD pipelines using tools such as GitHub Actions, GitLab CI, or Jenkins.
Familiarity with Infrastructure as Code practices and tools (e.g., Terraform, CloudFormation).
Experience with containerization and orchestration technologies, including Docker and Kubernetes.
Understanding of networking fundamentals, distributed systems, and system design principles.

PREFERRED QUALIFICATIONS:

Hands-on experience with modern observability stacks such as Prometheus, Grafana, ELK/EFK, or Datadog.
Experience contributing to SLI/SLO frameworks and applying error budgets to guide reliability decisions.
Exposure to GitOps workflows and tooling such as Argo CD or Flux.
Working knowledge of service mesh architectures (e.g., Istio, Linkerd).
Familiarity with performance and load testing tools and techniques.
Experience with asynchronous and distributed systems, including message queues, event-driven architectures, or distributed data platforms.
Cloud or DevOps certifications (e.g., AWS Associate or Specialty, GCP Professional, Kubernetes CKA/CKS) are a plus.
Experience operating in large-scale enterprise environments and collaborating with globally distributed teams.
Experience using AI-assisted development tools (such as Copilot, Cursor, or similar) to improve code quality, accelerate development, and enhance documentation.
Understanding of foundational AI/ML concepts, with exposure to cloud-native AI services and/or the ability to leverage AI tools to automate cloud and operational tasks.

WORKPLACE LOCATION:

Location: This individual must reside within commuting distance from our office.
Work Schedule: This role follows a hybrid work schedule, requiring 4 days in-office per week.

OUR COMMITMENT TO EQUAL OPPORTUNITY:

At Under Armour, we are committed to providing an environment of mutual respect where equal employment opportunities are available to all applicants and teammates without regard to race, color, religion or belief, sex, pregnancy (including childbirth, lactation and related medical conditions), national origin, age, physical and mental disability, marital status, sexual orientation, gender identity, gender expression, genetic information (including characteristics and testing), military and veteran status, family or paternal status and any other characteristic protected by applicable law. Under Armour seeks to recruit, develop and retain the most talented people representing a wide variety of backgrounds and perspectives. Reasonable accommodations are available for applicants with disabilities upon request.
