
Senior L2/L3 Production Support Engineer, SRE
- Bangalore, Karnataka
- Permanent
- Full-time
- Develop and maintain Unix shell scripts for automation tasks.
- Write and optimize Python scripts for process automation and data handling.
- Design, implement, and maintain scalable cloud infrastructure using AWS services (EC2, S3, Lambda, etc.).
- Monitor and troubleshoot cloud environments for optimal performance.
- Monitor and optimize system resources and automate routine administrative and business-as-usual (BAU) tasks.
- Monitor the production environment and resolve issues.
- Track SLA compliance and notify management or the client of unexpected behavior.
- Support end-to-end data flows and perform health and sanity checks on systems and applications.
- Escalate environment and application health issues internally to the Group Lead/PM.
- Review logs and query database tables to investigate workflow failures.
- Investigate application/configuration issues in the production environment and provide analysis to fix them.
- Follow up with the responsible support, upstream, downstream, and cross-functional teams and request root cause analysis for issues that prevent the end-to-end flow from working as designed.
- Provide regular updates on issue status until resolution, notifying the client of status changes and the expected time to resolution.
- Participate in ad-hoc/regular status calls on application health with the client to discuss critical defects/health check status.
- Handle service requests from business users, including investigation of business logic and application behavior.
- Work with different data format transformation processes (XML, Pipeline).
- Work with source control tools (Git/SVN) to investigate configuration or data transformation issues.
- Work with middleware and schedulers on data flow and batch process control.
- Focus on continuous proactive service improvement and continuous learning.
- Ensure customer service excellence and respond within SLA timelines by actively monitoring support emails/tickets and working on them until each issue is fully remediated.
- Ensure all incident tickets are resolved in a timely and comprehensive manner.
- Track and identify frequently occurring, high-impact support issues as candidates for permanent resolution.
- Bachelor's Degree from a reputed university with good passing scores.
- 7 to 12 years of experience as an L2/L3 Production Support Engineer and Site Reliability Engineer, with strong knowledge of Unix shell scripting.
- Experience developing and maintaining Unix shell scripts for automation tasks.
- Ability to write and optimize Python or shell scripts for process automation and data handling; good knowledge of any scripting language is acceptable.
- Basic knowledge of AWS services (EC2, S3, etc.).
- Experience monitoring and optimizing system resources and automating routine administrative and BAU tasks.
- Good Understanding of Incident/Change/Problem Management process.
Required Skills:
- Strong experience with Unix Shell Scripting.
- Proficiency in Python scripting, or in any other scripting language, with hands-on experience in automation.
- Strong knowledge of databases.
- Basic understanding of AWS services and cloud concepts.
- Basic knowledge and experience supporting cloud applications.
- Ability to troubleshoot and resolve technical issues in a Production Environment.