Cloud Ops Engineer

Bangalore, Karnataka
Permanent
Full-time

1 month ago

Overview:

This position will assist in performing implementation, operation, monitoring, recovery, and performance tuning for infrastructure and application services at symplr. The CloudOps team augments the symplr Development and IT teams by focusing on deploying applications to production systems using a software engineering approach and on resolving application failures within service level agreements (SLAs). CloudOps goals include improving system performance, increasing operational observability, enhancing system stability, and reducing time for software delivery.

Duties & Responsibilities:

Be a champion for department initiatives and values by ensuring all actions promote the department’s mission statement
Participate in product release cycles by working closely with Engineering Managers, Architects, and Developers.
Work towards automating the product deployment to various environments by integrating with continuous integration (CI) and continuous delivery (CD) tools, monitoring, and change management practices.
Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in the environment.
Implement monitoring, alerting, notification, and metrics collection for:

Infrastructure and application performance
System uptime
Error rate
Monitor and continually improve the capacity and reliability of our production environment infrastructure.
Investigate and fix performance and scalability bottlenecks; proactively identify issues and create work items to improve stability and performance.
Respond to alerts from production systems, and identify and resolve root causes in a timely fashion.
Identify single points of failure and other high-risk architecture issues, and propose resilient solutions that mitigate the risk and improve system reliability.
Identify opportunities for automation to reduce operational workload; build scripts and introduce new tools and practices as needed.
Work with other Cloud Infrastructure Engineers and developers to ensure maximum performance, reliability, and automation of our deployments and infrastructure.
Work with, consult, and influence developers on new features and software architecture to ensure scalability.
Communicate with stakeholders and handle deployment, maintenance, and support efficiently.
Ticket Handling and Support:

Handled tickets should have clear communication and involve the correct stakeholders.
Tickets should be completed within the SLA; any delay or improper ticket should be clearly communicated and documented.
Tickets should be closed with proper comments, including resolution steps and screenshots.
Repetitive tickets should be discussed in the standup call for brainstorming and should eventually be resolved through automation if necessary.

Skills Required:

4+ years of experience with a public cloud provider such as Amazon Web Services (AWS) or Microsoft Azure, and with on-prem servers
Solid understanding of standard TCP/IP networking, Windows IIS, load balancing, and common protocols such as DNS and HTTPS
Good knowledge of CI/CD tools such as Octopus Deploy, Azure DevOps (ADO), GitHub Actions, and Jenkins
Monitoring and Logging: Experience with any application monitoring and logging tools (e.g. Datadog, New Relic, AppDynamics, Application Insights, ELK, Prometheus).
Good understanding of web servers and databases
[Optional] Good understanding of Docker and Kubernetes.
Good scripting knowledge and understanding of software life cycle models.
Good understanding of DevOps practices.
Should have worked on high-traffic, highly scalable systems in the past
Knowledge of the fundamental aspects of release automation (packaging, dependencies, promotion, deployment, compliance)
A passion for collecting, evaluating, and improving performance metrics.
Excellent time management, resource organization and priority establishment skills, and ability to multi-task in a fast-paced environment
Ability to work quickly and efficiently with minimal supervision
Excellent written and verbal communication skills
Should be able to handle 12-hour on-call shifts on a weekly rotation for symplr products.
Able to work the US daytime shift and coordinate with team members in the US and India to complete day-to-day tasks.

Qualifications:

Have HEART. To work here, you must be:

Humble – self-aware and respectful
Effective – measurably move the needle & immeasurably add value
Adaptable – innately curious and constantly changing
Remarkable – stand out in some way
Transparent – openly and honestly sharing knowledge
3+ years of Systems Engineering experience in the following areas:

Cloud platforms (Azure, AWS) and On-Prem Servers
Windows and Linux Servers
Application Monitoring Tools (Datadog, New Relic, AppDynamics, Application Insights)
Log Aggregation Tools (Datadog, ELK, etc.)
PowerShell, Bash, or Python scripting
CI/CD tools (Azure Pipelines, GitHub Actions, Jenkins, Octopus, etc.)
Infrastructure management tools (Terraform, Ansible, etc.)
Application Hosting (IIS, Apache, Tomcat)
Alerting (PagerDuty)
Ticketing (ADO Boards and Ivanti)
Documentation (Confluence)
Bachelor’s degree or equivalent experience

Symplr
