
Senior Observability Engineer - Splunk
- Gurgaon, Haryana
- Permanent
- Full-time
Join a world-class team of skilled engineers who build creative digital solutions to support our colleagues and clients. We make a broad organizational impact by delivering cutting-edge technology solutions that power Gartner. Gartner IT values its culture of nonstop innovation, an outcome-driven approach to success, and the notion that great ideas can come from anyone on the team.

About the role:
Responsible for the management and coordination of day-to-day and strategic operations of our log analysis framework to advance the capabilities of our IT organizations which will reduce MTTR and increase our ability to deliver timely data to support business velocity.

What you will do:
- Develop L0-L2 SOPs related to the operational support of the logging framework
- Collect and report relevant KPIs that clearly show value/ROI and progression of the log analysis service
- Stay abreast of emerging technology advancements in the current logging platform and/or open-source alternatives, including implementation of pilots and/or POCs/POVs.
- Recognize and onboard new data sources into Splunk, analyze data for anomalies and trends, and build relevant dashboards/alerts that improve visibility.
- Responsible for the installation, configuration, and ongoing administration of Cribl environments, ensuring efficient data routing, transformation, and delivery to downstream systems.
- Collaborate with cross-functional teams to optimize log pipelines and maintain system reliability.
- Manages and maintains Cribl Stream infrastructure, including pipeline configuration, performance monitoring, and troubleshooting. Ensures secure, efficient, and compliant data flows to support organizational observability and security needs.
- Develop/refine the organization's pattern-based automated log ingestion via tight integration with existing/emerging technology pipelines and/or create a robust and repeatable onboarding process
- Ensure proper operation and performance of Splunk index cluster, search heads, other backend components, universal forwarders, modules/plug-ins, and connectors.
- Standardize Splunk agent deployment, configuration, and maintenance across multiple configuration management systems
- Develop, manage, and maintain the organization's Event Management Framework.
- Administers and maintains Grafana environments, ensuring reliable dashboard performance and secure user access.
- Designs and develops interactive Grafana dashboards for real-time data visualization and monitoring.
- Manages and optimizes ClickHouse database clusters to ensure high performance, availability, and data integrity.
- Utilizes ClickHouse for efficient querying and analysis of large-scale datasets to support business insights.
- Educate/mentor junior team members to grow their capabilities and skills.
4+ years in a role supporting the operational needs of a relevant enterprise log analysis framework. Bachelor's degree in Computer Science or a related discipline, or equivalent work experience.

Must have:
- In-depth experience installing, configuring, and maintaining log analysis, visualization, and next-gen pipeline tools such as Splunk, Grafana, ClickHouse, and Cribl.
- Basic familiarity with a wide array of IT monitoring tools, ITIL and DevOps frameworks, and ITSM tools
- Proficiency in leveraging regular expression patterns
- Understanding of Windows Server and Linux Operating Systems Administration
- Hands-on, practical experience with log aggregation related to cloud platforms, serverless compute, and microservices (Lambda, Docker, SSM, RDS)