
Staff Software Engineer
- Hyderabad, Telangana
- Permanent
- Full-time
- Cribl Pipelines: Architect and optimize large-scale data pipelines using Cribl Stream and Cribl Edge for ingestion, transformation, and routing.
- Streaming & Real-Time Processing: Design and implement real-time data pipelines using Apache Spark, Apache Flink, and Kafka Streams to handle high-throughput, low-latency observability data.
- ETL/Data Engineering: Apply advanced ETL practices to cleanse, enrich, filter, and normalize diverse data sources before downstream ingestion.
- Observability Data Management: Manage high-volume telemetry data (logs, metrics, traces, events) and design strategies for noise reduction, performance optimization, and cost control.
- Integration: Build robust integrations with Splunk, Elasticsearch, Kafka, S3, Prometheus, VictoriaMetrics, InfluxDB, and other TSDBs.
- Scalability & Performance Tuning: Ensure Cribl and streaming pipelines perform reliably at scale, handling high-cardinality and high-throughput datasets.
- Best Practices & Governance: Define and enforce observability ingestion best practices, schema governance, and data quality standards.
- Leadership & Mentorship: Guide engineers in pipeline design, streaming technologies, and observability best practices.
- Innovation: Explore emerging technologies in observability, streaming, and AI-driven analytics to continuously improve architecture.
- Experience using AI, or thinking critically about how to integrate it, in work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
- 10+ years of software/data engineering experience, including at least 5 years hands-on with Cribl (Stream/Edge).
- Strong background in ETL pipelines, real-time streaming, and distributed data processing.
- Hands-on expertise with Apache Spark (Structured Streaming), Apache Flink, and Kafka Streams.
- Deep understanding of observability data (logs, metrics, traces) and platforms such as Splunk, Elastic, Prometheus, Grafana, and OpenTelemetry.
- Experience with Time Series Databases (TSDBs) such as VictoriaMetrics, InfluxDB, TimescaleDB, or ClickHouse.
- Proficiency in scripting/programming (Python, Go, or Java) for pipeline extensions and automation.
- Strong knowledge of Kafka, S3, and cloud-native services (AWS/GCP/Azure) for data transport and storage.
- Experience with scalability, performance tuning, and cost optimization in observability pipelines.
- Strong collaboration and leadership skills to influence cross-functional teams.
- Exposure to AI/ML-based anomaly detection or predictive observability use cases.
- Previous Staff/Principal Engineer experience in large-scale data systems.