
Data Scientist (Cybersecurity Domain)
- Gurgaon, Haryana
- Permanent
- Full-time
- Design and implement scalable data pipelines for ingesting and processing network security data
- Perform data preprocessing and feature engineering to prepare data for machine learning models
- Set up and manage data storage solutions, including Elasticsearch
- Deploy machine learning models to production and apply DevOps practices to keep them maintainable
- Develop comprehensive testing strategies for data pipelines and deployed models
- Ensure data quality, integrity, and availability for machine learning models
- Collaborate with the team to optimize data flow and model performance
- Bachelor's or Master's degree in Computer Science, Software Engineering, or related field
- 3-5 years of experience in data engineering
- Strong programming skills in Python
- Expertise in big data technologies (Hadoop, Spark, Hive)
- Proficiency in SQL and experience with various database systems (PostgreSQL, MySQL, MongoDB)
- Experience with data pipeline tools (Apache Airflow)
- Familiarity with Elasticsearch for efficient data storage and retrieval
- Experience with stream processing frameworks (Apache Kafka, Apache Flink)
- Proficiency in version control systems (Git)
- Understanding of data modelling and ETL processes
- Experience with real-time data processing and analytics
- Knowledge of machine learning deployment processes
- Familiarity with network protocols and security concepts
- Experience with containerization and orchestration (Docker, Kubernetes)
- Experience with CI/CD tools (Jenkins, GitLab CI)
Expertia AI Technologies