
Data Engineer with AWS
- Chennai, Tamil Nadu
- Permanent
- Full-time
- Data Pipeline Development: Design, build, and maintain robust, scalable, and efficient ETL/ELT pipelines for ingesting, transforming, and loading data from various sources into our data lake and data warehouse.
- AWS Expertise: Develop and manage data solutions using a wide range of AWS services, including but not limited to:
  - Storage: S3 (for data lakes), RDS (PostgreSQL, MySQL), Redshift (data warehousing), DynamoDB
  - Compute/Processing: Glue, EMR (Spark), Lambda, Athena
  - Orchestration/Workflow: AWS Step Functions, AWS Data Pipeline
  - Networking & Security: VPC, IAM, Security Groups
- Apache Airflow Orchestration: Develop, deploy, and manage complex Directed Acyclic Graphs (DAGs) in Apache Airflow to schedule, monitor, and automate data workflows. Implement best practices for Airflow DAG design, error handling, logging, and monitoring. Optimize Airflow performance and manage Airflow environments (e.g., using Amazon MWAA or self-hosted).
- Overall 8+ years of experience as a Data Engineer, with over 5 years of relevant experience in AWS.
- Programming: Strong proficiency in Python for data manipulation, scripting, and automation.
- SQL: Expert-level SQL skills for querying, manipulating, and optimizing data in relational and analytical databases.
- AWS Services: Proven hands-on experience designing, building, and operating data solutions on AWS (S3, Glue, Redshift, EMR, Lambda, RDS, Step Functions, etc.).
- Apache Airflow: In-depth experience with Apache Airflow for data pipeline orchestration, including DAG development, custom operators, sensors, and managing Airflow environments.
- ETL/ELT: Solid understanding and practical experience with ETL/ELT methodologies and tools.