Data Engineer
Maersk
- Bangalore, Karnataka
- Permanent
- Full-time
- Ingest the world: Design and maintain ingestion frameworks for high-volume, structured and unstructured data from operational systems, APIs, file drops, and events. Support streaming and batch use cases across latency windows.
- Transform at scale: Develop transformation logic using SQL, Python, Spark, and modern declarative tools like dbt or sqlmesh. You’ll handle deduplication, windowing, watermarking, late-arriving data, and more (see the first sketch after this list).
- Curate for trust: Collaborate with domain teams to annotate datasets with metadata, ownership, PII classification, and usage lineage. Enforce naming standards, partitioning schemes, and schema evolution policies.
- Optimize for the lakehouse: Work within a modern lakehouse architecture, leveraging Delta Lake, S3, Glue, and EMR, to ensure scalable performance and queryability across real-time and historical views (see the Delta Lake sketch after this list).
- Build for observability: Instrument your pipelines with quality checks, cost visibility, and lineage hooks. Integrate with OpenMetadata, Prometheus, or OpenLineage to ensure platform reliability and traceability (see the quality-check sketch after this list).
- Enable production-readiness: Support deployment workflows via GitHub Actions, Terraform, and IaC patterns. Your code will be versioned, testable, and safe for multi-tenant deployments.
- Think platform-first: Everything you build should be reusable. You’ll help codify data engineering standards, create scaffolding for onboarding new datasets, and drive automation over repetition.
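To make the transformation work concrete, here is a minimal PySpark sketch of streaming deduplication with a watermark to bound state for late-arriving events. The source path, schema, and column names (`event_id`, `event_ts`) are illustrative assumptions, not the team's actual layout.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dedup-example").getOrCreate()

# Hypothetical raw event stream landing in S3 as JSON files.
events = (
    spark.readStream
    .format("json")
    .schema("event_id STRING, event_ts TIMESTAMP, payload STRING")
    .load("s3://example-bucket/raw/events/")
)

deduped = (
    events
    # Tolerate events arriving up to 2 hours late before their state is dropped.
    .withWatermark("event_ts", "2 hours")
    # Keep the first occurrence of each event_id within the watermark window.
    .dropDuplicates(["event_id", "event_ts"])
)
```

In practice `deduped` would then be written out via `writeStream` with a checkpoint location; the point here is only the watermark-plus-dedup pattern for handling late data.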
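For the lakehouse bullet, a hedged sketch of appending a curated dataset to a Delta Lake table on S3, partitioned for queryability and allowing additive schema evolution. The paths, partition column, and Spark configuration shown are assumptions for illustration (Delta Lake is assumed to be available on the cluster, e.g. on EMR).

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("lakehouse-write-example")
    # Standard Delta Lake session extensions; exact setup depends on the cluster.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical staging dataset produced by an upstream transformation.
curated = spark.read.parquet("s3://example-bucket/staging/shipments/")

(
    curated.write
    .format("delta")
    .mode("append")
    .partitionBy("event_date")       # hypothetical partition column
    .option("mergeSchema", "true")   # allow additive schema evolution
    .save("s3://example-bucket/curated/shipments/")
)
```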
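For the observability bullet, an illustrative pattern for run-level quality checks exported through the Prometheus Pushgateway. The metric names, gateway address, and the checked key column are hypothetical, not a prescribed setup.

```python
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("quality-check-example").getOrCreate()
df = spark.read.format("delta").load("s3://example-bucket/curated/shipments/")

# Simple run-level checks: total rows and rows missing the (hypothetical) key column.
row_count = df.count()
null_keys = df.filter(F.col("shipment_id").isNull()).count()

registry = CollectorRegistry()
Gauge("pipeline_row_count", "Rows in this run", registry=registry).set(row_count)
Gauge("pipeline_null_key_count", "Rows with a null key", registry=registry).set(null_keys)

# Push per-run metrics so dashboards and alerts can track pipeline health.
push_to_gateway("pushgateway.example:9091", job="shipments_curation", registry=registry)
```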
- Python (PySpark) & SQL — Non-negotiable. Strong working proficiency in both.
- AWS — Solid understanding of AWS services beyond just data engineering (storage, compute, networking, IAM, etc.). Preference for candidates already working within the AWS ecosystem.
- Data Fundamentals & Data Pipeline Optimization — Working knowledge of optimizing pipelines for cost efficiency and resource utilization.
- Platform Engineering Mindset — A genuine interest in platform/infrastructure work, not just pipeline development. Cultural fit on this point matters; we're looking for candidates who stay committed to that focus beyond the interview stage.
- Containerization & Orchestration — Conceptual understanding or hands-on experience with Docker and Kubernetes.
- Cloud Migration / Multi-cloud — Experience with cloud migrations or working across multi-cloud environments.
- AI/ML — Any exposure to AI/ML concepts or tooling is a bonus, not a requirement.
- Infrastructure as Code (IaC) — Familiarity with IaC tooling (Terraform, CDK, etc.).
- Observability — Familiarity with tools like Grafana and Prometheus for monitoring and alerting.
- Impact at global scale: Your work will influence container journeys, terminal operations, vessel routing, and sustainability metrics across 130+ countries and $4T+ in global trade.
- Platform-level thinking: You’re not just solving one use case; you’re building primitives for others to reuse. This is your chance to shape a high-leverage internal data platform.
- Freedom to experiment: We don’t believe in checkbox engineering. You’ll have space to challenge the status quo, propose better tooling, and refine the foundations of our platform stack.
- Career-defining scope: Greenfield. Executive visibility. Cross-domain exposure. This is not a maintenance role; it’s about creating the next chapter in Maersk’s data journey.