We are seeking a Hands-On Data Architect to design, build, and operate a high-scale, event-driven data platform supporting payment and channel operations. The role combines strong data architecture fundamentals with deep streaming expertise and hands-on engineering in a regulated, high-throughput environment.
You will lead the evolution from legacy data ingestion patterns to a modern AWS-based lakehouse and streaming architecture that handles tens of millions of events per day, while applying domain-driven design (DDD) and data-as-a-product principles.
This is a builder role, not a documentation-only architect position.
Key Responsibilities
Data Products & Architecture
* Design and deliver core data products, including:
  * Channel Operations Warehouse (high-performance, ~30-day retention)
  * Channel Analytics Lake (long-term retention, 7+ years)
* Define and expose data APIs and status/statement services with clear SLAs.
* Architect an AWS lakehouse using S3, Glue, Athena, and Iceberg, with Redshift for BI and operational analytics (see the provisioning sketch after this list).
* Enable dashboards and reporting using Amazon QuickSight (or equivalent BI tools).
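To make the lakehouse expectation concrete, here is a minimal sketch of provisioning an Iceberg table for the Channel Analytics Lake through Athena, assuming boto3. The database, table, bucket, region, and columns are hypothetical placeholders, not details of the actual platform.

```python
# A minimal sketch, assuming boto3: provision an Iceberg table for the
# Channel Analytics Lake through Athena. Database, table, bucket, region,
# and columns are hypothetical placeholders.
import boto3

athena = boto3.client("athena", region_name="eu-west-1")

DDL = """
CREATE TABLE IF NOT EXISTS channel_analytics.payment_events (
    event_id  string,
    channel   string,
    amount    decimal(18, 2),
    currency  string,
    event_ts  timestamp
)
PARTITIONED BY (day(event_ts))
LOCATION 's3://example-analytics-lake/payment_events/'
TBLPROPERTIES ('table_type' = 'ICEBERG')
"""

athena.start_query_execution(
    QueryString=DDL,
    QueryExecutionContext={"Database": "channel_analytics"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```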
Streaming & Event-Driven Architecture
* Design and implement real-time streaming pipelines using:
  * Kafka (Confluent or Amazon MSK)
  * Amazon Kinesis / Kinesis Data Firehose
  * EventBridge for AWS-native event routing
* Define patterns for the following (illustrated in the sketch after this list):
  * Ordering, replay, retention, and idempotency
  * At-least-once and exactly-once processing
  * Dead-letter queues (DLQs) and failure recovery
* Implement CDC pipelines from Aurora PostgreSQL into Kafka and the lakehouse.
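As one illustration of these delivery-semantics patterns, the following is a minimal sketch of an at-least-once consumer with an idempotency guard and a dead-letter topic, assuming the confluent-kafka Python client. The broker address, topic names, and `process()` are hypothetical, and a production dedupe store would be durable rather than an in-memory set.

```python
# A minimal sketch, assuming the confluent-kafka client: at-least-once
# consumption with an idempotency guard and a dead-letter topic. Broker,
# topics, and process() are hypothetical; a production dedupe store would
# be durable (e.g., DynamoDB or Redis), not an in-memory set.
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "payment-events-processor",
    "enable.auto.commit": False,   # commit only after the message is handled
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["payment.events"])

seen_event_ids = set()  # idempotency guard keyed on the message key

def process(value: bytes) -> None:
    """Hypothetical business logic; replace with real processing."""
    print("processing", value)

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        try:
            if msg.key() not in seen_event_ids:  # skip duplicates on redelivery
                process(msg.value())
                seen_event_ids.add(msg.key())
        except Exception:
            # Route poison messages to a DLQ instead of blocking the partition.
            producer.produce("payment.events.dlq", key=msg.key(), value=msg.value())
            producer.flush()
        consumer.commit(message=msg)             # at-least-once semantics
finally:
    consumer.close()
```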
Event Contracts & Schema Management
* Define and govern event contracts using Avro or Protobuf.
* Manage schema evolution through Schema Registry (see the sketch after this list), including:
  * Compatibility rules
  * Versioning strategies
  * Backward and forward compatibility
* Align domain events with Kafka topics and analytical storage models.
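For example, a minimal sketch of registering an Avro event contract and pinning its subject to BACKWARD compatibility, assuming the confluent-kafka Python Schema Registry client; the registry URL, subject name, and schema are hypothetical.

```python
# A minimal sketch, assuming the confluent-kafka Schema Registry client:
# register an Avro contract and pin its subject to BACKWARD compatibility.
# The registry URL, subject, and schema are hypothetical.
from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

client = SchemaRegistryClient({"url": "http://localhost:8081"})

payment_initiated_v1 = Schema(
    schema_str="""
    {
      "type": "record",
      "name": "PaymentInitiated",
      "namespace": "payments.channel",
      "fields": [
        {"name": "event_id", "type": "string"},
        {"name": "currency", "type": "string"},
        {"name": "channel",  "type": ["null", "string"], "default": null}
      ]
    }
    """,
    schema_type="AVRO",
)

subject = "payment.events-value"  # TopicNameStrategy: <topic>-value
client.set_compatibility(subject_name=subject, level="BACKWARD")
schema_id = client.register_schema(subject, payment_initiated_v1)
print(f"registered {subject} as schema id {schema_id}")
```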
Migration & Modernization
* Assess existing “as-is” ingestion mechanisms (APIs, files, SWIFT feeds, Kafka, relational stores).
* Design and execute migration waves, cutover strategies, and rollback runbooks.
* Ensure minimal disruption during platform transitions.
Governance, Quality & Security
* Apply data-as-a-product and data mesh principles:
  * Clear ownership
  * Quality SLAs
  * Access controls
  * Retention and lineage
* Implement security best practices:
  * Data classification
  * KMS-based encryption
  * Tokenization where required
  * Least-privilege IAM
  * Immutable audit logging
Observability, Reliability & FinOps
* Build observability for streaming and data platforms using:
  * CloudWatch, Prometheus, and Grafana
* Track operational KPIs (see the metrics sketch after this list):
  * Throughput (TPS)
  * Processing lag
  * Success/error rates
  * Cost per million events
* Define actionable alerts, dashboards, and operational runbooks.
* Design for high availability with multi-AZ / multi-region patterns, meeting defined RPO/RTO targets.
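As a concrete illustration, a minimal sketch of publishing two of the KPIs above as custom CloudWatch metrics, assuming boto3; the namespace, dimensions, region, and values are hypothetical.

```python
# A minimal sketch, assuming boto3: publish two of the KPIs above as custom
# CloudWatch metrics. Namespace, dimensions, region, and values are
# hypothetical.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="eu-west-1")

cloudwatch.put_metric_data(
    Namespace="ChannelDataPlatform",
    MetricData=[
        {
            "MetricName": "ProcessingLagSeconds",
            "Dimensions": [{"Name": "Pipeline", "Value": "payment-events"}],
            "Value": 4.2,
            "Unit": "Seconds",
        },
        {
            "MetricName": "CostPerMillionEventsUSD",
            "Dimensions": [{"Name": "Pipeline", "Value": "payment-events"}],
            "Value": 0.87,
            "Unit": "None",
        },
    ],
)
```

The actionable alerts mentioned above can then be defined as CloudWatch alarms over these custom metrics.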
Hands-On Engineering
* Write and review production-grade code (see the Lambda sketch after this list) using:
  * Python, Scala, and SQL
  * Spark / AWS Glue
  * AWS Lambda and Step Functions
* Build infrastructure using Terraform (IaC).
* Implement CI/CD pipelines (GitLab, Jenkins).
* Enforce automated testing, performance profiling, and secure coding practices.
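As a small example of the expected hands-on level, here is a minimal sketch of a Python Lambda handler on a Kinesis event source using Lambda's partial-batch failure reporting, so only failed records are retried; the decoding and business logic are hypothetical.

```python
# A minimal sketch of a Python Lambda handler on a Kinesis event source,
# using Lambda's partial-batch failure reporting so only failed records
# are retried. The decoding and business logic are hypothetical.
import base64
import json

def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            # ... hypothetical business logic: validate, enrich, persist ...
        except Exception:
            # Requires ReportBatchItemFailures on the event source mapping.
            failures.append(
                {"itemIdentifier": record["kinesis"]["sequenceNumber"]}
            )
    return {"batchItemFailures": failures}
```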
Required Skills & Experience
Streaming & Event-Driven Systems
* Strong experience with Kafka (Confluent and/or Amazon MSK)
* Experience with Amazon Kinesis / Kinesis Data Firehose
* Deep understanding of:
  * Event ordering and replay
  * Delivery semantics
  * Outbox and CDC patterns (see the outbox sketch after this list)
* Practical experience using EventBridge for event routing and filtering
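To illustrate the outbox pattern named above, a minimal sketch assuming psycopg2 against Aurora PostgreSQL: the state change and the outbox event commit atomically, and a separate CDC process (e.g., Debezium) streams the outbox table to Kafka. All table, column, and connection details are hypothetical.

```python
# A minimal sketch, assuming psycopg2 against Aurora PostgreSQL: the state
# change and the outbox event commit in one transaction, and a separate CDC
# process (e.g., Debezium) streams the outbox table to Kafka. All table,
# column, and connection details are hypothetical.
import json
import uuid

import psycopg2

conn = psycopg2.connect("dbname=payments")  # connection string assumed

with conn:  # one transaction: both writes commit or roll back together
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE payments SET status = %s WHERE payment_id = %s",
            ("SETTLED", "PAY-123"),
        )
        cur.execute(
            "INSERT INTO outbox (id, aggregate_id, event_type, payload) "
            "VALUES (%s, %s, %s, %s)",
            (
                str(uuid.uuid4()),
                "PAY-123",
                "PaymentSettled",
                json.dumps({"payment_id": "PAY-123", "status": "SETTLED"}),
            ),
        )
```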
AWS Data Platform
* Hands-on experience with:
  * S3, Glue, and Athena
  * Redshift
  * Step Functions and Lambda
* Familiarity with Iceberg-based lakehouse architectures
* Experience building streaming pipelines into S3 and Glue
Payments & Financial Messaging
* Experience with payments data and flows
* Knowledge of ISO 20022 message families (see the parsing sketch after this list):
  * pain, pacs, and camt (e.g., pain.001, pacs.008, camt.053)
* Understanding of payment lifecycle, reconciliation, and statements
* Exposure to API, file-based, and SWIFT-based integration channels
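As one small example of this domain knowledge, a minimal sketch of extracting booked entries from an ISO 20022 camt.053 bank-to-customer statement with the Python standard library; the namespace is version-specific (camt.053.001.02 here) and the file path is hypothetical.

```python
# A minimal sketch, using the standard library: extract booked entries from
# an ISO 20022 camt.053 bank-to-customer statement. The namespace is
# version-specific (camt.053.001.02 here) and the file path is hypothetical.
import xml.etree.ElementTree as ET

NS = {"camt": "urn:iso:std:iso:20022:tech:xsd:camt.053.001.02"}

root = ET.parse("statement_camt053.xml").getroot()
for entry in root.findall(".//camt:Stmt/camt:Ntry", NS):
    amount = entry.find("camt:Amt", NS)           # amount carries a Ccy attribute
    indicator = entry.find("camt:CdtDbtInd", NS)  # CRDT (credit) or DBIT (debit)
    print(indicator.text, amount.text, amount.get("Ccy"))
```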
Data Architecture Fundamentals (Must-Have)
* Logical data modeling (ER diagrams, normalization up to 3NF/BCNF)
* Physical data modeling:
  * Partitioning strategies
  * Indexing
  * SCD types
* Strong understanding of:
  * Transactional vs analytical schemas
  * Star schema, Data Vault, and 3NF trade-offs
* Practical experience with:
  * CQRS and event sourcing
  * Event-driven architecture
  * Domain-driven design (bounded contexts, aggregates, domain events)