Data Engineer - £350PD - Remote

Required Technical Skills

Data Pipeline & ETL
Design, build, and maintain robust ETL/ELT pipelines for structured and unstructured data
Hands-on experience with AWS Glue and AWS Step Functions
Implementation of data validation, data quality frameworks, and reconciliation checks
Strong error handling, monitoring, and retry strategies in production pipelines
Experience with incremental data processing patterns (CDC, watermarking, upserts)

AWS Data Services
Amazon S3: data lake architectures, partitioning strategies, lifecycle policies
DynamoDB: data modeling, secondary indexes, streams, and performance optimization
Amazon Redshift: foundational querying, integrations, and performance considerations
AWS Lambda for scalable data processing and orchestration
Amazon EventBridge for event-driven and decoupled data pipelines

Vector Databases & Embeddings
Strong understanding of vector database concepts, indexing strategies, and performance trade-offs
Design and implementation of embedding generation pipelines
Optimization techniques for semantic search and retrieval accuracy
Effective chunking strategies for document ingestion and processing
Experience with CockroachDB deployment and management is beneficial

Document Processing
Experience with PDF parsing libraries such as PyPDF2, pdfplumber, and AWS Textract
Integration of OCR solutions (AWS Textract, Te...