We are seeking a highly skilled Senior Data Engineer to join our systems engineering team. In this role, you will be at the forefront of innovation, designing and maintaining the robust data architectures that power mission-critical AI and NLP initiatives. Since we operate in a highly regulated and secure environment, you will focus heavily on on-premise infrastructure, ensuring our Generative AI capabilities are powerful, private, and resilient.

Key Responsibilities

Architect & Build: Design, develop, and optimize scalable data pipelines (ETL/ELT) within air-gapped or on-premise data centers.
AI Integration: Engineer data structures specifically for Natural Language Processing (NLP) and Large Language Models (LLMs), including local vector databases and private model hosting.
Infrastructure Management: Manage and scale on-premise big data clusters, ensuring high availability without reliance on public cloud providers.
Data Governance: Maintain rigorous data quality and security standards, crucial for sensitive engineering, while managing complex datasets from disparate sources.
Collaboration: Work alongside Data Scientists to transition GenAI prototypes into production-ready, locally-hosted solutions.

Technical Requirements

Languages: Expert-level Python and advanced SQL.
Data Engineering: Experience with ETL/ELT and orchestration tools like Apache Airflow or NiFi.
On-Prem Tech: Proficiency with Hadoop/HDFS, Spark, and containerization via Docker/Kubernetes (K3s/OpenShift).
AI/ML: Practical experience with NLP (HuggingFace) and GenAI frameworks (LangChain) tailored for local execution.
Databases: Experience with PostgreSQL and on-prem Vector DBs (e.g., Milvus, Qdrant, or pgvector).
Security: Experience working within Linux-based secure environments and air-gapped networking.

Essential Qualifications

Education: A degree in Computer Science, Data Engineering, Mathematics, or a related technical field.
Security Clearance: Must be eligible for high-level security clearance (SC or DV level).
Technical Rigor: A "security-first" mindset with the ability to troubleshoot complex hardware/software interactions on-site.