AI Data Lead
$180,000 - $200,000 | New York (Remote) | Full-Time
Deeprec.ai is proud to announce our partnership with a leading AI compliance company. They use a modern tech stack to help companies stay on track with laws and regulations worldwide. This remote AI Data Lead role lets you join a forward-thinking engineering culture that embraces AI tools as practical productivity accelerators and innovation enablers.
What You’ll Do
* Own the end-to-end Client & AI Data Vault, from ingestion to AI retrieval.
* Build and scale vector databases and RAG infrastructure in production.
* Prototype chunking and embedding strategies using real client data and AI coding tools.
* Develop parsers for complex documents including PDFs, DOCX, spreadsheets, and scans.
* Design data models connecting client content to regulatory concepts and gap analysis.
* Maintain high standards for data quality, performance, testing, and engineering practices.
Requirements
* Production experience with vector databases (e.g. Qdrant, Pinecone, Weaviate, pgvector), including tuning for performance and recall
* Experience building chunking and embedding pipelines for complex documents
* Strong SQL and data modelling skills in production systems
* Experience extracting data from PDFs, DOCX, and scanned documents (incl. OCR/layout-aware parsing)
* Strong Python plus at least one systems-level language
* Experience with Azure (preferred) or AWS/GCP, CI/CD, and containers
* Hands-on experience with RAG or hybrid retrieval systems
* Effective use of AI coding assistants in development workflows
* Proven track record of shipping production AI or data systems
What Will Make You Great
* Experience with multi-tenant data architectures and isolation patterns
* Experience with Elasticsearch, OpenSearch, or similar search engines
* Background in NLP, information extraction, or document understanding
* Experience with Kafka or similar messaging systems
* Experience in regulated industries with strict audit and versioning requirements
* Contributions to open-source retrieval, embedding, or parsing tools
What You’ll Get
* A place on a small, high-impact AI team where the data layer is a core product enabler, not backend plumbing
* Direct access to leadership with fast feedback loops and real influence on architecture
* An AI-first culture that treats AI tools as productivity multipliers
* Competitive compensation, benefits, and flexible working
* Opportunity to build the core data foundation of a fast-scaling compliance intelligence platform