Staff AI Backend Engineer (AWS / Node.js / TypeScript)
Location: hybrid from Milton Keynes (min 3 days/month in office)
Team: Engineers across UK, US, India, and Philippines
Reports to: CTO (US)
Works closely with: Head of Engineering (India)
Why Serve First
We're a scrappy, well-funded (£4.5 million seed closed) AI startup turning raw customer feedback into real-time insight for businesses that care about CX. Our 2025 roadmap is ambitious: break apart our Node.js monolith into microservices, double our AI-driven workflows, and harden infrastructure for 100× traffic. Everyone ships. Everyone is on-call. Bureaucracy is nil. Velocity is high.
What You'll Do
Break up the monolith
Define service boundaries and lead the transition to a microservices architecture. Implement REST + SQS communication between services, containerized via ECS Fargate. You'll design services that scale, not snowball.
Own AI integration
Build features using OpenAI APIs today, and pave the path for tomorrow: think private model deployment, vector DBs, prompt orchestration frameworks, and usage monitoring. You'll lead the evolution toward multi-model support, caching layers, and Bedrock/RAG-native infrastructure.
Build and scale AI infra
Design training/inference workflows. Spin up model-serving infra on AWS (Bedrock, SageMaker, or container-based). Help make our AI systems observable, secure, and cost-efficient. You'll apply DevOps instincts to support LLM-powered production systems at scale.
Architect the future of our AI platform
We're not wrapping GPT; we're building composable infrastructure for experimentation, scale, and optional self-hosting. You'll help define boundaries between orchestration and inference, expose tracing and prompt history, and build systems that our team can iterate on without chaos.
Champion testing and correctness
Define and enforce robust testing strategies: unit, integration, and load. You'll design systems that are testable by default, with clear mocks, interface contracts, and fast CI.
Estimate and scope work
You'll own delivery for complex features, breaking them down into sensible milestones, identifying hidden risks, and clearly communicating tradeoffs. We don't over-spec; we trust senior engineers to lead the build and help shape the spec.
Make it observable
Design systems with first-class telemetry: structured logs, metrics, traces, and alerting. You'll help us make LLM behavior debuggable and traceable, from token usage to prompt mutation.
Think like a secure infra engineer
We handle sensitive customer data. You'll make security a first-class concern in system design: PII handling, IAM design, secrets management, rate limiting, and GDPR-readiness.
Ship production-ready backend code
You'll work primarily in Node.js (TypeScript soon!) using MongoDB (Mongoose), Redis, and job schedulers (cron/EventBridge).
Design cloud-flexible infra
Help keep our infrastructure cloud-agnostic. We're AWS-first today, but use modular Terraform so we can pivot to GCP for customer workloads if needed.
Mentor, review, and raise the bar
Lead code reviews, pair with engineers, and mentor the team. You'll help reinforce best practices without slowing things down. You'll also know when to lean on AI tools (and when not to).
Must-Haves
* 8+ years of backend engineering, with deep experience building distributed systems using Node.js/TypeScript on AWS.
* System design fluency, especially in event-driven, autoscaling architectures.
* Production LLM experience: You've shipped features using OpenAI, Claude, or similar APIs. You understand token limits, prompt shaping, cost tradeoffs, and context handling.
* Infra-aware AI developer: You've helped stand up inference infra via Bedrock, SageMaker, or containerized flows, and you care about performance, cost, and traceability.
* Testing mindset: You design with testability in mind, and are confident in setting up or extending CI pipelines for coverage across microservices.
* MongoDB & Redis: You can tune indexes, optimize queries, and debug performance issues at the DB level.
* Terraform & Docker: You can bootstrap infra from scratch and debug cloud deployment issues without waiting on a DevOps team.
* Clear communicator: You write well, document clearly, and thrive in async-first workflows.
* Security + compliance basics: Comfortable designing within GDPR/SOC2 requirements, with a good grasp of secure architecture patterns.
* Estimation and delivery: You've scoped, built, and delivered complex backend features with minimal PM handholding.
Nice-to-Haves
* Background in CX, survey, or analytics SaaS.
* Bedrock, LangChain, or RAG experience.
* Experience with LLMOps: prompt versioning, feedback loops, prompt/response telemetry.
* GCP infra exposure or portability experience.
* React familiarity or empathy with frontend engineers.
* Incident response and blameless postmortem experience.
What We Offer
* Competitive salary (band shared at offer stage)
* Standard UK pension
* 20 days holiday + public holidays
* Generous hardware/kit budget
* High autonomy, massive scope
* Personal and professional development budget
* Additional Perks