This isn’t your regular job.

Almedia is a place where those who want to push harder can accelerate their careers faster than anywhere else. We’re aiming to become Germany’s second bootstrapped unicorn. Almedia is already Europe’s third-fastest-growing company in 2025 (FT1000).

We are building the future of marketing by rewarding our community of over 50 million users for engaging with our advertisers’ products. We are offering a new way to acquire users for the biggest companies in the world.

At Almedia, you’ll:
- Own way more, way earlier: you’ll be trusted with responsibility fast.
- Push harder, get further: this isn’t a 9–5. We highly reward intensity.
- Join a rare environment: you will work with ambitious, high-speed, high-ownership people.
- Be fully present: we’re in the office 5 days a week to build the energising momentum we need.

Staff Site Reliability Engineer / DevOps
London or Remote

About you
- An SRE or DevOps engineer with hands-on experience in high-traffic production systems
- Strong in Linux, databases (MySQL, Postgres, MongoDB, Redis), and networking fundamentals
- Comfortable with Kubernetes, CI/CD pipelines, and observability tools like Datadog
- A self-starter who thrives in scaling environments and can work independently without PMs
- Pragmatic, able to balance prevention, maintenance, and firefighting when needed

Your mission is to
- Take ownership of uptime and reliability for a platform serving 50M users
- Build robust monitoring, alerting, and incident response practices
- Improve CI/CD pipelines and enable safe deployments (blue-green, canary)
- Partner with engineers across teams to fix pain points in infra, tooling, and reliability
- Bring initiatives that make the platform automatically reliable, cost-efficient, and scalable

Your impact
- Collaborate with engineering teams to improve operational workflows and resilience
- Design smart alerts, improve observability, and drive better performance monitoring
- Lead incident response, including on-call, and drive improvement with blameless postmortems
- Build safer delivery methods and improve deployments with Kubernetes and GitLab pipelines
- Report directly to the CTO and act as the primary reliability leader in the company

Your toolkit
- Linux, networking (TCP/IP), and distributed systems troubleshooting
- Databases: MySQL, Postgres, MongoDB, Redis
- Kubernetes, GitLab pipelines, CI/CD best practices
- Observability tools like Datadog, OpenTelemetry, or ELK stack
- Nice-to-haves: RabbitMQ, Kafka, Terraform, Ansible, GCP, Datadog

What makes this role exciting
- Be the first senior SRE hire with ownership of reliability across the entire platform
- Shape infrastructure and processes for a scale-up growing beyond 100 FTE
- Work on a product serving millions of users worldwide with real engineering challenges
- Gain autonomy while collaborating with strong product and engineering teams
- Join a culture that values pragmatism, initiative, and continuous improvement

Why Almedia?
- Scale With Almedia: Have a real impact and grow alongside a startup that has been profitable from day one.
- High-Growth Environment: We encourage all staff to take ownership of projects and consistently raise the bar.
- Do More, Get More: Generous bonus scheme to ensure great, proactive work is valued.
- We Listen: We regularly add to our benefits through rigorous employee feedback.

We believe in fostering talent, evaluating all skill levels during the hiring process, and providing a clear path for growth.

Almedia is an equal opportunity employer. We embrace and celebrate diversity, and encourage individuals from all backgrounds to apply.