Job Title: Sr. DevOps Engineer
Location: London, UK (3 days in office)
SC Cleared: Required
Job Type: Full-Time
Experience: 5-8 years
About the Role:
We are seeking an experienced Platform Engineer to build and maintain the core infrastructure and tooling used by the data scientists, economists, and developers working on our Azure Databricks Economic Data Platform. The platform is critical to our Monetary Analysis, Forecasting, and Modelling activities. You will focus on creating a self-service, scalable, and reliable platform that streamlines development workflows, simplifies infrastructure management, and improves overall productivity. The role requires a strong understanding of cloud computing (specifically Azure), infrastructure-as-code (IaC), DevOps practices, containerisation, and orchestration, along with a passion for building developer-friendly platforms.
Key Responsibilities:
Platform Design & Development:
* Design, develop, and maintain the core platform infrastructure on Azure, including networking, compute, storage, security, and identity management.
* Implement infrastructure-as-code (IaC) using tools like Terraform, ARM templates, or Bicep to automate infrastructure provisioning and management.
* Develop and maintain platform components, such as APIs, CLIs, and web interfaces, to provide self-service capabilities to users.
Azure Databricks Integration & Optimisation:
* Deeply integrate Azure Databricks into the platform, ensuring seamless access and efficient resource utilisation.
* Implement automation for Databricks workspace setup, cluster configuration, and job deployments.
* Optimise Databricks workloads for performance, scalability, and cost-effectiveness.
Containerisation & Orchestration:
* Implement and manage containerised applications and services using Docker and Kubernetes, for example Azure Kubernetes Service (AKS).
* Design and implement container orchestration strategies for deploying and scaling platform components.
CI/CD Pipeline Automation:
* Design and implement robust CI/CD pipelines for building, testing, and deploying platform components and user applications.
* Automate build processes, unit tests, integration tests, and deployment processes.
* Implement advanced deployment strategies (e.g., blue/green deployments, canary releases).
Monitoring, Logging & Alerting:
* Implement comprehensive monitoring, logging, and alerting systems to proactively identify and address performance issues, errors, and security threats.
* Use tools like Azure Monitor, Prometheus, Grafana, or similar to collect and analyse metrics, logs, and traces.
* Configure alerts and notifications to ensure timely responses to critical events.
Security & Compliance:
* Implement security best practices and controls within the platform infrastructure and CI/CD pipelines.
* Ensure compliance with relevant security standards and regulations.
* Implement security scanning and vulnerability management processes.
Documentation & Support:
* Develop and maintain comprehensive documentation for the platform, including API documentation, user guides, and troubleshooting guides.
* Provide support to users of the platform.
Collaboration & Communication:
* Collaborate closely with data scientists, economists, developers, and other stakeholders to understand their needs and gather feedback on the platform.
* Communicate technical concepts effectively to both technical and non-technical audiences.
Essential Skills & Experience:
* 5+ years of experience in platform engineering, DevOps engineering, or a related role.
* Strong experience with the Azure cloud platform and its services.
* Extensive experience with IaC tools like Terraform, ARM templates, or Bicep.
* Solid understanding of CI/CD principles and experience with CI/CD tools like Azure DevOps, Jenkins, or GitLab CI.
* Strong experience with containerisation technologies like Docker and orchestration tools like Kubernetes (or AKS).
* Experience with monitoring and logging tools.
* Scripting skills in PowerShell, Bash, and Python, and working knowledge of YAML.
* Good understanding of networking concepts and security best practices.
* Excellent problem-solving and troubleshooting skills.
* Strong communication and collaboration skills.
Desirable Skills & Experience:
* Experience with Azure Databricks and its integration with platform tooling.
* Experience with configuration management tools like Ansible, Puppet, or Chef.
* Experience building and maintaining internal developer platforms.
* Experience working in a regulated industry (e.g., financial services).
* Azure certifications (e.g., Azure DevOps Engineer Expert, Azure Administrator Associate, Azure Solutions Architect Expert).