Open Targets (OT) is a unique public-private partnership working to deliver experimental data and informatics platforms that enable researchers to make more informed decisions about target selection for drug discovery. OT is a shared initiative between the European Bioinformatics Institute (EMBL-EBI), a global leader in the management, integration and analysis of public domain life science data; world-leading pharmaceutical companies GSK, Sanofi, Bristol Myers Squibb, Pfizer and Genentech; and the Wellcome Sanger Institute.
Generative AI has revolutionised the way we interact with knowledge. To benefit from the advances in LLM technology inside Open Targets, we are extending the capabilities of our platform towards LLM integration using open-source frameworks. The project aims to improve the extraction, representation, and use of scientific knowledge, and to present this knowledge to platform users in a user-friendly way. The central aims of the role described below are 1) extending the knowledge representation capabilities of the Open Targets platform towards custom knowledge graphs, and 2) interfacing with the knowledge extraction and knowledge usage teams to ensure effective knowledge representations for any given task.
Your role
We are seeking a highly skilled and motivated Research Software Engineer with expertise in Python and databases to join the AI knowledge management project for 3 years. We are open to applicants at various career stages, with particular interest in individuals who are eager to utilise cutting-edge technologies to address complex challenges in software development and informatics in the context of drug discovery. This position would be embedded within the Open Targets project team in the Saez-Rodriguez Group at the European Bioinformatics Institute and benefit from joint supervision with Sebastian Lobentanzer in the Saez-Rodriguez Group at Heidelberg University Hospital (UKHD).
You will work collaboratively across the project group with other experts in ML/AI, NLP, data integration and product delivery across ChEMBL, ePMC, Open Targets and Heidelberg University Hospital on a common goal: to integrate cutting-edge technology for knowledge extraction, representation and interpretation to help drug discovery scientists. As a crucial member of the project team, you will design, build, and operate cloud-first software that interfaces with large-scale biomedical data and drug discovery. You will contribute to developing informatics tools designed to support identifying and prioritising drug targets. Leveraging cutting-edge technologies and the expertise of our product owners and industry stakeholders, you will work in a dynamic, multidisciplinary, international environment to tackle a wide range of algorithmic and technical challenges.
As a Research Software Engineer you will be instrumental in extending our Open Targets Platform framework to include a modular knowledge graph platform. Your expertise will enhance the robustness and efficiency of our data processing and knowledge representation systems, contributing directly to our open science initiatives.
As part of a dynamic, collaborative, and international team, you will be responsible for:
1. Developing and implementing a knowledge graph framework on top of the existing data lake to improve our data sharing and analysis pipelines in support of drug discovery user stories.
2. Working closely with data provision and analysis engineers up- and downstream of the framework.
3. Working in an open-source environment, contributing to codebases and collaborating on agile development.
4. Writing clean, efficient, and readable Python code to support our internal pipelines and integrate Large Language Models.
5. Actively disseminating the outcomes of the project to the scientific community and stakeholders through well-crafted presentations and publications, community forums and blogs.
You have
1. An advanced degree (MSc, PhD) in computer science, bioinformatics, software development, or a related field.
2. Strong skills in Python and familiarity with relevant frameworks and tools.
3. Experience with databases and their Python integrations.
4. Proficiency in open-source development and version control (e.g., Git).
5. A passion for collaborative, agile development in a fast-paced environment.
6. Experience in independent problem-solving, with examples of resolving complex issues.
7. Fluency in written and spoken English.
8. Ability to effectively communicate ideas or issues and work with team members from multidisciplinary backgrounds.
You might also have
1. Understanding of the ecosystem of biomedical and/or clinical data resources.
2. Knowledge of human genetics, genomics, and/or drug discovery, or an interest in learning about these topics.
3. Experience working with knowledge graphs (e.g., https://biocypher.org) and graph databases (e.g., Neo4j).
4. Experience leveraging embeddings derived from graph-based representations and/or machine learning.
5. Experience building high-quality software and making frequent deployments as part of a regular software release process.
6. Experience working with infrastructure-as-code, continuous integration, containers, cloud infrastructure, and deployment.
7. Interest in promoting your work and the ways we have solved complex challenges.
Benefits and Contract Information
1. Financial incentives: depending on circumstances, monthly family/marriage allowance of £260 and monthly child allowance of £314 per child. Non-resident allowance of up to £532 per month. Annual salary review, pension scheme, death benefit, long-term care, accident-at-work and unemployment insurances.
2. Hybrid working arrangements.
3. Private medical insurance for you and your immediate family (including all prescriptions and generous dental & optical cover).
4. Generous time off: 30 days annual leave per year, in addition to eight bank holidays.
5. Relocation package including installation grant (as applicable).
6. Campus life: Free shuttle bus to and from work, on-site library, subsidised on-site gym and cafeteria, casual dress code, extensive sports and social club activities (on campus and remotely).
7. Family benefits: On-site nursery, child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances.
8. Contract duration: This position is a non-renewable 3-year project contract.
9. Salary: UK Equivalent £48,613.00 (Total package will be dependent on family circumstances).
10. International applicants: We recruit internationally and successful candidates are offered visa exemptions. Read more on our page for international applicants.
11. Diversity and inclusion: At EMBL-EBI, we strongly believe that inclusive and diverse teams benefit from higher levels of innovation and creative thought. We encourage applications from women, LGBTQ+ individuals, and individuals of all nationalities.
12. Job location: This role is based in Hinxton, near Cambridge, UK. You will be required to relocate if you are based overseas, and you will receive a generous relocation package to support you.
To apply, please submit a covering letter and CV via our online system. Applications will close on 20/05/2024.