Multimodal Foundational Models
PhD Projects in Artificial Intelligence

Project Summary
In many scientific domains, including sustainability, medical and materials sciences, combining information from multiple modalities is increasingly important for performing predictive and generative tasks accurately. For example, knowledge graphs are widely used for drug repurposing and materials discovery. They are painstakingly curated as databases from information extracted from the scientific literature, empirical evidence or experimental data. Information from such knowledge graphs can improve the grounding of large language models by improving the retrieval of relevant evidence.
While various approaches exist to train inherently multimodal models, training such models is not always feasible or scalable, especially under resource constraints. Model stitching and pairing is one approach to multimodal alignment and fusion, and can serve as a potential solution for unimodal model selection or for training adapter networks. Multimodal alignment can be implicit or explicit, while multimodal fusion can occur at the data, feature or model level. In this project, the student will explore resource- and data-efficient ways of fusing graph-structured, imaging and text data for scientific applications, with a strong focus on medical sciences.
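As a rough illustration of the stitching idea, the sketch below fits a small linear adapter that maps embeddings from one frozen unimodal encoder into the embedding space of another, using paired examples. It is a minimal toy under stated assumptions, not the method the project will use: the "text" and "image" embeddings are synthetic stand-ins for encoder outputs, and the names `train_stitching_layer`, `true_map`, `text_emb` and `image_emb` are hypothetical. Only the adapter weights are trained, which is what keeps this kind of approach cheap relative to training a multimodal model end to end.

```python
import random

random.seed(0)


def matvec(W, x):
    """Apply a linear layer W (a list of rows) to the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mean_squared_error(W, src, tgt):
    """Average squared distance between mapped source and target embeddings."""
    total = 0.0
    for x, y in zip(src, tgt):
        pred = matvec(W, x)
        total += sum((p - t) ** 2 for p, t in zip(pred, y))
    return total / len(src)


def train_stitching_layer(src, tgt, lr=0.2, epochs=300):
    """Fit a linear adapter src -> tgt by full-batch gradient descent.

    Both encoders are treated as frozen: only W is updated.
    """
    d_in, d_out, n = len(src[0]), len(tgt[0]), len(src)
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(epochs):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(src, tgt):
            pred = matvec(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j]
    return W


# Synthetic paired embeddings (assumption: stand-ins for frozen encoder outputs).
# The "image" space is an exact linear function of the "text" space here,
# which real encoders would not satisfy.
true_map = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
text_emb = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
image_emb = [matvec(true_map, x) for x in text_emb]

W = train_stitching_layer(text_emb, image_emb)
final_mse = mean_squared_error(W, text_emb, image_emb)
print(f"alignment MSE after training: {final_mse:.2e}")
```

In this toy setting the adapter recovers the underlying map almost exactly because the data are noise-free and linearly related. With real encoders the residual error after stitching is itself informative: it gives one possible signal for comparing candidate unimodal models before committing to a pairing.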
Potential Supervisors
- Dr Ira Ktena (Research Scientist, EIT)
- Dr Liam Atkinson (Research Engineer, EIT)
- Dr Ben Chamberlain (Research Scientist, EIT)
- Additional Supervisor(s) from the University of Oxford
Skills Recommended
- Strong background in linear algebra and optimisation
- Experience with established model architectures (e.g. Transformers)
- Proficient in Python and experience with deep learning frameworks (PyTorch or JAX)
Skills to be Developed
- Designing novel multimodal model architectures
- Developing large scale multi-GPU training and inference codebases
- Working with graph-structured, imaging and text data
- Domain expertise in medical, life or materials science
University DPhil Courses
Relevant Background Reading
- (Almost) Free Modality Stitching of Foundation Models
- Revisiting Model Stitching to Compare Neural Representations
- Multimodal Alignment and Fusion: A Survey
- Let Your Graph Do the Talking: Encoding Structured Data for LLMs