Opportunistic Analysis of Radiological Screening (OARS)
Supervisor: Professor Martin Graves, Professor of MR Physics (mjg40@cam.ac.uk)
Principal Supervisor Department: Radiology
Summary
Opportunistic screening makes use of imaging data from abdominal and thoracic examinations that are incidental to the main clinical reason for the scan. Most of this incidental imaging data remains underutilised, yet it holds promise for improving patient health by aiding prevention, risk assessment, and early detection of disease.
Evidence is mounting for the potential benefits of detailed analysis of such data. For example, CT imaging biomarkers, particularly those related to body composition, show promise in helping radiologists estimate the biological age of patients and forecast potential cardiometabolic issues. These capabilities can sometimes match or even surpass existing clinical benchmarks. This project will use existing open-source software for deep-learning-based segmentation for the rapid retrospective and prospective analysis of routinely acquired abdominal CT images. The segmentations can then be used to extract several quantitative biomarkers from the images, such as visceral and subcutaneous fat volumes, muscle volume, liver volume and vascular calcification.
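As a rough illustration of how a volume biomarker can be derived from a segmentation, the sketch below counts labelled voxels and scales by the voxel size. It is a minimal example under stated assumptions, not the project's pipeline: the label values, the names in LABELS, and the toy mask are hypothetical, and a real workflow would read the mask and voxel spacing from the segmentation software's output.

```python
import numpy as np

def label_volumes_ml(mask, spacing_mm, labels):
    """Return the volume in millilitres of each labelled structure.

    mask       : integer array, one label value per voxel
    spacing_mm : voxel size along each axis, in millimetres
    labels     : dict mapping a structure name to its label value (hypothetical)
    """
    voxel_ml = float(np.prod(spacing_mm)) / 1000.0  # mm^3 -> mL
    return {
        name: float(np.count_nonzero(mask == value)) * voxel_ml
        for name, value in labels.items()
    }

# Toy 4x4x4 mask standing in for a real segmentation (assumed label values).
LABELS = {"liver": 1, "visceral_fat": 2}
mask = np.zeros((4, 4, 4), dtype=np.uint8)
mask[:2, :2, :2] = 1   # 8 voxels labelled as liver
mask[2:, 2:, :] = 2    # 16 voxels labelled as visceral fat

volumes = label_volumes_ml(mask, spacing_mm=(1.0, 1.0, 5.0), labels=LABELS)
print(volumes)
```

Each voxel here is 1 x 1 x 5 mm, i.e. 0.005 mL, so the reported volumes are simply the voxel counts multiplied by that figure.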
The objective of the project will be to develop a workflow in which these biomarkers are extracted entirely automatically and returned as a structured report for the radiologist or referring clinicians to review. The various organ segmentations can then be used in a radiomics-based analysis, in which a multitude of latent image ‘texture’ features can also be extracted and compared between patients to investigate whether such features can be used for the opportunistic detection of disease, such as liver fibrosis or steatosis.
Large volumes of imaging data, i.e., ‘big data’, already exist within the Cambridge University Hospitals PACS, and data processing will be performed within the Trust’s secure Azure computing environment. Further development of the project will involve applying similar approaches to opportunistic MRI data acquisitions.
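To make the idea of per-organ image features concrete, the following sketch computes three simple first-order features (mean, standard deviation, and histogram entropy) over the voxels inside one segmented organ. This is only an illustrative stand-in for a full radiomics feature set; the function name, the intensity range, the bin count, and the toy image are all assumptions made for the example.

```python
import numpy as np

def first_order_features(image_hu, organ_mask, hu_range=(-200.0, 300.0), bins=50):
    """Compute simple first-order features inside a binary organ mask.

    image_hu   : array of CT intensities in Hounsfield units
    organ_mask : boolean array of the same shape, True inside the organ
    hu_range   : intensity range used for the histogram (assumed values);
                 voxels outside this range are ignored by the histogram
    """
    values = image_hu[organ_mask].astype(float)
    counts, _ = np.histogram(values, bins=bins, range=hu_range)
    probs = counts[counts > 0] / counts.sum()
    entropy_bits = float(-np.sum(probs * np.log2(probs)))
    return {
        "mean_hu": float(values.mean()),
        "std_hu": float(values.std()),
        "entropy_bits": entropy_bits,
    }

# Toy image: an "organ" whose voxels are half 50 HU and half 70 HU.
image = np.full((4, 4, 4), -100.0)
organ = np.zeros((4, 4, 4), dtype=bool)
organ[:2] = True
image[0] = 50.0
image[1] = 70.0

features = first_order_features(image, organ)
print(features)
```

With two equally frequent intensity values falling in different histogram bins, the entropy is exactly one bit; a more heterogeneous organ would give a higher value.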
Unravelling the causes of clonal haematopoietic expansion
Supervisor: Dr Siddhartha Kar, Group Leader & UKRI Future Leaders Fellow (sk718@medschl.cam.ac.uk)
Principal Supervisor Department: Oncology
Summary
The pervasive impacts and interaction of ageing and somatic mutation shape the landscape of human disease, particularly cancer. A ubiquitous feature of ageing is the development of somatic mutation-driven clonal expansions in aged tissues. In blood, somatic mutations in certain cancer driver genes that enhance the cellular fitness of individual haematopoietic stem cells and their progeny give rise to the common age-related phenomenon of clonal haematopoiesis (CH). CH is associated with an increased risk of blood cancers and, as we have shown for the first time in a genome-wide association study (GWAS) published in 2022 (PMID: 35835912), also increases risks for several solid tumour types. We have also demonstrated that inherited/germline genetic variants, leucocyte telomere length, and smoking have causal effects on the risk of acquiring CH. Now, in this data science PhD project, we will set out to discover the causes of CH clonal expansion, i.e., to unravel the determinants that govern the process by which a CH clone, once formed, expands and eventually progresses towards cancer. The availability of ever-larger population-level blood-based biobank data from the UK (UK Biobank, Our Future Health, INTERVAL), US (All of Us), and the Netherlands (Lifelines, which currently contains the largest collection of longitudinally-sampled individuals with CH), coupled with the recent development of methods such as “passenger-approximated clonal expansion rate” (PMID: 37046083), will enable us to “phenotype” these biobanks for CH expansion. The specific aims of this PhD are:
1. Adapt and apply emerging clonal expansion rate measurement methods designed to measure this rate from single timepoint DNA sequences in individuals with CH in global biobanks, using the longitudinal data in the Lifelines cohort to validate the methods.
2. Conduct a GWAS to identify germline genetic variants associated with CH clonal expansion rate and link variants to target genes and pathways.
3. Apply modern causal inference approaches rooted in genetics to identify lifestyle (e.g., accelerometer-derived physical activity), pathophysiological (e.g., telomere length), proteomic (e.g., C-reactive protein), and metabolomic (e.g., lipids) associations with CH clonal expansion rate.
The expansion of cellular clones carrying somatic driver gene mutations is one of the defining features of carcinogenesis and the project will be key to understanding myriad influences on this process. Our understanding, in turn, will have crucial implications for the early detection of cancer and uncover targets for pharmacologic and non-pharmacologic intervention to prevent progression to cancer in those at high risk.
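One widely used genetics-based causal inference technique of the kind referred to in aim 3 is Mendelian randomisation. The sketch below shows a fixed-effect inverse-variance weighted (IVW) estimate computed from per-variant summary statistics. It is a minimal illustration rather than the project's chosen method, and the numbers are made-up toy values, not results from any biobank.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted Mendelian randomisation estimate.

    beta_exposure : per-variant effects on the exposure (e.g. telomere length)
    beta_outcome  : per-variant effects on the outcome (e.g. clonal expansion rate)
    se_outcome    : standard errors of the outcome effects
    Returns the causal effect estimate and its standard error.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)

    ratio = by / bx              # per-variant Wald ratio estimates
    weights = bx**2 / se**2      # inverse variance of each Wald ratio
    beta = float(np.sum(weights * ratio) / np.sum(weights))
    se_beta = float(np.sqrt(1.0 / np.sum(weights)))
    return beta, se_beta

# Toy summary statistics for three hypothetical variants (illustrative only).
beta, se_beta = ivw_estimate(
    beta_exposure=[0.1, 0.2, 0.3],
    beta_outcome=[0.05, 0.10, 0.15],
    se_outcome=[0.01, 0.01, 0.01],
)
print(beta, se_beta)
```

In practice such an estimate would be accompanied by sensitivity analyses, since the IVW approach assumes that the variants affect the outcome only through the exposure.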
Functional inference of Gene Regulatory Networks dynamics (FIGd)
Supervisor: Dr Irina Mohorianu, Head of Bioinformatics/Scientific Computing (iim22@cam.ac.uk)
Principal Supervisor Department: Neurosciences
Summary
The project proposes a novel angle for the inference of Gene Regulatory Network (GRN) dynamics that considers cell-cell interactions and organism/tissue-specific transcriptomic and epigenetic signatures.
Recent single-cell -omics studies have seen significant improvements in the number and heterogeneity of cells profiled, together with increased detail on expression signatures. Advances in refining cell populations and redefining cell plasticity have revealed new challenges in understanding and predicting causality (regulatory interactions). Some of these are being addressed computationally, through tools for predicting cell-cell interactions (CellPhoneDB, NicheNet), inferring pseudotime, and inferring GRNs (the SCENIC suite). A new sequencing-based approach, RABIDseq (Clark 2021), developed and optimised for the central nervous system, allows barcoded viral tracing of cell–cell interactions at single-cell resolution.
Using an inducible system that directly reprograms human fibroblasts into induced neural stem cells (iNSCs), the Pluchino lab generated stably expandable human iNSC lines from patients with progressive multiple sclerosis and from controls. The patient-derived lines display a pro-inflammatory senescent phenotype that is also transferred horizontally to non-senescent cells in vitro. To study cell-cell communication between neighbouring cells within a complex environment in vivo, they developed a hybrid brain organoid system in which developmentally mature brain cortical organoids are cut at the air-liquid interface and iNSCs are transplanted and integrated within the tissue.
Despite the in-silico interest, there are no easy-to-access, interactive pipelines for analysing RABIDseq data; the existing tools have only command-line versions, which hinders data mining and hypothesis generation by wet-lab researchers. In addition, there is currently no link between cell-cell interactions, direct gene interactions, and the study of interaction dynamics.
With FIGd, a new pipeline for RABIDseq outputs will be developed and optimised, which will facilitate integration with other high-throughput outputs (e.g. single-cell RNAseq, ATACseq, and spatial omics). This novel approach will also permit, for the first time, the study of tissue- and organism-specific pathways, slicing the GRN dynamics not only by time but also by functional criteria.
Data science approaches to understanding causes and predicting outcomes in mental disorder and brain structure/function
Supervisor: Professor Graham Murray, Professor of Psychiatry and Neuroscience (gm285@cam.ac.uk)
Principal Supervisor Department: Psychiatry
Summary
The student will take a clinical informatics or bioinformatics approach to investigate causes and/or outcomes in mental disorder and/or related brain phenotypes.
This could involve using GWAS summary statistics for metabolomics, genomics and proteomics and relating these to mental disorder and/or brain phenotypes, using techniques such as statistical genomics and Mendelian randomisation. Alternatively or additionally, it could involve clinical data from electronic health records, in combination with biomarker data, with a focus on psychosis and/or depression and their possible relation to physical health (cardio-metabolic or immune mechanisms).