Supporting data for "An extensible big data software architecture managing a research resource of real-world clinical radiology data linked to other health data from the whole Scottish population"

Dataset type: Imaging, Software
Data released on August 20, 2020

Nind T; Sutherland J; McAllister G; Hardy D; Hume A; MacLeod R; Caldwell J; Krueger S; Tramma L; Teviotdale R; Abdelatif M; Gillen K; Ward J; Scobbie D; Baillie I; Brooks A; Prodan B; Kerr W; Sloan-Murphy D; Herrera JR; McManus D; Morris C; Sinclair C; Baxter R; Parsons M; Morris A; Jefferson E (2020): Supporting data for "An extensible big data software architecture managing a research resource of real-world clinical radiology data linked to other health data from the whole Scottish population". GigaScience Database. http://dx.doi.org/10.5524/100780

DOI: 10.5524/100780

The aim is to enable a world-leading research dataset of routinely collected clinical images linked to other routinely collected data from the whole Scottish national population. This includes 30 million different radiological examinations from a population of 5.4 million and over 2 petabytes of data collected since 2010. Scotland has a central archive of radiological data used to directly provide clinical care to patients. We have developed an architecture and platform to securely extract a copy of that data, link it to other clinical or social data sets, remove personal data to protect privacy, and make the resulting data available to researchers in a controlled Safe Haven environment. An extensive software platform has been developed to host, extract and link data from cohorts to answer research questions. The platform has been tested on 5 different test cases and is currently being further enhanced to support 3 exemplar research projects. The available data come from a range of radiological modalities and scanner types and were collected under different environmental conditions. This “real-world”, heterogeneous data is highly valuable for training algorithms to support clinical decision making, especially for deep learning where large data volumes are required. The resource is now available for international research access. The platform and data can support new health research using Artificial Intelligence and Machine Learning technologies as well as enabling discovery science.

If you would like to access the SMI dataset for a research project, please contact eDRIS in the first instance.

File Name  Sample ID  Data Type       File Format  Size       Release Date
                      GitHub archive  zip          56.17 KB   2020-08-03
                      GitHub archive  zip          2.12 MB    2020-08-03
                      GitHub archive  zip          780.21 KB  2020-08-03
                      GitHub archive  zip          5 MB       2020-08-03
                      GitHub archive  zip          30.41 KB   2020-08-03
                      GitHub archive  zip          35.4 MB    2020-08-03
                      GitHub archive  zip          2.81 MB    2020-08-03
                      readme          TEXT         5.66 KB    2020-08-03
                      GitHub archive  zip          25.9 MB    2020-08-03
Funding body Awardee Award ID Comments
Medical Research Council A Morris MR/M501633/1
Wellcome Trust A Morris WT086113 Scottish Health Informatics Programme (SHIP)
EPSRC E Jefferson MR/S010351/1
Health Data Research UK M Parsons HDR-5012
Chief Scientist Office of the Scottish Government Health and Social Care Directorates A Morris Farr Scotland
Scottish Government E Jefferson Imaging AI
Date Action
August 20, 2020 Dataset publish
August 25, 2020 Dataset publish
September 14, 2020 Manuscript Link added: 10.1093/gigascience/giaa095
October 7, 2022 Manuscript Link updated: 10.1093/gigascience/giaa095
November 11, 2022 Manuscript Link updated: 10.1093/gigascience/giy060