Supporting data for "1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset"

Dataset type: Imaging
Data released on May 21, 2018

Litjens G; Bandi P; Bejnordi BE; Geessink O; Balkenhol M; Bult P; Halilovic A; Hermsen M; de Loo Rv; Vogels R; Manson Q; Stathonikos N; Baidoshvili A; Diest Pv; Wauters C; Dijk Mv; Laak Jv (2018): Supporting data for "1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset" GigaScience Database. http://dx.doi.org/10.5524/100439

DOI: 10.5524/100439

The presence of lymph node metastases is one of the most important factors in breast cancer prognosis. The most common strategy to assess the regional lymph node status is the sentinel lymph node procedure. The sentinel lymph node is the most likely lymph node to contain metastasized cancer cells and is excised, histopathologically processed and examined by the pathologist. This tedious examination process is time-consuming and can lead to small metastases being missed. However, recent advances in whole-slide imaging and machine learning have opened an avenue for analysis of digitized lymph node sections with computer algorithms. For example, convolutional neural networks, a type of machine learning algorithm, are able to automatically detect cancer metastases in lymph nodes with high accuracy. To train machine learning models, large, well-curated datasets are needed. We released a dataset of 1399 annotated whole-slide images of lymph nodes, both with and without metastases, in total three terabytes of data in the context of the CAMELYON16 and CAMELYON17 Grand Challenges. Slides were collected from five different medical centers to cover a broad range of image appearance and staining variations. Each whole-slide image has a slide-level label indicating whether it contains no metastases, macro-metastases, micro-metastases or isolated tumor cells. Furthermore, for 209 whole-slide images, detailed hand-drawn contours for all metastases are provided. Last, open-source software tools to visualize and interact with the data have been made available. A unique dataset of annotated, whole-slide digital histopathology images has been provided with high potential for re-use.
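The slide-level labelling described above lends itself to a quick sanity check before training: tally how many slides fall into each of the four categories. The following is a minimal sketch, not the dataset's own tooling. The column names (`slide`, `label`), the short label codes, and the example rows are all hypothetical and would need to be adapted to the actual label file shipped with the data.

```python
import csv
import io
from collections import Counter

# The four slide-level categories named in the abstract. The short codes
# used as keys are hypothetical, not taken from the dataset files.
LABELS = {
    "negative": "no metastases",
    "itc": "isolated tumor cells",
    "micro": "micro-metastases",
    "macro": "macro-metastases",
}


def count_slide_labels(csv_text, label_column="label"):
    """Count slides per category; raise on a label outside the four categories."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = row[label_column].strip().lower()
        if code not in LABELS:
            raise ValueError(f"unexpected label: {code!r}")
        counts[LABELS[code]] += 1
    return counts


# Made-up rows for illustration only; these are not real dataset records.
example = """slide,label
slide_a.tif,negative
slide_b.tif,macro
slide_c.tif,micro
slide_d.tif,negative
slide_e.tif,itc
"""

counts = count_slide_labels(example)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Rejecting unknown codes outright, rather than silently counting them, makes a mismatch between the assumed and the actual label vocabulary visible immediately.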

Additional details

Read the peer-reviewed publication(s):

(PubMed: 29860392)

Additional information:

https://github.com/GeertLitjens/ASAP

| Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes |
| normal_001 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_002 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_003 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_004 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_005 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_006 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_007 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_008 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_009 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
| normal_010 | 9606 | Human | human | Homo sapiens | Description: Histology images of lymph node used in...; Sex: female; Sample source: CAMELYON16; ... |
Displaying 1-10 of 599 Sample(s).




| File Name | Sample ID | Data Type | File Format | Size | Release Date |
| | | GitHub archive | archive | 679.44 KB | 2018-05-16 |
| | | Script | UNKNOWN | 4.22 KB | 2018-05-16 |
| | | Image | archive | 714.99 KB | 2018-05-16 |
| | | Script | archive | 5.22 KB | 2018-05-16 |
| | | Script | archive | 3.08 KB | 2018-05-16 |
| | | Image | archive | 2.59 MB | 2018-05-16 |
| | | Image | archive | 7.34 MB | 2018-05-16 |
| | | Image | archive | 329.43 KB | 2018-05-16 |
| | normal_001 | Image | TIF | 1.17 GB | 2018-05-16 |
| | normal_002 | Image | TIF | 1.46 GB | 2018-05-16 |
Displaying 1-10 of 611 File(s).

| Funding body | Awardee | Award ID | Comments |
| Stichting IT Projecten | | | |
| Fonds Economische Structuurversterking | | | tEPIS/TRAIT project |
| Fonds Economische Structuurversterking | | DFES1029161 | |
| Fonds Economische Structuurversterking | | | LSH-FES Program 2009 |
| Fonds Economische Structuurversterking | | FES1103JJTBU | |
| European Union | | 601040 | FP7-funded VPH-PRISM project |

| Date | Action |
| May 21, 2018 | Dataset publish |
| July 9, 2018 | Manuscript Link added : 10.1093/gigascience/giy065 |
| August 1, 2018 | File stage_labels.csv updated |
| November 11, 2022 | Manuscript Link updated : 10.1093/gigascience/giy065 |