Supporting data for "Chromatin conformation capture (Hi-C) sequencing of patient-derived xenografts: analysis guidelines"
Dataset type: Software, Bioinformatics
Data released on March 05, 2021
Dozmorov M; Tyc KM; Sheffield NC; Boyd DC; Olex AL; Reed J; Harrell JC (2021): Supporting data for "Chromatin conformation capture (Hi-C) sequencing of patient-derived xenografts: analysis guidelines". GigaScience Database. http://dx.doi.org/10.5524/100870
Sequencing of patient-derived xenograft (PDX) mouse models allows investigation of the molecular mechanisms of human tumor samples engrafted in a mouse host. As a result, both human and mouse genetic material is sequenced. Several methods have been developed to remove mouse sequencing reads from RNA-seq or exome sequencing PDX data and improve the downstream signal. However, for more recent chromatin conformation capture technologies (Hi-C), the effect of mouse reads remains undefined. We evaluated the effect of mouse read removal on the quality of Hi-C data using in silico created PDX Hi-C data with 10% and 30% mouse reads. Additionally, we generated two experimental PDX Hi-C datasets using different library preparation strategies. We evaluated the effect of three alignment strategies (Direct, Xenome, Combined) and three pipelines (Juicer, HiC-Pro, HiCExplorer) on Hi-C data quality. Removal of mouse reads had little-to-no effect on data quality compared with the results obtained with the Direct alignment strategy. Juicer extracted more valid chromatin interactions for Hi-C matrices, regardless of the mouse read removal strategy. However, the pipeline effect was minimal, while the library preparation strategy had the largest effect on all quality metrics. Together, our study presents comprehensive guidelines on PDX Hi-C data processing.
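The 10% and 30% mouse read fractions mentioned above can be illustrated with a small arithmetic sketch. This is not the authors' procedure; the function name `mouse_reads_needed` and the read counts are hypothetical. It only shows how many mouse reads would have to be mixed with a given number of human reads to reach a target mouse fraction of the combined total.

```python
def mouse_reads_needed(n_human: int, mouse_fraction: float) -> int:
    """Return how many mouse reads to add to n_human human reads so that
    mouse reads make up mouse_fraction of the combined read set.

    Hypothetical helper for illustration only.
    """
    if not 0 <= mouse_fraction < 1:
        raise ValueError("mouse_fraction must be in [0, 1)")
    # Solve m / (n_human + m) = f for m, giving m = f * n_human / (1 - f)
    return round(mouse_fraction * n_human / (1 - mouse_fraction))


if __name__ == "__main__":
    n_human = 90_000_000  # assumed human read count, for illustration
    for fraction in (0.10, 0.30):
        n_mouse = mouse_reads_needed(n_human, fraction)
        achieved = n_mouse / (n_human + n_mouse)
        print(f"target {fraction:.0%}: add {n_mouse} mouse reads "
              f"(achieved {achieved:.2%})")
```

Note that the fraction is defined relative to the mixed total, so reaching 30% mouse reads requires adding more than 30% of the human read count.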
Additional details
Read the peer-reviewed publication(s):
(PubMed: 33880552)
Github links:
https://github.com/dozmorovlab/PDX-HiC_processingScripts
Accessions (data generated as part of this study):
BioProject:
PRJNA668904
| Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes |
|---|---|---|---|---|---|
| UCD52_CR_Arima_rep1 | 9606 | Human | human | Homo sapiens | Description:Hi-C sequence of patient-derived xenog... Alternative accession-BioSample:SAMN16427194 Analyte type:DNA ... |
| UCD52_CR_Arima_rep2 | 9606 | Human | human | Homo sapiens | Description:Hi-C sequence of patient-derived xenog... Alternative accession-BioSample:SAMN16427195 Analyte type:DNA ... |
| UCD52_CR_Phase_rep1 | 9606 | Human | human | Homo sapiens | Description:Hi-C sequence of patient-derived xenog... Alternative accession-BioSample:SAMN16427192 Analyte type:DNA ... |
| UCD52_CR_Phase_rep2 | 9606 | Human | human | Homo sapiens | Description:Hi-C sequence of patient-derived xenog... Alternative accession-BioSample:SAMN16427193 Analyte type:DNA ... |






