Example dataset — get_example_data

Returns a list of example datasets.

Usage

get_example_data()

Value

A list with the following elements:

QTL: a tibble of QTL regions, with columns id, chromosome, start, end, score and name.
GWAS: a tibble of GWAS results, with columns id, chromosome, position and score.
DE: a tibble of differential expression results, with columns gene, chromosome, padj, log2FoldChange, start, end and label.
CAN: a tibble of candidate genes, with columns id, chromosome, start, end, name and gene_name.
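
As a quick orientation, the returned list can be inspected as sketched below. This assumes the hidecan package (which provides get_example_data()) is installed; the element and column names are the ones listed above, and the object name x is arbitrary.

```r
library(hidecan)

# Load the example datasets as a named list
x <- get_example_data()

# Names of the list elements
names(x)

# Number of rows in each table
sapply(x, nrow)

# Overview of the list, one level deep
str(x, max.level = 1)

# First rows of the GWAS results table
head(x[["GWAS"]])
```

Individual tables are extracted by name, for example x[["DE"]] for the differential expression results.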

Details

The dataset used in this example is presented in Angelin-Bonnet et al., BMC Plant Biology (2023). In this study, tetraploid potato plants from a half-sibling breeding population were used to assess the genetic components of tuber bruising. Capture sequencing was used to obtain genomic information about the individuals, and a genome-wide association study (GWAS) was performed on 72,847 biallelic genomic variants obtained from 158 plants for which a bruising score was measured. The GWAS analysis was carried out with the GWASpoly package. In addition, expression data were obtained for 25,163 transcribed genes, and a differential expression (DE) analysis was carried out between 41 low- and 33 high-bruising samples. Finally, a literature search yielded a list of 42 candidate genes identified in previous studies as involved in potato tuber bruising mechanisms.

A subset of the GWAS and DE results, as well as the list of candidate genes from the literature, is made available through this function. From the complete GWAS results table, half of the genomic variants with a GWAS score < 3.5 were randomly selected and discarded, yielding a dataset with GWAS scores for 35,481 variants. Similarly, half of the transcribed genes in the DE results table with an adjusted p-value > 0.05 were randomly selected and discarded, yielding a dataset with DE results for 10,671 transcribed genes. This filtering reduces the size of the datasets (in accordance with CRAN policies) while ensuring that all significant markers and genes are retained. Finally, some of the candidate genes located on chromosome 3 were removed from the example dataset for better clarity in the resulting HIDECAN plot, leaving 32 candidate genes.

The QTL mapping data was randomly generated for illustration purposes, as no QTL mapping was performed in the original study.
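
The random subsetting of the GWAS results described in the Details can be illustrated with a minimal base R sketch. This is not the code used to build the example dataset: the toy table, its simulated scores, the seed and the object names (gwas, low_idx, drop_idx, gwas_subset) are all assumptions made for illustration. Only the 3.5 score threshold and the "discard half of the rows below it" rule come from the description.

```r
# Toy stand-in for a full GWAS results table (scores are simulated)
set.seed(1)
gwas <- data.frame(
  id = paste0("marker_", 1:1000),
  score = rexp(1000, rate = 1)
)

threshold <- 3.5

# Row indices of the variants scoring below the threshold
low_idx <- which(gwas$score < threshold)

# Randomly select half of these rows to discard
n_drop <- floor(length(low_idx) / 2)
drop_idx <- low_idx[sample.int(length(low_idx), size = n_drop)]

# Remove the selected rows; every variant at or above the threshold is kept
gwas_subset <- if (n_drop > 0) gwas[-drop_idx, , drop = FALSE] else gwas

nrow(gwas)
nrow(gwas_subset)
```

The same pattern would apply to the DE results, selecting among rows with an adjusted p-value above 0.05 instead of rows with a score below 3.5.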