Aims of this eQTL analysis
a. To identify genetic loci that affect mRNA expression levels in whole blood of Asian Crohn's disease patients.
b. To determine the most functionally relevant genes at established susceptibility loci for Crohn's disease and to find novel disease-associated genes using integrated analysis of GWAS and eQTL data in Asians.
Materials and methods
Total number of Crohn's disease patients included in this eQTL analysis: 101 samples from a Korean population.
Platform: Illumina HiSeq 2500, 2 x 101 bp.
Genotype information: 6,451,113 SNPs from a published GWAS dataset (Yang et al. Gastroenterology. 2016; 151(6):1096–1099.e4. doi:10.1053/j.gastro.2016.08.025).
1. FastQC v0.11.7 - data quality check
a. Total number of reads > 93 million.
b. GC content: 46-56%.
c. Adapter check.
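
The GC content check in step 1 can be illustrated with a minimal sketch. FastQC computes this internally; the function names `gc_percent` and `gc_in_range` and the toy reads below are hypothetical and only show what the 46-56% range is being compared against.

```python
def gc_percent(reads):
    """Return the percentage of G and C among called bases (A, C, G, T)."""
    gc = 0
    total = 0
    for read in reads:
        seq = read.upper()
        gc += seq.count("G") + seq.count("C")
        # Ambiguous bases such as N are ignored in the denominator.
        total += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / total if total else 0.0


def gc_in_range(reads, low=46.0, high=56.0):
    """Check whether overall GC content falls within the stated range."""
    return low <= gc_percent(reads) <= high


# Toy reads (hypothetical, not from the study).
toy_reads = ["GGCC", "AATT"]
toy_gc = gc_percent(toy_reads)
```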
2. Cutadapt v1.16 - adapter trimming
a. Quality Phred score > 33.
3. STAR 2.6.0c - mapping
a. GRCh37 reference genome in GENCODE release 19.
4. RNA-SeQC v22.214.171.124 - calculating mapped reads and mapping quality check
a. Unique mapping rate check (94~99%).
b. Ribosomal RNA ratio < 40%.
c. Mapping quality of 255 (the value STAR assigns to uniquely mapped reads), and mismatched bases ≤ 6.
d. Calculation of TPM.
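
As an illustration of the TPM calculation in step 4, the sketch below applies the standard TPM definition to one sample. RNA-SeQC performs this calculation itself; the function name `counts_to_tpm`, the use of plain gene lengths, and the toy values are assumptions made for the example.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert per-gene read counts for one sample to TPM.

    counts: read counts per gene
    lengths_bp: gene lengths in base pairs (same order as counts)
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per kilobase: divide each count by the gene length in kb.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0 for _ in rpk]
    # Rescale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rpk]


# Toy example with hypothetical values (not from the study).
toy_counts = [100, 200, 300]
toy_lengths = [1000, 2000, 3000]
toy_tpm = counts_to_tpm(toy_counts, toy_lengths)
```

Because TPM divides by gene length before rescaling, the three toy genes above receive equal TPM values even though their raw counts differ.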
5. R package: edgeR - outlier identification and normalization
a. Normalization and estimation of dispersions.
b. Outliers check using MDS plot.
c. TMM normalization using genes with TPM > 0.1 and read count > 6 in at least 20% of the 101 samples.
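
The expression filter in step 5c can be sketched as follows. This is a minimal illustration, assuming the two thresholds are evaluated independently (each must hold in at least 20% of samples); the source does not say whether they must hold in the same samples. The function name `passes_expression_filter` and the toy rows are hypothetical.

```python
def passes_expression_filter(tpm_row, count_row,
                             tpm_min=0.1, count_min=6, min_frac=0.2):
    """Decide whether one gene is kept for normalization.

    tpm_row: TPM values of the gene across samples
    count_row: read counts of the gene across the same samples
    Assumption: each threshold is checked independently across samples.
    """
    n = len(tpm_row)
    if n == 0 or n != len(count_row):
        raise ValueError("rows must be non-empty and of equal length")
    n_tpm = sum(1 for v in tpm_row if v > tpm_min)
    n_count = sum(1 for v in count_row if v > count_min)
    return n_tpm / n >= min_frac and n_count / n >= min_frac


# Toy gene measured in 10 hypothetical samples.
kept = passes_expression_filter(
    [0.5, 0.5, 0, 0, 0, 0, 0, 0, 0, 0],
    [10, 10, 0, 0, 0, 0, 0, 0, 0, 0],
)
dropped = passes_expression_filter(
    [0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [10, 0, 0, 0, 0, 0, 0, 0, 0, 0],
)
```

With 101 samples, a fraction of at least 20% corresponds to 21 or more samples.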
6. PEER v1.3 - calculation of hidden factors
a. 15 PEER factors were calculated and included as covariates.
7. FastQTL v2_184 - eQTL analysis
a. 20 covariates: 3 principal components (PCA), sex, repeat status (repeat or not), and 15 PEER factors.
b. Estimation of permutation P value.
c. 3,816 eGenes with q value < 0.05.
d. Estimation of nominal P value.
e. A total of 135,164 eQTLs passing the nominal P value threshold.
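
To illustrate the idea behind the permutation P value in step 7b, the sketch below computes a direct empirical permutation P value for a single variant-gene pair. It is not FastQTL's procedure: FastQTL tests the best variant per gene and approximates the permutation null with a beta distribution, and covariate adjustment is omitted here. The function names, the use of Pearson correlation as the test statistic, and the toy data are all assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no measurable association
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Empirical P value from shuffling expression across samples."""
    rng = random.Random(seed)
    observed = abs(pearson_r(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(genotypes, shuffled)) >= observed:
            hits += 1
    # Adding 1 to numerator and denominator avoids a P value of exactly 0.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): expression tracks genotype dosage exactly.
toy_genotypes = [0, 1, 2] * 10
toy_expression = [float(g) for g in toy_genotypes]
toy_p = permutation_pvalue(toy_genotypes, toy_expression, n_perm=200)
```

The smallest P value this direct approach can return is 1 / (n_perm + 1), which is one reason an approximation to the permutation null is useful when many genes are tested.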