Setp 5-1: Calculate statistics value of PRS model
(Rscript create by Soyoung Jeon; update by Ying-Chu Lo)
Function: prs-statistics
After obtained combined sscore file, prs-statistics
calculates BETA, AIC, AUC, PseudoR2 and OR ratio
How to use it?
Shell:
$ gprs prs-statistics --score_file [str] --pheno_file [str] --data_set_name [str] --prs_stats_R [str] --r_command [str] --output_name [str]
Python:
from gprs.gene_atlas_model import GeneAtlasModel
if __name__ == '__main__':
geneatlas = GeneAtlasModel( ref='1000genomes/hg19',
data_dir='data/2014_GWAS_Height' )
geneatlas.prs_statistics(output_name='2014height', score_file = "/home1/ylo40816/Projects/GPRS/tmp/2014height_250_0.02_0.1.sscore",
pheno_file = "Projects/GPRS/tmp/result/plink/prs/2014height_pheno.csv",
r_command='/spack/apps/linux-centos7-x86_64/gcc-8.3.0/r-4.0.0-jfy3icn4kexk7kyabcoxuio2iyyww3o7/bin/Rscript',
prs_stats_R="Projects/GPRS/gprs/prs_stats_quantitative_phenotype.R", data_set_name="2014height",
clump_kb='250',
clump_p1='0.02',
clump_r2='0.1'
)
output files
*_stat.txt
data | filter_condition | snps_nb | P | BETA | Degree_of_freedom | PseudoR2 | OR_top1_to_middle20 | OR_top2_to_middle20 | OR_top5_to_middle20 | OR_top10_to_middle20 |
---|---|---|---|---|---|---|---|---|---|---|
2014height | 250_0.02_0.1 | 143 | 0.5632 | 0.0019 | 1007 | 0.0002 | 0.5396 | 1.3899 | 0.9408 | 1.2972 |
Setp 5-2: Combined PRS statistics results (Optional)
Function:combine-prs-stat
If you have more than one trained PRS model, combine-prs-stat
function is designed to combine statistics results.
For instance: the first PRS model was filtered with P < 0.05, the second PRS model was filtered with P < 0.0005. You will have DATA_0.05_stat.txt/DATA_0.0005_stat.txt
Combining two statistic tables allows users easy to compare between PRS models
How to use it?
Shell:
$ gprs combine-prs-stat --data_set_name [str]
Python:
from gprs.gene_atlas_model import GeneAtlasModel
if __name__ == '__main__':
geneatlas = GeneAtlasModel( ref='1000genomes/hg19',
data_dir='data/2014_GWAS_Height' )
geneatlas.combine_prs_stat(data_set_name='2014height')
output files
*_combined_stat.txt
data | filter_conditions | P | BETA | PseudoR2 | OR_top1_to_middle20 | OR_top2_to_middle20 | OR_top5_to_middle20 | OR_top10_to_middle20 |
---|---|---|---|---|---|---|---|---|
2014height | 250_0.02_0.1 | 0.5632 | 0.0019 | 0.0002 | 0.5396 | 1.3899 | 0.9408 | 1.2972 |
2014height | 250_0.02_0.1 | 0.5632 | 0.0019 | 0.0002 | 0.5396 | 1.3899 | 0.9408 | 1.2972 |