compute$scores
Compute Polygenic Risk Scores (PRS)
Description
Calculates the PRS values for every model in a PolyGeniusData object.
Overview:
- Collates the unique set of genetic variants across all PRS models.
- Extracts allele dosages for each variant from the genotype data.
- Matches each extracted variant to its model definition by chromosome and position, confirming which allele is the effect allele and flipping dosages when necessary.
- Applies each model’s effect sizes (beta) to the per-sample dosages and yields the PRS score.
When genotype data is split into multiple files, extraction, variant matching, and effect-size-by-dosage multiplication are performed per file to reduce computation time and memory use. The resulting per-file PRS values are summed to compute the final scores.
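The per-file strategy works because a PRS is a sum over variants, so partial sums from each file can be added together. The following is a minimal sketch of that idea on toy data, not the package's implementation; the matrices, the variant names, and the helper `partial.score` are hypothetical and exist only for illustration.

```r
# Hypothetical toy data: two genotype "files", three samples, one model.
# Rows are variants, columns are samples, values are effect-allele dosages.
samples <- c("s1", "s2", "s3")
dosages.file1 <- matrix(c(0, 1, 2,
                          1, 1, 0),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(c("1_1000_A_G", "1_2000_C_T"), samples))
dosages.file2 <- matrix(c(2, 0, 1),
                        nrow = 1,
                        dimnames = list("2_500_G_A", samples))

# Hypothetical effect sizes, keyed by variant name.
beta <- c("1_1000_A_G" = 0.5, "1_2000_C_T" = -0.2, "2_500_G_A" = 0.1)

# Hypothetical helper: weighted sum of dosages over the variants in one file.
partial.score <- function(dosages, beta) {
  drop(crossprod(dosages, beta[rownames(dosages)]))
}

# Score each file separately, then sum the partial scores per sample.
per.file <- list(dosages.file1, dosages.file2)
prs <- Reduce(`+`, lapply(per.file, partial.score, beta = beta))

# Scoring all variants at once gives the same result.
all.dosages <- rbind(dosages.file1, dosages.file2)
prs.single.pass <- drop(crossprod(all.dosages, beta[rownames(all.dosages)]))
```

Because each file is reduced to one value per sample before the next file is read, only one file's dosages need to be held in memory at a time.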
Usage
`compute$scores`(data, minor.allele.freq.threshold = 0, model.filter = function(variants) variants, models = NULL, dosages.fallback = NULL, simplify = FALSE)
Arguments
data
|
A |
minor.allele.freq.threshold
|
(Optional) A numeric in |
model.filter
|
(Optional) A function
|
models
|
(Optional) table or list of tables representing It must return a |
simplify
|
Logical indicating whether, when only a single PRS model is computed, the result should be returned as a vector ( |
Details
Variant matching: While PRS model variants are named as chr_pos_nea_ea (for effect allele ea and non-effect allele nea), genotype files may use other representations such as chr:pos:a0:a1_a2, where a0 and a1 are the two alleles, and a2 indicates the dosage allele. Matching is done by chromosome and position, accepting a match if (ea == a0 & nea == a1) or (ea == a1 & nea == a0).
Correcting allele dosages: Once matched, the dosage allele a2 is compared to the model’s effect allele ea. If a2 != ea, the dosage used is 2 - dosage.
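The matching and flipping rules above can be sketched for a single variant as follows. This is an illustrative sketch under stated assumptions, not the package's code: the helper `match.and.flip` is hypothetical, and it assumes model IDs of the form chr_pos_nea_ea and genotype IDs of the form chr:pos:a0:a1_a2 with no extra separators inside any field.

```r
# Hypothetical helper: match one genotype variant to one model variant and
# return the dosage expressed in terms of the model's effect allele.
match.and.flip <- function(model.id, geno.id, dosage) {
  m <- strsplit(model.id, "_", fixed = TRUE)[[1]]  # chr, pos, nea, ea
  g <- strsplit(geno.id, "[:_]")[[1]]              # chr, pos, a0, a1, a2
  if (length(m) != 4 || length(g) != 5) return(NULL)

  nea <- m[3]; ea <- m[4]
  a0 <- g[3]; a1 <- g[4]; a2 <- g[5]

  # Match by chromosome and position, then check the allele pair
  # in either orientation.
  same.locus <- m[1] == g[1] && m[2] == g[2]
  alleles.ok <- (ea == a0 && nea == a1) || (ea == a1 && nea == a0)
  if (!(same.locus && alleles.ok)) return(NULL)

  # If the dosage counts the non-effect allele, convert it to 2 - dosage.
  flipped <- a2 != ea
  list(flipped = flipped, dosage = if (flipped) 2 - dosage else dosage)
}

# Dosage allele is A but the effect allele is G, so dosages are flipped.
res.flipped <- match.and.flip("1_1000_A_G", "1:1000:A:G_A", c(0, 1, 2))

# Dosage allele already equals the effect allele, so dosages are unchanged.
res.kept <- match.and.flip("1_1000_A_G", "1:1000:G:A_G", c(0, 1, 2))

# Alleles do not correspond to the model, so there is no match.
res.none <- match.and.flip("1_1000_A_G", "1:1000:A:C_A", c(0, 1, 2))
```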
Debug information: When debug output is enabled by the caller, implementation-specific diagnostic files may be written alongside score outputs:
- PLINK .traw files of extracted variant dosages.
- PLINK .afreq and .log files with allele frequency calculations.
- Matched variant .traw.prescores files containing matched and flipped allele dosages with columns: variant, flipped, beta, model.id, and one column per sample.
Note: Writing these files can be time-consuming and may produce large output. You can set the number of threads with data.table's setDTthreads() to improve performance.
Value
Invisibly returns the input PolyGeniusData object with the following additions:
- A matrix of PRS values (samples × models), stored in data$scores or data$layers[[key]].
- A log entry under data$log$misc containing a variant-matching summary for diagnostics.