evaluate$benchmark

Unified benchmark evaluation across PRS analyses

Description

evaluate$benchmark() orchestrates multiple evaluation analyses in one call: discrimination, calibration, pairwise comparison, similarity, risk strata, and incremental value (when baseline covariates are provided).

Usage

`evaluate$benchmark(models = NULL, on, outcome, baseline.covariates = NULL, type = c("auto", "binary", "continuous", "survival"), time = NULL, event = NULL, obs = NULL, scores.layer = X, score.mode = c("compute.if.missing", "require", "recompute"), score.args = list(), metrics = NULL, reference.model = NULL, compare.test = c("auto", "delong", "bootstrap"), similarity.source = c("scores", "variants"), quantiles = c(0.2, 0.8), bootstrap = 2000, conf.level = 0.95)`
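Several arguments in the signature (`type`, `score.mode`, `compare.test`, `similarity.source`) default to a character vector of choices. In R this pattern is conventionally resolved with `match.arg()`, which selects the first element when the argument is omitted. The sketch below illustrates that convention with a hypothetical stand-in function, `resolve_type()`. It assumes, without confirmation from this page, that PolyGenius resolves these arguments the same way.

```r
# Hypothetical stand-in, not PolyGenius code: shows how a choice-vector
# default is conventionally resolved in base R.
resolve_type <- function(type = c("auto", "binary", "continuous", "survival")) {
  # match.arg() returns the first choice when the argument is left at its
  # default and raises an error for values outside the choice set.
  match.arg(type)
}

default_type  <- resolve_type()            # argument omitted
explicit_type <- resolve_type("survival")  # explicit choice
partial_type  <- resolve_type("cont")      # match.arg() allows partial matching

# An unsupported value fails early instead of being silently accepted.
invalid <- tryCatch(resolve_type("ordinal"), error = function(e) "error")
```

Under this convention, omitting `type` is equivalent to passing `"auto"`, and a misspelled choice is rejected before any analysis runs.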

Arguments

`models`	Optional model specification.
`on`	Evaluation context (`PolyGeniusData` or genotype input). When genotype input is supplied, PolyGenius internally materializes a temporary `PolyGeniusData` object to resolve and evaluate scores.
`outcome`	Outcome definition. When `on` is `PolyGeniusData`: unquoted expression resolved on observations. When `on` is genotype input: vector of length `n_obs`, list of vectors (each length `n_obs`), or table with one or more columns and `nrow == n_obs`.
`baseline.covariates`	Optional unquoted baseline covariates expression.
`type`	Outcome type (`"auto"`, `"binary"`, `"continuous"`, `"survival"`).
`time`	Unquoted time-to-event expression (required for survival).
`event`	Unquoted event-indicator expression (required for survival).
`obs`	Optional unquoted observation subset expression.
`scores.layer`	Score layer to read/use (symbol or single string).
`score.mode`	Score resolution mode (`"compute.if.missing"`, `"require"`, `"recompute"`). If `on` is a `PolyGeniusData` object, computed scores are written into that object. If `on` is genotype input, computed scores exist only in the temporary internal evaluation data object and are not returned.
`score.args`	Named list passed to `compute$scores(...)` when needed.
`metrics`	Optional metric subset across analyses. When `NULL`, defaults are used for each analysis.
`reference.model`	Optional reference model passed to `evaluate$compare()` when pairwise comparison results should be anchored to one model.
`compare.test`	Pairwise-comparison test (`"auto"`, `"delong"`, `"bootstrap"`).
`similarity.source`	Similarity source passed to `evaluate$similarity()`. Defaults to `"scores"`.
`quantiles`	Quantiles used by risk-strata analysis.
`bootstrap`	Number of bootstrap replicates for supported analyses.
`conf.level`	Confidence level for interval estimates.
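To make the roles of `quantiles`, `bootstrap`, and `conf.level` concrete, the following base R sketch cuts simulated scores into three strata at the 20th and 80th percentiles and then builds a percentile bootstrap interval for the difference in event rate between the top and bottom strata. Everything here is an assumption for illustration: the simulated data, the helper names `assign_strata()` and `rate_difference()`, the stratum labels, and the choice of a percentile bootstrap. This page does not state how PolyGenius computes strata or intervals internally.

```r
set.seed(1)

# Simulated data (illustrative only): a standardized score and a binary
# outcome whose risk increases with the score.
n <- 2000
score <- rnorm(n)
outcome <- rbinom(n, size = 1, prob = plogis(-2 + 0.8 * score))

quantiles  <- c(0.2, 0.8)  # mirrors the documented default
n_boot     <- 2000         # mirrors `bootstrap = 2000`
conf_level <- 0.95         # mirrors `conf.level = 0.95`

# Assign each observation to a stratum using the score quantile cut points.
assign_strata <- function(score, quantiles) {
  cuts <- quantile(score, probs = quantiles, names = FALSE)
  cut(score, breaks = c(-Inf, cuts, Inf), labels = c("low", "middle", "high"))
}

# Difference in observed event rate between the top and bottom strata.
rate_difference <- function(score, outcome, quantiles) {
  strata <- assign_strata(score, quantiles)
  rates <- tapply(outcome, strata, mean)
  unname(rates["high"] - rates["low"])
}

strata      <- assign_strata(score, quantiles)
event_rates <- tapply(outcome, strata, mean)
estimate    <- rate_difference(score, outcome, quantiles)

# Percentile bootstrap: resample observations with replacement and
# recompute the statistic, including the cut points, in each replicate.
boot_stats <- replicate(n_boot, {
  idx <- sample.int(n, replace = TRUE)
  rate_difference(score[idx], outcome[idx], quantiles)
})

alpha <- 1 - conf_level
ci <- quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
```

With two cut points the observations fall into three groups, so `quantiles = c(0.2, 0.8)` corresponds to bottom 20%, middle 60%, and top 20%. Raising `bootstrap` reduces Monte Carlo noise in the interval endpoints at the cost of run time, while `conf.level` controls how much of the resampled distribution the interval covers.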

Value

A PolyGeniusEvaluation object with merged benchmark result rows. Any temporary PolyGeniusData constructed from genotype input is not returned. The returned object also carries the merged artifacts, diagnostics, and benchmark log assembled from the component evaluations.

See Also

Other evaluate: evaluate.calibration(), evaluate.compare(), evaluate.discrimination(), evaluate.incremental(), evaluate.risk.strata(), evaluate.similarity()