Association Comparisons

Use association comparisons when the question is not just whether an association exists, but whether associations, groups, or model terms differ.

The comparison interface is question-first: the type argument names the question being asked.

associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  covariates = c(age, PCA),
  by = sex,
  type = "heterogeneity"
)

associate$compare() compares association effects or model terms. It is separate from evaluate$compare(), which compares predictive performance between PRS models.

Comparison Types

Type           Question                                                    Typical model idea
heterogeneity  Does the X-Y association differ by group?                   interaction or equivalent pooled test
group          Do outcome levels or survival curves differ across groups?  grouped predictor, log-rank, or omnibus test
contrast       Which groups or terms differ from each other?               pairwise or specified contrasts
nested         Does adding X improve the model beyond covariates?          reduced vs full model comparison

Heterogeneity

Use heterogeneity comparisons when the scientific question is:

Is the association between X and Y different across levels of by?

het <- associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  covariates = c(age, PCA),
  by = sex,
  type = "heterogeneity"
)

For linear and logistic models, this corresponds to a pooled interaction-style test:

demented ~ PRS * sex + age + PCA

This pooled test is the correct formal comparison. Do not infer heterogeneity from one subgroup p-value being significant while another is not: a difference in significance is not itself a significant difference.
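
As a rough illustration of the pooled test, the sketch below fits the interaction model with base R glm() on simulated data and tests the PRS-by-sex term with a likelihood-ratio test. The simulated data, the single PCA column, and the choice of a likelihood-ratio rather than a Wald test are assumptions made for illustration; nothing here is taken from the internals of associate$compare().

```r
# Simulated data for illustration only; column names mirror the example above.
set.seed(1)
n   <- 500
dat <- data.frame(
  PRS = rnorm(n),
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  age = rnorm(n, mean = 75, sd = 6),
  PCA = rnorm(n)  # assumption: one principal component stands in for PCA
)
# Build in a PRS effect that is stronger in one sex.
eta <- -0.5 + (0.3 + 0.4 * (dat$sex == "M")) * dat$PRS + 0.03 * (dat$age - 75)
dat$demented <- rbinom(n, 1, plogis(eta))

# Pooled models without and with the PRS-by-sex interaction.
fit.main <- glm(demented ~ PRS + sex + age + PCA, family = binomial, data = dat)
fit.int  <- glm(demented ~ PRS * sex + age + PCA, family = binomial, data = dat)

# One test of the interaction term gives one p-value for heterogeneity.
het.test <- anova(fit.main, fit.int, test = "Chisq")
het.p    <- het.test[["Pr(>Chi)"]][2]
print(het.test)
```

The point of the design is that both sexes contribute to a single model, so the test addresses the difference in slopes directly instead of comparing two separate subgroup fits.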

Group Comparisons

Use group comparisons when by defines the groups whose outcomes should be compared.

Unadjusted survival-curve comparison:

tertile.km <- associate$compare(
  data,
  outcome = demented,
  time = age.death,
  by = prs.tertile,
  model = "km",
  type = "group"
)

Adjusted Cox group comparison:

tertile.cox <- associate$compare(
  data,
  outcome = demented,
  time = age.death,
  by = prs.tertile,
  covariates = c(sex, PCA),
  model = "cox",
  type = "group"
)

The KM version asks whether survival curves differ. The Cox version asks whether hazards differ after adjustment.
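
The two questions can be sketched with the survival package directly. The example below runs an unadjusted log-rank test and an adjusted Cox omnibus test on simulated data. The simulated data, the single PCA column, and the use of a likelihood-ratio test for the Cox omnibus comparison are illustrative assumptions, not a description of what associate$compare() does internally.

```r
library(survival)

# Simulated data for illustration only; column names mirror the examples above.
set.seed(2)
n   <- 600
dat <- data.frame(
  prs.tertile = factor(sample(c("Low", "Mid", "High"), n, replace = TRUE),
                       levels = c("Low", "Mid", "High")),
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  PCA = rnorm(n)  # assumption: one principal component stands in for PCA
)
# Event times with a hazard that rises across tertiles, plus uniform censoring.
rate          <- 0.05 * exp(0.3 * (as.integer(dat$prs.tertile) - 1))
event.time    <- rexp(n, rate)
censor.time   <- runif(n, 5, 30)
dat$age.death <- 65 + pmin(event.time, censor.time)
dat$demented  <- as.integer(event.time <= censor.time)

# Unadjusted: log-rank test of equal survival curves across tertiles.
lr   <- survdiff(Surv(age.death, demented) ~ prs.tertile, data = dat)
lr.p <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

# Adjusted: Cox models with and without the tertile term.
cox.full <- coxph(Surv(age.death, demented) ~ prs.tertile + sex + PCA, data = dat)
cox.red  <- coxph(Surv(age.death, demented) ~ sex + PCA, data = dat)

# Likelihood-ratio omnibus test on 2 degrees of freedom (3 tertiles - 1).
cox.lrt <- 2 * (cox.full$loglik[2] - cox.red$loglik[2])
cox.p   <- pchisq(cox.lrt, df = 2, lower.tail = FALSE)
```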

Contrasts

Use contrasts when an omnibus group test is not enough and the question is:

Which levels differ?

tertile.contrasts <- associate$compare(
  data,
  outcome = demented,
  time = age.death,
  by = prs.tertile,
  covariates = c(sex, PCA),
  model = "cox",
  type = "contrast",
  contrast = "pairwise",
  p.adjust.method = "BH"
)

Expected rows include comparisons such as:

  • Mid - Low;
  • High - Low;
  • High - Mid.

Pairwise rows should carry adjusted p-values because multiple contrasts are being tested.
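
A minimal sketch of how such rows could be derived from a single Cox fit follows. It builds the three pairwise contrasts from the fitted coefficients and their covariance matrix, then applies the Benjamini-Hochberg adjustment with p.adjust(). The simulated data, the Wald-type contrast test, and the output column layout are assumptions for illustration.

```r
library(survival)

# Simulated data for illustration only.
set.seed(3)
n   <- 600
dat <- data.frame(
  prs.tertile = factor(sample(c("Low", "Mid", "High"), n, replace = TRUE),
                       levels = c("Low", "Mid", "High")),
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  PCA = rnorm(n)  # assumption: one principal component stands in for PCA
)
rate          <- 0.05 * exp(0.3 * (as.integer(dat$prs.tertile) - 1))
event.time    <- rexp(n, rate)
censor.time   <- runif(n, 5, 30)
dat$age.death <- 65 + pmin(event.time, censor.time)
dat$demented  <- as.integer(event.time <= censor.time)

fit <- coxph(Surv(age.death, demented) ~ prs.tertile + sex + PCA, data = dat)

# Low is the reference level, so only Mid and High have coefficients.
idx <- c("prs.tertileMid", "prs.tertileHigh")
L   <- rbind(
  "Mid - Low"  = c( 1, 0),
  "High - Low" = c( 0, 1),
  "High - Mid" = c(-1, 1)
)

# Contrast estimates and standard errors on the log-hazard scale.
est <- drop(L %*% coef(fit)[idx])
se  <- sqrt(diag(L %*% vcov(fit)[idx, idx] %*% t(L)))
z   <- est / se
p   <- 2 * pnorm(-abs(z))

pairwise <- data.frame(
  contrast    = rownames(L),
  estimate    = est,
  se          = se,
  statistic   = z,
  p.value     = p,
  adj.p.value = p.adjust(p, method = "BH"),
  row.names   = NULL
)
print(pairwise)
```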

Nested Model Comparisons

Use nested comparisons when the question is:

Does adding X improve the model beyond covariates?

nested <- associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  covariates = c(age, sex, PCA),
  type = "nested"
)

This compares a reduced model:

demented ~ age + sex + PCA

against a full model:

demented ~ PRS + age + sex + PCA

This is an inferential model comparison. Predictive added value is related but belongs to evaluate$incremental() or evaluate$compare().
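
For a logistic outcome, the reduced-versus-full comparison can be sketched in base R as a likelihood-ratio test between the two formulas shown above. The simulated data and the single PCA column are assumptions for illustration, and the test used internally by associate$compare() may differ.

```r
# Simulated data for illustration only.
set.seed(4)
n   <- 500
dat <- data.frame(
  PRS = rnorm(n),
  age = rnorm(n, mean = 75, sd = 6),
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  PCA = rnorm(n)  # assumption: one principal component stands in for PCA
)
dat$demented <- rbinom(n, 1, plogis(-0.5 + 0.4 * dat$PRS + 0.03 * (dat$age - 75)))

# Reduced model: covariates only. Full model: covariates plus PRS.
fit.reduced <- glm(demented ~ age + sex + PCA, family = binomial, data = dat)
fit.full    <- glm(demented ~ PRS + age + sex + PCA, family = binomial, data = dat)

# Likelihood-ratio test of the added term.
nested.test <- anova(fit.reduced, fit.full, test = "Chisq")
nested.p    <- nested.test[["Pr(>Chi)"]][2]
print(nested.test)
```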

Relationship To split.by

split.by can be used to repeat a comparison across another grouping variable:

cmp <- associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  by = sex,
  split.by = cohort,
  type = "heterogeneity"
)

This means:

In each cohort, test whether the PRS association differs by sex.

The formal comparison is still defined by by (here, sex); split.by only repeats that comparison within each level of cohort.
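
The stratified reading can be sketched in base R by splitting the data on cohort and running the same pooled interaction test inside each piece. The simulated data, the cohort labels, and the one-row-per-cohort output layout are assumptions for illustration.

```r
# Simulated data for illustration only.
set.seed(5)
n   <- 900
dat <- data.frame(
  PRS    = rnorm(n),
  sex    = factor(sample(c("F", "M"), n, replace = TRUE)),
  cohort = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)
eta <- -0.5 + (0.3 + 0.4 * (dat$sex == "M")) * dat$PRS
dat$demented <- rbinom(n, 1, plogis(eta))

# Within each cohort, test whether the PRS association differs by sex.
het.by.cohort <- lapply(split(dat, dat$cohort), function(d) {
  fit.main <- glm(demented ~ PRS + sex, family = binomial, data = d)
  fit.int  <- glm(demented ~ PRS * sex, family = binomial, data = d)
  lrt      <- anova(fit.main, fit.int, test = "Chisq")
  data.frame(statistic = lrt$Deviance[2], p.value = lrt[["Pr(>Chi)"]][2])
})

# One heterogeneity row per cohort.
res <- data.frame(
  cohort = names(het.by.cohort),
  do.call(rbind, het.by.cohort),
  row.names = NULL
)
print(res)
```

Note that cohort never enters a model formula here: it only decides which rows each test sees.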

Returned Object

Comparison results should return a PolyGeniusAssociations object, just like other association workflows. Omnibus comparisons may not have a single meaningful estimate; in those rows, the statistic and p-value are the primary inferential output.

comparison: Model Contrast

Column                           Meaning
comparison.type                  Planned comparison type: heterogeneity, group, contrast, or nested.
outcome, predictor, by           Outcome, focal predictor where applicable, and comparison-defining variable.
contrast                         Pairwise or named contrast label where applicable.
family / model                   Model family used for the comparison.
estimate, se, lower, upper       Effect estimate and uncertainty where the comparison has a coefficient-like scale.
statistic, p.value, adj.p.value  Primary comparison test result.
n, n.events, n.competing         Sample-size and event counts where relevant.
fit.id                           Artifact/diagnostic key.

Added Artifacts

Comparison artifacts depend on the comparison type. Survival group comparisons reuse the survival-family artifacts (curves, risk.table, group.summary). Contrast and nested-model comparisons may attach model diagnostics or contrast tables when those details do not belong in the one-row summary table.

km: Survival Group Comparison

Column                           Meaning
family                           "km".
outcome, time, predictor         Event indicator, follow-up time, and grouped predictor.
term, term.type                  Grouped comparison row with term.type = "omnibus".
estimate, se, lower, upper       Usually NA; the log-rank statistic is the inferential result.
statistic, p.value, adj.p.value  Log-rank chi-square statistic and p-values.
n, n.events                      Analysis sample size and number of events.
effect.scale                     "median.time" for group summaries and survival plotting.
formula, fit.id                  Resolved Kaplan-Meier formula and artifact/diagnostic key.

Added Artifacts

Artifact       Meaning
curves         Survival curve steps for each group level.
risk.table     Numbers at risk, events, and censoring over time by group.
group.summary  Group-level records, events, median survival, and confidence intervals.
profile.table  Analysis-ready observations with time, event, and group level.

cox: Adjusted Survival Group Comparison

Column                           Meaning
family                           "cox".
outcome, time, predictor         Event indicator, follow-up time, and tested predictor as represented in formula.
term, term.type                  Cox coefficient row, usually main.
estimate                         Cox coefficient on the log-hazard scale.
se, lower, upper                 Standard error and confidence interval on the log-hazard scale.
statistic, p.value, adj.p.value  Wald/z statistic and p-values.
n, n.events                      Analysis sample size and number of events.
effect.scale                     "log.hazard"; forest plots display hazard ratios.
formula, fit.id                  Resolved survival formula and artifact/diagnostic key.
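
Because cox rows are stored on the log-hazard scale while forest plots display hazard ratios, the conversion is a plain exponentiation of the estimate and its interval bounds. The sketch below shows that step on two hypothetical rows; the numbers are made up for illustration and are not results from any analysis, and the 95% Wald interval is an assumption about how lower and upper are formed.

```r
# Hypothetical cox rows on the log-hazard scale (made-up values, not real results).
rows <- data.frame(
  term     = c("prs.tertileMid", "prs.tertileHigh"),
  estimate = c(0.10, 0.35),
  se       = c(0.08, 0.09)
)

# Assumed 95% Wald interval on the log-hazard scale.
z          <- qnorm(0.975)
rows$lower <- rows$estimate - z * rows$se
rows$upper <- rows$estimate + z * rows$se

# Exponentiate to obtain hazard ratios and their interval for display.
forest <- data.frame(
  term     = rows$term,
  hr       = exp(rows$estimate),
  hr.lower = exp(rows$lower),
  hr.upper = exp(rows$upper)
)
print(forest)
```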

Added Artifacts

Artifact       Meaning
curves         Analysis-level survival curve for the fitted Cox analysis.
risk.table     Numbers at risk, events, and censoring over time.
group.summary  Analysis-level records, events, and median survival summaries where available.
profile.table  Analysis-ready observations with time, event, predictor, and linear predictor.

Plotting

Comparison rows can be shown as:

  • forest plots for contrast-like rows;
  • heatmaps for many outcomes or predictors;
  • survival curves for KM and survival group comparisons;
  • summary tables for omnibus heterogeneity or nested-model tests.

For survival group comparisons, curve and risk-table artifacts are stored on the returned object and read by:

visualize$associations$survival(tertile.km)

Comparison Versus Prediction

Use associate$compare() when the estimand is about association effects, groups, or model terms.

Use evaluate$compare() when the estimand is predictive performance:

evaluate$compare(
  models = c(model_a, model_b),
  on = data,
  outcome = demented
)

This returns delta-style performance metrics, such as differences in AUC, RMSE, or C-index, depending on the outcome type.
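
To make the distinction concrete, the sketch below computes a difference in AUC between two logistic models in base R. The helper auc(), the simulated data, and the in-sample evaluation are assumptions for illustration; it does not reproduce evaluate$compare(), and a real comparison would normally use held-out data and report uncertainty for the difference.

```r
# Rank-based (Mann-Whitney) AUC; a hypothetical helper, not part of the package.
auc <- function(score, y) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Simulated data for illustration only.
set.seed(6)
n   <- 500
dat <- data.frame(PRS = rnorm(n), age = rnorm(n, mean = 75, sd = 6))
dat$demented <- rbinom(n, 1, plogis(-0.5 + 0.5 * dat$PRS + 0.03 * (dat$age - 75)))

# Two candidate models: covariates only, and covariates plus PRS.
model_a <- glm(demented ~ age, family = binomial, data = dat)
model_b <- glm(demented ~ age + PRS, family = binomial, data = dat)

# Predictive performance of each model and their difference (in-sample).
auc.a     <- auc(fitted(model_a), dat$demented)
auc.b     <- auc(fitted(model_b), dat$demented)
delta.auc <- auc.b - auc.a
```

The estimand here is a difference in discrimination between models, not a coefficient or a test of a model term, which is why it sits outside associate$compare().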