Association Comparisons

Use association comparisons when the question is not just whether an association exists, but whether associations, groups, or model terms differ.

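The comparison interface is question-first. The type argument names the question being asked; the other arguments identify the variables involved: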

associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  covariates = c(age, PCA),
  by = sex,
  type = "heterogeneity"
)

associate$compare() compares association effects or model terms. It is separate from evaluate$compare(), which compares predictive performance between PRS models.

Comparison Types

Type	Question	Typical model idea
`heterogeneity`	Does the `X`-`Y` association differ by group?	interaction or equivalent pooled test
`group`	Do outcome levels or survival curves differ across groups?	grouped predictor, log-rank, or omnibus test
`contrast`	Which groups or terms differ from each other?	pairwise or specified contrasts
`nested`	Does adding `X` improve the model beyond covariates?	reduced vs full model comparison

Heterogeneity

Use heterogeneity comparisons when the scientific question is:

Is the association between X and Y different across levels of by?

het <- associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  covariates = c(age, PCA),
  by = sex,
  type = "heterogeneity"
)

For linear and logistic models, this corresponds to a pooled interaction-style test:

demented ~ PRS * sex + age + PCA

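This is the correct formal comparison. Do not infer heterogeneity from one subgroup p-value being significant while another is not: a difference in significance is not a test of the difference between the subgroup effects.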
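To make the pooled test concrete, here is a minimal base R sketch on simulated data. It fits the main-effects model and the interaction model, then compares them with a likelihood-ratio test. The simulated effect sizes, the single `age` covariate, and the use of `glm()` with `anova()` are assumptions for illustration only; they are not taken from the internals of associate$compare().

```r
# Illustrative sketch on simulated data; not the package's implementation.
set.seed(1)
n   <- 2000
sex <- factor(sample(c("F", "M"), n, replace = TRUE))
age <- rnorm(n, mean = 70, sd = 8)
PRS <- rnorm(n)

# Assumed truth: the PRS log-odds slope is 0.2 in one sex and 0.8 in the other.
slope    <- ifelse(sex == "M", 0.8, 0.2)
demented <- rbinom(n, 1, plogis(-1 + slope * PRS + 0.03 * (age - 70)))
dat      <- data.frame(demented, PRS, sex, age)

# Main-effects model versus pooled interaction model.
fit.main <- glm(demented ~ PRS + sex + age, family = binomial, data = dat)
fit.int  <- glm(demented ~ PRS * sex + age, family = binomial, data = dat)

# One likelihood-ratio test of the PRS:sex term answers the heterogeneity question.
het.test <- anova(fit.main, fit.int, test = "Chisq")
het.p    <- het.test[2, "Pr(>Chi)"]
print(het.test)
```

The single p-value in `het.p` refers to the difference between the sex-specific slopes, which is what two separate subgroup p-values cannot provide.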

Group Comparisons

Use group comparisons when by defines the groups whose outcomes should be compared.

Unadjusted survival-curve comparison:

tertile.km <- associate$compare(
  data,
  outcome = demented,
  time = age.death,
  by = prs.tertile,
  model = "km",
  type = "group"
)

Adjusted Cox group comparison:

tertile.cox <- associate$compare(
  data,
  outcome = demented,
  time = age.death,
  by = prs.tertile,
  covariates = c(sex, PCA),
  model = "cox",
  type = "group"
)

The KM version asks whether the unadjusted survival curves differ across groups (a log-rank test). The Cox version asks whether hazards differ across groups after adjusting for the covariates.
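The two questions can be sketched with the `survival` package on simulated data. This is only an assumption about what an equivalent analysis could look like: the simulated hazards, the column names, and the choice of a likelihood-ratio test for the adjusted omnibus comparison are illustrative and are not confirmed as the package's method.

```r
library(survival)

# Illustrative sketch on simulated data; not the package's implementation.
set.seed(3)
n <- 900
prs.tertile <- factor(sample(c("Low", "Mid", "High"), n, replace = TRUE),
                      levels = c("Low", "Mid", "High"))
sex <- factor(sample(c("F", "M"), n, replace = TRUE))

# Assumed truth: higher tertiles have higher hazards.
log.hr     <- c(Low = 0, Mid = 0.3, High = 0.6)[as.character(prs.tertile)]
event.time <- rexp(n, rate = 0.05 * exp(log.hr))
cens.time  <- runif(n, 5, 30)
dat <- data.frame(
  age.death = pmin(event.time, cens.time),       # stand-in for the time variable
  demented  = as.integer(event.time <= cens.time),
  prs.tertile, sex
)

# Unadjusted: log-rank test across the three survival curves.
km.test <- survdiff(Surv(age.death, demented) ~ prs.tertile, data = dat)
km.p    <- pchisq(km.test$chisq, df = 2, lower.tail = FALSE)

# Adjusted: Cox models with and without the grouping factor, compared by LRT.
cox.reduced <- coxph(Surv(age.death, demented) ~ sex, data = dat)
cox.full    <- coxph(Surv(age.death, demented) ~ prs.tertile + sex, data = dat)
cox.lrt     <- unname(2 * (cox.full$loglik[2] - cox.reduced$loglik[2]))
cox.p       <- pchisq(cox.lrt, df = 2, lower.tail = FALSE)
```

Both results are omnibus tests with two degrees of freedom, so neither has a single effect estimate to report.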

Contrasts

Use contrasts when an omnibus group test is not enough and the question is:

Which levels differ?

contrasts <- associate$compare(
  data,
  outcome = demented,
  time = age.death,
  by = prs.tertile,
  covariates = c(sex, PCA),
  model = "cox",
  type = "contrast",
  contrast = "pairwise",
  p.adjust.method = "BH"
)

Expected rows include comparisons such as:

Mid - Low;
High - Low;
High - Mid.

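Pairwise rows should carry adjusted p-values (`adj.p.value`) because multiple contrasts are being tested. In the call above, p.adjust.method = "BH" requests the Benjamini-Hochberg adjustment.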
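As a hedged sketch of how such rows could be derived, the example below fits a Cox model on simulated data, builds the three pairwise contrasts from the coefficients and their covariance matrix, and adjusts the Wald p-values with `p.adjust()`. The simulation, the contrast matrix `K`, and the Wald approach are assumptions for illustration; the package may compute contrasts differently.

```r
library(survival)

# Illustrative sketch on simulated data; not the package's implementation.
set.seed(4)
n <- 900
prs.tertile <- factor(sample(c("Low", "Mid", "High"), n, replace = TRUE),
                      levels = c("Low", "Mid", "High"))
sex        <- factor(sample(c("F", "M"), n, replace = TRUE))
log.hr     <- c(Low = 0, Mid = 0.3, High = 0.6)[as.character(prs.tertile)]
event.time <- rexp(n, rate = 0.05 * exp(log.hr))
cens.time  <- runif(n, 5, 30)
dat <- data.frame(age.death = pmin(event.time, cens.time),
                  demented  = as.integer(event.time <= cens.time),
                  prs.tertile, sex)

fit <- coxph(Surv(age.death, demented) ~ prs.tertile + sex, data = dat)
b   <- coef(fit)   # order: prs.tertileMid, prs.tertileHigh, sexM
V   <- vcov(fit)

# Each row is one contrast over the coefficients; Low is the reference level.
K <- rbind("Mid - Low"  = c( 1, 0, 0),
           "High - Low" = c( 0, 1, 0),
           "High - Mid" = c(-1, 1, 0))
colnames(K) <- names(b)

estimate <- drop(K %*% b)                  # log hazard ratio per contrast
se       <- sqrt(diag(K %*% V %*% t(K)))
z        <- estimate / se
p.value  <- 2 * pnorm(-abs(z))

contrast.rows <- data.frame(contrast = rownames(K), estimate, se,
                            statistic = z, p.value,
                            adj.p.value = p.adjust(p.value, method = "BH"))
print(contrast.rows)
```

The estimates are on the log-hazard scale, so exponentiating them gives hazard ratios for each pair of levels.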

Nested Model Comparisons

Use nested comparisons when the question is:

Does adding X improve the model beyond covariates?

nested <- associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  covariates = c(age, sex, PCA),
  type = "nested"
)

This compares a reduced model:

demented ~ age + sex + PCA

against a full model:

demented ~ PRS + age + sex + PCA

This is an inferential model comparison. Predictive added value is related but belongs to evaluate$incremental() or evaluate$compare().
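The reduced-versus-full comparison can be sketched in base R with a likelihood-ratio test. The data are simulated, `PC1` is a hypothetical stand-in for the PCA covariates, and the use of `anova()` on two `glm()` fits is an assumption about an equivalent analysis rather than a description of the package internals.

```r
# Illustrative sketch on simulated data; not the package's implementation.
set.seed(2)
n   <- 1500
age <- rnorm(n, mean = 70, sd = 8)
sex <- factor(sample(c("F", "M"), n, replace = TRUE))
PC1 <- rnorm(n)   # hypothetical stand-in for the PCA covariates
PRS <- rnorm(n)

# Assumed truth: PRS has a log-odds effect of 0.4.
demented <- rbinom(n, 1, plogis(-1 + 0.4 * PRS + 0.03 * (age - 70)))
dat      <- data.frame(demented, PRS, age, sex, PC1)

# Reduced model: covariates only. Full model: covariates plus PRS.
reduced <- glm(demented ~ age + sex + PC1, family = binomial, data = dat)
full    <- update(reduced, . ~ . + PRS)

# Likelihood-ratio test: does adding PRS improve fit beyond the covariates?
lrt       <- anova(reduced, full, test = "Chisq")
statistic <- lrt$Deviance[2]
p.value   <- lrt[2, "Pr(>Chi)"]
print(lrt)
```

The test says whether the PRS term improves model fit. It does not quantify how much better the full model predicts.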

Relationship To `split.by`

split.by can be used to repeat a comparison across another grouping variable:

cmp <- associate$compare(
  data,
  outcome = demented,
  predictors = PRS,
  by = sex,
  split.by = cohort,
  type = "heterogeneity"
)

This means:

In each cohort, test whether the PRS association differs by sex.

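It does not mean that split.by itself defines the formal comparison. split.by only repeats the analysis within each of its levels; by still defines what is being compared.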
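The distinction can be sketched in base R: the data are split by cohort, and the sex heterogeneity test is run separately inside each piece. The simulated data, the cohort labels, and the `glm()`/`anova()` test are assumptions for illustration, not the package's implementation.

```r
# Illustrative sketch on simulated data; not the package's implementation.
set.seed(5)
n   <- 3000
dat <- data.frame(
  cohort = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  sex    = factor(sample(c("F", "M"), n, replace = TRUE)),
  PRS    = rnorm(n)
)
# Assumed truth: the PRS slope differs by sex in every cohort.
slope        <- ifelse(dat$sex == "M", 0.7, 0.3)
dat$demented <- rbinom(n, 1, plogis(-1 + slope * dat$PRS))

# split.by role: one analysis per cohort. by role: the PRS:sex test inside each.
het.by.cohort <- do.call(rbind, lapply(split(dat, dat$cohort), function(d) {
  fit.main <- glm(demented ~ PRS + sex, family = binomial, data = d)
  fit.int  <- glm(demented ~ PRS * sex, family = binomial, data = d)
  lrt      <- anova(fit.main, fit.int, test = "Chisq")
  data.frame(cohort    = as.character(d$cohort[1]),
             n         = nrow(d),
             statistic = lrt$Deviance[2],
             p.value   = lrt[2, "Pr(>Chi)"])
}))
print(het.by.cohort)
```

Each row is a within-cohort test of sex heterogeneity. Nothing in this output tests whether the cohorts differ from one another.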

Returned Object

Comparison results should return a PolyGeniusAssociations object, just like other association workflows. Omnibus comparisons may not have a single meaningful estimate; in those rows, the statistic and p-value are the primary inferential output.

comparison: Model Contrast

Column	Meaning
`comparison.type`	Planned comparison type: `heterogeneity`, `group`, `contrast`, or `nested`.
`outcome`, `predictor`, `by`	Outcome, focal predictor where applicable, and comparison-defining variable.
`contrast`	Pairwise or named contrast label where applicable.
`family` / `model`	Model family used for the comparison.
`estimate`, `se`, `lower`, `upper`	Effect estimate and uncertainty where the comparison has a coefficient-like scale.
`statistic`, `p.value`, `adj.p.value`	Primary comparison test result.
`n`, `n.events`, `n.competing`	Sample-size and event counts where relevant.
`fit.id`	Artifact/diagnostic key.

Added Artifacts

Comparison artifacts depend on the comparison type. Survival group comparisons reuse the survival-family artifacts (curves, risk.table, group.summary). Contrast and nested-model comparisons may attach model diagnostics or contrast tables when those details do not belong in the one-row summary table.

km: Survival Group Comparison

Column	Meaning
`family`	`"km"`.
`outcome`, `time`, `predictor`	Event indicator, follow-up time, and grouped predictor.
`term`, `term.type`	Grouped comparison row with `term.type = "omnibus"`.
`estimate`, `se`, `lower`, `upper`	Usually `NA`; the log-rank statistic is the inferential result.
`statistic`, `p.value`, `adj.p.value`	Log-rank chi-square statistic and p-values.
`n`, `n.events`	Analysis sample size and number of events.
`effect.scale`	`"median.time"` for group summaries and survival plotting.
`formula`, `fit.id`	Resolved Kaplan-Meier formula and artifact/diagnostic key.

Added Artifacts

Artifact	Meaning
`curves`	Survival curve steps for each group level.
`risk.table`	Numbers at risk, events, and censoring over time by group.
`group.summary`	Group-level records, events, median survival, and confidence intervals.
`profile.table`	Analysis-ready observations with time, event, and group level.

cox: Adjusted Survival Group Comparison

Column	Meaning
`family`	`"cox"`.
`outcome`, `time`, `predictor`	Event indicator, follow-up time, and tested predictor as represented in `formula`.
`term`, `term.type`	Cox coefficient row, usually `main`.
`estimate`	Cox coefficient on the log-hazard scale.
`se`, `lower`, `upper`	Standard error and confidence interval on the log-hazard scale.
`statistic`, `p.value`, `adj.p.value`	Wald/z statistic and p-values.
`n`, `n.events`	Analysis sample size and number of events.
`effect.scale`	`"log.hazard"`; forest plots display hazard ratios.
`formula`, `fit.id`	Resolved survival formula and artifact/diagnostic key.

Added Artifacts

Artifact	Meaning
`curves`	Analysis-level survival curve for the fitted Cox analysis.
`risk.table`	Numbers at risk, events, and censoring over time.
`group.summary`	Analysis-level records, events, and median survival summaries where available.
`profile.table`	Analysis-ready observations with time, event, predictor, and linear predictor.

Plotting

Comparison rows can be shown as:

forest plots for contrast-like rows;
heatmaps for many outcomes or predictors;
survival curves for KM and survival group comparisons;
summary tables for omnibus heterogeneity or nested-model tests.

For survival group comparisons, curve and risk-table artifacts are stored on the returned object and read by:

visualize$associations$survival(tertile.km)

Comparison Versus Prediction

Use associate$compare() when the estimand is about association effects, groups, or model terms.

Use evaluate$compare() when the estimand is predictive performance:

evaluate$compare(
  models = c(model_a, model_b),
  on = data,
  outcome = demented
)

That returns delta-style performance metrics such as differences in AUC, RMSE, or C-index depending on outcome type.
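For contrast with the inferential examples, a delta-style metric can be sketched in base R as a difference in held-out AUC between two models. Everything here is an assumption for illustration: the simulated scores `prs.a` and `prs.b`, the `auc()` helper, and the train/holdout split are hypothetical and do not describe how evaluate$compare() is implemented.

```r
# Illustrative sketch on simulated data; not the package's implementation.

# Rank-based AUC: probability that a random case scores above a random control.
auc <- function(score, y) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

set.seed(6)
n   <- 2000
g   <- rnorm(n)                         # assumed true genetic liability
dat <- data.frame(
  prs.a    = g + rnorm(n, sd = 2),      # noisier score
  prs.b    = g + rnorm(n, sd = 1),      # less noisy score
  demented = rbinom(n, 1, plogis(-1 + g))
)
train   <- dat[1:1000, ]
holdout <- dat[1001:2000, ]

model.a <- glm(demented ~ prs.a, family = binomial, data = train)
model.b <- glm(demented ~ prs.b, family = binomial, data = train)

# Predictive performance is measured on data not used for fitting.
auc.a     <- auc(predict(model.a, newdata = holdout), holdout$demented)
auc.b     <- auc(predict(model.b, newdata = holdout), holdout$demented)
delta.auc <- auc.b - auc.a
```

The estimand here is a difference in discrimination between models, not a coefficient or a test of a model term.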
