GenotypeInfoSet-class

GenotypeInfoSet R6 class generator

Description

GenotypeInfoSet.class is an R6 generator for sets of GenotypeInfo objects. It stores the objects, exposes light helpers that fan out calls across all contained genotypes, and integrates with the S3 [[, c(), length(), and names() methods defined below.

Details

Construction: Use GenotypeInfoSet(…) which delegates to GenotypeInfoSet.class$new(…).

Initialization contract:

  • Pass one or more GenotypeInfo or GenotypeInfoSet objects.

  • Objects are stored as a flat set; genotype names must be unique.

Fan-out helpers: Public methods such as ⁠$tidy()⁠, ⁠$lift()⁠, ⁠$variants.filterMAF()⁠, etc., call the identically named method on each contained genotype and return a named list of results (names match the set).
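The fan-out pattern described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the toy genotypes are plain lists, n_variants() is a hypothetical stand-in for a real GenotypeInfo method, and fan_out() is a hypothetical helper.

```r
# Toy stand-ins for GenotypeInfo objects (hypothetical; the real objects are R6 instances).
make_toy_genotype <- function(n) {
  list(n_variants = function() n)
}

genotypes <- list(G1 = make_toy_genotype(10L), G2 = make_toy_genotype(25L))

# Call the same-named method on every element; lapply() keeps the list names,
# so the result is a named list keyed by genotype name.
fan_out <- function(objects, method, ...) {
  lapply(objects, function(obj) obj[[method]](...))
}

res <- fan_out(genotypes, "n_variants")
str(res)
```

Because lapply() preserves names, the result can be indexed by genotype name (for example res$G1), which matches the "names match the set" behaviour described above.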

Selection & concatenation: The S3 methods [[.GenotypeInfoSet, c.GenotypeInfoSet, length.GenotypeInfoSet, and names.GenotypeInfoSet provide a simple interface for working with a GenotypeInfoSet.
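To show how S3 methods of this kind can sit on top of a container, here is a minimal base R sketch. ToySet, new_toy_set(), and items_of() are hypothetical names invented for illustration; the real methods operate on the R6 object and may behave differently (for example in how [[ treats numeric indices).

```r
# Hypothetical toy container: a named list wrapped in an S3 class.
new_toy_set <- function(items) {
  stopifnot(is.list(items), !is.null(names(items)), !anyDuplicated(names(items)))
  structure(list(items = items), class = "ToySet")
}

# Access the underlying list without triggering S3 dispatch.
items_of <- function(x) unclass(x)$items

length.ToySet <- function(x) length(items_of(x))
names.ToySet <- function(x) names(items_of(x))
`[[.ToySet` <- function(x, i) items_of(x)[[i]]

# Concatenate several sets into one flat set; duplicate names are rejected
# by the constructor.
c.ToySet <- function(...) {
  parts <- unname(lapply(list(...), items_of))
  new_toy_set(do.call(c, parts))
}

a <- new_toy_set(list(G1 = "first"))
b <- new_toy_set(list(G2 = "second"))
ab <- c(a, b)

length(ab)
names(ab)
ab[["G2"]]
```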

Active bindings

  • names — character vector of genotype names

  • genotypes — list of GenotypeInfo objects (read-only)

Active bindings

names

Read-only character vector of genotype names.

genotypes

Read-only deep-cloned list of GenotypeInfo objects.

samples

Working sample IDs per genotype.

samples.full

Full (fileset) sample IDs per genotype.

n.samples

Size of the working sample set per genotype.

n.samples.full

Total number of fileset samples per genotype.

Methods

Public methods


Method new()

Create a new GenotypeInfoSet.

The remaining notes in this entry, including its examples, describe the samples active binding, which gets or sets the working sample IDs for each genotype in the set.

  • Getting samples (gis$samples): returns a named list where each element is the character vector of working sample IDs for the corresponding genotype.

  • Setting samples (gis$samples <- value):

    • If value is NULL, all genotypes are reset by passing NULL to each genotype's $samples.

    • Otherwise, value must be a named list whose names exactly match gis$names, with each element either NULL or an atomic character vector.
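The setter contract for samples described above can be sketched as a small validator. This is an illustrative base R sketch, not the package's code: validate_samples_value() is a hypothetical helper, and it assumes that "exactly match" means the same set of names regardless of order.

```r
# Hypothetical validator mirroring the documented contract for `gis$samples <- value`.
validate_samples_value <- function(value, set_names) {
  # NULL is allowed and means "reset every genotype".
  if (is.null(value)) return(invisible(TRUE))

  if (!is.list(value) || is.null(names(value))) {
    stop("`value` must be NULL or a named list")
  }
  # Assumption: order is ignored, but every genotype name must appear exactly once.
  if (anyDuplicated(names(value)) || !setequal(names(value), set_names)) {
    stop("names(value) must match the genotype names exactly")
  }
  # Each element is either NULL (reset that genotype) or a character vector.
  ok <- vapply(value, function(v) is.null(v) || is.character(v), logical(1))
  if (!all(ok)) stop("each element must be NULL or a character vector")

  invisible(TRUE)
}

set_names <- c("G1", "G2")
validate_samples_value(list(G1 = c("S01", "S02"), G2 = NULL), set_names)

# A list that omits G2 is rejected.
bad <- tryCatch(
  validate_samples_value(list(G1 = "S01"), set_names),
  error = function(e) conditionMessage(e)
)
```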

Usage
GenotypeInfoSet.class$new(...)
Arguments
...

One or more named GenotypeInfo objects. Names are assumed valid and are only checked for uniqueness.

Examples
## Not run: 
# Retrieve working IDs for each genotype (named list)
ids_by_geno <- gis$samples

# Set per-genotype subsets
gis$samples <- list(G1 = c("S01","S02"), G2 = c("A10","A11"))

# Reset all genotypes to the IDs present in their filesets
gis$samples <- NULL

## End(Not run)

Method print()

Print a concise summary of this GenotypeInfoSet object.

Usage
GenotypeInfoSet.class$print(...)
Arguments
...

Ignored.

Returns

Invisibly returns self.


Method dosages()

Extract dosages for given variants across all genotypes.

Usage
GenotypeInfoSet.class$dosages(
  variants,
  format = c("sample", "variant"),
  samples = NULL,
  load = FALSE,
  merge = FALSE,
  .message = NULL,
  .return.split.by.genotype = TRUE,
  logger = NULL
)
Arguments
variants

Vector / data.frame / bed1-like specification of variants to extract.

format

Character; “sample” or “variant” orientation of the output table.

samples

Optional character vector of sample IDs to include (defaults to each object's $samples).

load

Logical; if TRUE, load exported tables instead of returning file paths.

merge

Logical; with load=TRUE, return a single combined table.

.message

Optional message to display during processing.

.return.split.by.genotype

Logical; whether to return results split by genotype.

logger

Optional logger object propagated to each underlying GenotypeInfo call.
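The format argument uses the common R idiom of a default choice vector. The sketch below shows how such a default typically resolves with match.arg() and what the two orientations could look like. It rests on two assumptions that the documentation does not confirm: that the method resolves format with match.arg(), and that "sample" means one row per sample. orient_dosages() is a hypothetical helper.

```r
# Hypothetical helper: orient a toy dosage matrix according to `format`.
orient_dosages <- function(dosages, format = c("sample", "variant")) {
  format <- match.arg(format)  # with no explicit value, the first choice is used
  if (format == "sample") dosages else t(dosages)
}

# Toy data: 2 samples x 3 variants, dosages in 0..2.
d <- matrix(c(0, 1, 2,
              2, 1, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("S01", "S02"), c("rs1", "rs2", "rs3")))

by_sample  <- orient_dosages(d)             # rows are samples
by_variant <- orient_dosages(d, "variant")  # rows are variants
```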

Returns

If load=FALSE: vector of file paths. If load=TRUE: list or merged data.frame.


Method scores()

Compute PRS scores across all genotypes.

Usage
GenotypeInfoSet.class$scores(
  models,
  samples = NULL,
  maf.threshold = 0,
  dosages.fallback = NULL,
  .message = NULL,
  .return.split.by.genotype = TRUE,
  logger = NULL
)
Arguments
models

One or more PRS models (PolyGeniusModel, PolyGeniusModelSet, data.frame, or list).

samples

Optional character vector of sample IDs to include (defaults to each object's $samples).

maf.threshold

Minor allele frequency threshold for filtering variants.

dosages.fallback

Optional fallback dosages for missing variants.

.message

Optional message to display during processing.

.return.split.by.genotype

Logical; whether to return results split by genotype.

logger

Optional logger object propagated to each underlying GenotypeInfo call.
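As a rough illustration of how models and maf.threshold could combine into a samples × models score matrix, the sketch below computes a weighted sum of dosages in base R. It is conceptual only: the data are made up, the MAF estimate and the ">=" comparison are assumptions, and the package's actual computation (including the variant.fate table and dosages.fallback handling) is not reproduced here.

```r
# Toy dosage matrix: 3 samples x 3 variants (alternate-allele counts 0..2).
dosages <- matrix(c(0, 1, 1,
                    1, 1, 0,
                    2, 0, 0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("S01", "S02", "S03"), c("rs1", "rs2", "rs3")))

# Toy weights: 3 variants x 2 models (hypothetical effect sizes).
weights <- matrix(c( 0.5, 0.0,
                    -0.2, 1.0,
                     0.1, 0.3),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("rs1", "rs2", "rs3"), c("modelA", "modelB")))

# Minor allele frequency per variant, estimated from the toy dosages.
af  <- colMeans(dosages) / 2
maf <- pmin(af, 1 - af)

# Assumption: variants with MAF below the threshold are dropped.
maf.threshold <- 0.2
keep <- colnames(dosages)[maf >= maf.threshold]

# Weighted sum of dosages per sample and model: a samples x models matrix.
scores <- dosages[, keep, drop = FALSE] %*% weights[keep, , drop = FALSE]
scores
```

In this toy data rs3 has an estimated MAF of 1/6 and is excluded at a threshold of 0.2; with the default maf.threshold = 0 every variant would be kept.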

Returns

List with scores (matrix samples × models) and variant.fate table.


Method samples.kinship()

Compute the KING kinship matrix among samples across all genotypes.

Usage
GenotypeInfoSet.class$samples.kinship(
  ...,
  samples = NULL,
  .message = NULL,
  .return.split.by.genotype = TRUE,
  logger = NULL
)
Arguments
...

Additional arguments passed to GenotypeInfo$samples.kinship (e.g., variants).

samples

Optional character vector of sample IDs to include (defaults to each object's $samples).

.message

Optional message to display during processing.

.return.split.by.genotype

Logical; whether to return results split by genotype.

logger

Optional logger object propagated to each underlying GenotypeInfo call.

Returns

Symmetric numeric matrix (samples × samples) or list of matrices.


Method samples.PCA()

Compute PCA on sample dosages across all genotypes (PLINK2 --pca).

Usage
GenotypeInfoSet.class$samples.PCA(
  ...,
  samples = NULL,
  .message = NULL,
  .return.split.by.genotype = TRUE,
  logger = NULL
)
Arguments
...

Additional arguments passed to GenotypeInfo$samples.PCA (e.g., npcs, variants, approx).

samples

Optional character vector of sample IDs to include (defaults to each object's $samples).

.message

Optional message to display during processing.

.return.split.by.genotype

Logical; whether to return results split by genotype.

logger

Optional logger object propagated to each underlying GenotypeInfo call.

Returns

List with embedding (matrix samples × PCs) and eigenvals, or list of such.


Method samples.PCAProject()

Project samples onto PCA space across all genotypes (PLINK2 --score).

Usage
GenotypeInfoSet.class$samples.PCAProject(
  ...,
  samples = NULL,
  .message = NULL,
  .return.split.by.genotype = TRUE,
  logger = NULL
)
Arguments
...

Additional arguments passed to GenotypeInfo$samples.PCAProject (e.g., project.onto, npcs, variants, approx).

samples

Optional character vector of sample IDs to include (defaults to each object's $samples).

.message

Optional message to display during processing.

.return.split.by.genotype

Logical; whether to return results split by genotype.

logger

Optional logger object propagated to each underlying GenotypeInfo call.

Returns

List with embedding (matrix samples × PCs); may include variants, or list of such.
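The idea behind projecting samples onto an existing PCA space can be illustrated with base R's prcomp(). This is a conceptual sketch on simulated data, not what the method runs: the package delegates to PLINK2, whose standardisation and scaling differ, and the names ref and target are made up for the example.

```r
set.seed(1)

# Simulated dosage matrices (samples x variants); purely illustrative.
variants <- paste0("rs", 1:5)
ref <- matrix(rbinom(20 * 5, size = 2, prob = 0.3), nrow = 20,
              dimnames = list(paste0("R", 1:20), variants))
target <- matrix(rbinom(4 * 5, size = 2, prob = 0.3), nrow = 4,
                 dimnames = list(paste0("N", 1:4), variants))

npcs <- 2

# PCA on the reference samples: an embedding plus eigenvalues.
pca <- prcomp(ref, center = TRUE, scale. = FALSE)
embedding_ref <- pca$x[, seq_len(npcs), drop = FALSE]
eigenvals <- pca$sdev[seq_len(npcs)]^2

# Projection: centre the target with the reference means, then multiply
# by the reference loadings. predict() does exactly this for prcomp objects.
embedding_target <- predict(pca, newdata = target)[, seq_len(npcs), drop = FALSE]
manual <- sweep(target, 2, pca$center) %*% pca$rotation[, seq_len(npcs), drop = FALSE]
```

The key design point is that the target samples never influence the loadings: they are centred and rotated using quantities estimated from the reference set only.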


Method clone()

The objects of this class are cloneable with this method.

Usage
GenotypeInfoSet.class$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples


## ------------------------------------------------
## Method `GenotypeInfoSet.class$new`
## ------------------------------------------------

## Not run: 
# Retrieve working IDs for each genotype (named list)
ids_by_geno <- gis$samples

# Set per-genotype subsets
gis$samples <- list(G1 = c("S01","S02"), G2 = c("A10","A11"))

# Reset all genotypes to the IDs present in their filesets
gis$samples <- NULL

## End(Not run)