as.bigsnpr
Convert GenotypeInfo to bigsnpr format
Description
Converts a GenotypeInfo object to bigsnpr’s file-backed big.matrix format (.rds + .bk files). This enables use of bigsnpr-based algorithms such as LDpred2 and lassosum2 while maintaining compatibility with PolyGenius workflows.
Usage
as.bigsnpr(
  genotype.info,
  output = NULL,
  output.name = NULL,
  ind.row = NULL,
  ind.col = NULL,
  ncores = 1,
  nthreads = NULL,
  logger = NULL
)
Arguments
genotype.info
A GenotypeInfo object.
output.name
Character; base name for output files (without extension). Default uses
ind.row
Integer vector or NULL
ind.col
Integer vector or NULL
ncores
Integer; number of cores for parallel processing during the bigsnpr conversion step. Default is 1 (no parallelization).
nthreads
Integer or NULL
logger
Optional logger object. If
Details
The conversion process:
- Format conversion: If not already in "bfile" format, uses $tidy() to convert to PLINK .bed/.bim/.fam format.
- File merging: If multiple genotype files exist, uses $merge() to combine them into a single fileset.
- Sample subsetting: Respects the $samples field of the input GenotypeInfo object, converting only the specified working sample set.
- bigsnpr conversion: Calls bigsnpr's snp_readBed2() to create the file-backed matrix representation.
- Variant IDs: Uses PolyGenius standardized variant IDs (chr:pos:a0:a1 in lexicographic allele order) as marker.ID, ensuring consistent matching with other PolyGenius operations like clumping.
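The variant ID step can be illustrated with a minimal base R sketch. It assumes that "lexicographic allele order" means the two alleles are sorted alphabetically before being joined; make_variant_id() is a hypothetical helper written for this illustration and is not part of PolyGenius or bigsnpr.

```r
# Hypothetical helper: build chr:pos:a0:a1 style IDs with the two alleles
# placed in alphabetical order (an assumption about the ID convention).
make_variant_id <- function(chr, pos, a0, a1) {
  pos <- as.integer(pos)                 # avoid scientific notation in the ID
  first  <- ifelse(a0 <= a1, a0, a1)     # alphabetically smaller allele
  second <- ifelse(a0 <= a1, a1, a0)     # alphabetically larger allele
  paste(chr, pos, first, second, sep = ":")
}

ids <- make_variant_id(
  chr = c("1", "1", "2"),
  pos = c(1000L, 2000L, 100000L),
  a0  = c("T", "A", "G"),
  a1  = c("A", "C", "C")
)
print(ids)
```

Because the alleles are ordered before pasting, the same variant gets the same ID regardless of which allele a given file lists first, which is what makes ID-based matching across files possible.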
The resulting bigsnpr object includes:
- File-backed genotype matrix (memory-mapped for efficiency)
- Variant map with chromosome, position, and alleles
- Sample/family information
- Standardized variant identifiers
Note on dependencies: This function requires the bigsnpr package to be installed but does not load it as a hard dependency. Install with install.packages("bigsnpr").
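A soft dependency of this kind is commonly checked at call time with requireNamespace(). The sketch below shows that general pattern only. check_soft_dependency() is a hypothetical helper, and whether as.bigsnpr() performs its check this way is an assumption.

```r
# Hypothetical helper: fail with an actionable message when an optional
# package is not installed, without attaching it to the search path.
check_soft_dependency <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      sprintf("Package '%s' is required. Install it with install.packages(\"%s\").", pkg, pkg),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this check passes silently.
ok <- check_soft_dependency("stats")

# A missing package produces an error that can be caught and inspected.
res <- tryCatch(
  check_soft_dependency("notARealPackage12345"),
  error = function(e) conditionMessage(e)
)
print(res)
```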
Value
A named list with the following components:
- obj: The bigSNP object returned by bigsnpr::snp_attach().
- rds: Character; path to the .rds file.
- map: Data frame with variant information: chr (character), pos (integer), a0 (reference allele), a1 (effect allele), marker.id (standardized PolyGenius ID).
- fam: Data frame with sample information from the bigSNP object.
- genotypes: File-backed matrix (FBM) object; direct accessor for the genotype data. Dimensions are samples × variants.
- build: Character; genome build from the input GenotypeInfo.
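Since genotypes is laid out as samples × variants, its rows should line up with fam and its columns with map. The following sketch checks that on a mock list with the component names listed above. check_conversion_result() is a hypothetical helper, the sample.ID column is assumed for illustration, and a plain matrix stands in for the FBM so the example runs without bigsnpr.

```r
# Hypothetical helper: verify that a result list has the documented
# components and that dimensions agree (samples x variants).
check_conversion_result <- function(res) {
  expected <- c("obj", "rds", "map", "fam", "genotypes", "build")
  absent <- setdiff(expected, names(res))
  if (length(absent) > 0) {
    stop("Missing components: ", paste(absent, collapse = ", "), call. = FALSE)
  }
  dims <- dim(res$genotypes)
  if (dims[1] != nrow(res$fam)) {
    stop("Rows of 'genotypes' do not match rows of 'fam'.", call. = FALSE)
  }
  if (dims[2] != nrow(res$map)) {
    stop("Columns of 'genotypes' do not match rows of 'map'.", call. = FALSE)
  }
  invisible(TRUE)
}

# Mock result: 2 samples, 3 variants (a plain matrix replaces the FBM here).
mock <- list(
  obj = list(),
  rds = "mock.rds",
  map = data.frame(
    chr = c("1", "1", "2"),
    pos = c(100L, 200L, 300L),
    a0 = c("A", "C", "G"),
    a1 = c("G", "T", "T"),
    marker.id = c("1:100:A:G", "1:200:C:T", "2:300:G:T"),
    stringsAsFactors = FALSE
  ),
  fam = data.frame(sample.ID = c("s1", "s2"), stringsAsFactors = FALSE),
  genotypes = matrix(0L, nrow = 2, ncol = 3),
  build = "GRCh37"
)

valid <- check_conversion_result(mock)
```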
See Also
GenotypeInfo, LDpred2Algorithm, bigsnpr::snp_readBed2()
Examples
## Not run:
# Convert a reference panel to bigsnpr format
ref.panel <- referencePanels$get("EUR", "GRCh37")
bigsnp.obj <- as.bigsnpr(ref.panel)

# Access the genotype matrix
G <- bigsnp.obj$genotypes
dim(G) # samples × variants

# Use with bigsnpr functions
map <- bigsnp.obj$map
corr <- bigsnpr::snp_cor(
  G,
  infos.pos = map$pos,
  size = 3000,
  ncores = 4
)

# Convert with specific output location
bigsnp.perm <- as.bigsnpr(
  ref.panel,
  output = "/path/to/permanent/storage",
  output.name = "EUR_ref_GRCh37"
)
## End(Not run)