Computes the PCA embedding of the cells, which can then be used for tSNE or UMAP plotting.

create_embedding(
  path = NULL,
  hic_df = NULL,
  chrs = paste0("chr", c(1:19, "X")),
  dim_pca = 50,
  do_harmony = FALSE,
  batch = NULL
)

Arguments

path

Path to a directory containing all the normalized cells. Sub-directories are allowed. With path, the bandnorm-normalized data are loaded iteratively from the directory, so it is slower than using hic_df. In exchange, memory usage stays low, and the speed remains acceptable.

hic_df

If you kept the data frame returned by bandnorm, you can pass it here instead of path as the input of create_embedding. Using hic_df is faster, but it costs more memory.

chrs

Chromosomes used in the embedding. Default is chr1, ..., chr19, and chrX.
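
As a quick illustration, the default expression from the signature expands to 20 chromosome names. The second vector is a hypothetical alternative for a genome with 22 autosomes (such as human); it is an assumption for illustration and not a setting taken from this page.

```r
# Default value of `chrs`, exactly as written in the function signature.
chrs_default <- paste0("chr", c(1:19, "X"))
print(chrs_default)

# Hypothetical alternative for a genome with 22 autosomes (e.g. human).
chrs_alt <- paste0("chr", c(1:22, "X"))
print(length(chrs_alt))
```

Either vector would be passed through the chrs argument.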

dim_pca

Number of PCA dimensions to return. Default is 50.
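
To show what a dim_pca-dimensional embedding looks like, here is a minimal sketch using base R's prcomp on random toy data. It is not the package's internal implementation; the matrix x and its dimensions are made up for illustration.

```r
# Toy stand-in for a cell-by-feature matrix (100 cells, 200 features).
# Random values, for illustration only.
set.seed(1)
n_cells <- 100
n_features <- 200
x <- matrix(rnorm(n_cells * n_features), nrow = n_cells)

# Keep only the first `dim_pca` principal components.
dim_pca <- 50
pca <- prcomp(x, center = TRUE, scale. = FALSE, rank. = dim_pca)
embedding <- pca$x

# One row per cell, one column per retained component.
print(dim(embedding))
```

A matrix of this shape, one row per cell, is the kind of input that tSNE or UMAP functions typically take.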

do_harmony

Whether to use Harmony to remove the batch effect from the embedding. Default is FALSE.

batch

The batch information used for Harmony to remove the batch effect. Required if do_harmony is TRUE.

Examples

data("hic_df")
data("batch")
bandnorm_result = bandnorm(hic_df = hic_df, save = FALSE)
embedding = create_embedding(hic_df = bandnorm_result, do_harmony = TRUE, batch = batch)
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: NAs introduced by coercion to integer range
#> Error in validObject(r): invalid class "dgTMatrix" object: Negative value in Dim