This function implements two rephasing algorithms. The first, mindist, implements the dynamic programming algorithm for rephasing haplotype-specific copy number first described in CHISEL; its objective is to find the phase that minimizes the number of copy number events. The second, LOH, finds cells with whole-chromosome losses, assumes that each loss arose from a single event, and rephases all bins relative to it.
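To make the mindist objective concrete, here is a minimal toy sketch of a two-state dynamic program for a single cell. It is not the package's implementation or the CHISEL algorithm itself: the function name `rephase_mindist_toy`, its vector inputs `a` and `b`, and the transition cost (the number of haplotypes whose copy number changes between adjacent bins) are all assumptions made for illustration.

```r
# Toy sketch (hypothetical helper, not part of the package): for each bin,
# decide whether to swap the A and B copy numbers so that the total number
# of copy number changes between adjacent bins is as small as possible.
rephase_mindist_toy <- function(a, b) {
  n <- length(a)
  stopifnot(n == length(b), n >= 1)
  # Column 1 = keep the bin as is, column 2 = swap A and B in that bin.
  A <- cbind(a, b)
  B <- cbind(b, a)
  cost <- matrix(0, nrow = n, ncol = 2)   # best cumulative cost per state
  back <- matrix(1L, nrow = n, ncol = 2)  # best previous state per state
  for (i in seq_len(n)[-1]) {
    for (s in 1:2) {
      # Assumed transition cost: how many haplotypes change copy number.
      step <- (A[i, s] != A[i - 1, ]) + (B[i, s] != B[i - 1, ])
      total <- cost[i - 1, ] + step
      back[i, s] <- which.min(total)
      cost[i, s] <- min(total)
    }
  }
  # Trace back the cheapest path; ties are resolved in favour of "keep".
  state <- integer(n)
  state[n] <- which.min(cost[n, ])
  for (i in rev(seq_len(n))[-1]) {
    state[i] <- back[i + 1, state[i + 1]]
  }
  swap <- state == 2L
  list(A = ifelse(swap, b, a), B = ifelse(swap, a, b), swapped = swap)
}

# The third bin is phased the other way round from its neighbours.
res <- rephase_mindist_toy(a = c(2, 2, 0, 2), b = c(0, 0, 2, 0))
res$swapped
```

In this toy input only the third bin is flagged for switching, after which haplotype A is 2 and haplotype B is 0 in every bin, so no copy number events remain.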

rephasebins(
  cn,
  chromosomes = NULL,
  method = "mindist",
  whole_chr_cutoff = 0.9,
  ncells = 1,
  clusterfirst = FALSE,
  cl = NULL,
  seed = 42,
  max_iterations = 10
)

Arguments

cn

Either an hscn object from callHaplotypeSpecificCN or a dataframe with haplotype-specific copy number, i.e. the data slot of an hscn object

chromosomes

Vector specifying which chromosomes to phase; the default, NULL, phases all chromosomes

method

Rephasing method, either "mindist" or "LOH", default "mindist"

whole_chr_cutoff

Cutoff for whole chromosome LOH detection when method is LOH, default 0.9

ncells

default 1

clusterfirst

Whether to cluster cells and perform rephasing on clusters rather than on individual cells, default FALSE

cl

Precomputed clustering object from umap_clustering, default NULL

seed

Random seed for UMAP clustering reproducibility when clusterfirst = TRUE, default 42

max_iterations

Maximum number of convergence iterations, default 10

Value

Either a new hscn object or a dataframe with rephased bins depending on the input

Details

The algorithm iterates until convergence, that is, until no more bins need switching, for at most max_iterations iterations. Convergence ensures idempotency: running rephasebins twice on the same data will produce identical results.
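The convergence behaviour can be illustrated with a small generic sketch. Everything here is hypothetical and stands in for the package internals: `iterate_until_stable` is an assumed fixed-point loop, and `swap_one_bin` is a deliberately simple pass that switches at most one bin per call so that several passes are needed.

```r
# Hypothetical fixed-point loop: apply `step` until the data stop changing
# or the iteration budget is exhausted.
iterate_until_stable <- function(x, step, max_iterations = 10) {
  for (i in seq_len(max_iterations)) {
    new_x <- step(x)
    if (identical(new_x, x)) {
      return(list(value = x, iterations = i, converged = TRUE))
    }
    x <- new_x
  }
  list(value = x, iterations = max_iterations, converged = FALSE)
}

# Hypothetical single pass: switch the first bin where A is below B.
swap_one_bin <- function(d) {
  i <- which(d$A < d$B)[1]
  if (!is.na(i)) {
    tmp <- d$A[i]
    d$A[i] <- d$B[i]
    d$B[i] <- tmp
  }
  d
}

bins <- data.frame(A = c(2, 0, 0, 2), B = c(0, 2, 2, 0))
once <- iterate_until_stable(bins, swap_one_bin)
twice <- iterate_until_stable(once$value, swap_one_bin)

# A converged result is a fixed point, so a second run changes nothing.
identical(once$value, twice$value)
```

The `converged` flag matters for the guarantee: if the budget runs out before a pass leaves the data unchanged, a further run could still switch bins.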