combineBAFCN
Joins copy number bin data with phased haplotype counts to produce a combined data.frame with copy number states and B-allele frequency (BAF) for each bin.
combineBAFCN(
haplotypes,
CNbins,
filtern = 0,
phased_haplotypes = NULL,
minbins = 100,
minbinschr = 10,
phasing_method = "distribution",
...
)
`haplotypes`: A data.frame with haplotype allele counts. Required columns: `cell_id`, `chr`, `start`, `end`, `hap_label`, `allele_id`, `readcount` (raw) or `allele1`, `allele0`, `totalcounts` (formatted).
`CNbins`: A data.frame with copy number bin data. Required columns: `cell_id`, `chr`, `start`, `end`, `state`, `copy`.
`filtern`: Minimum total read count per bin to include. Default 0.
`phased_haplotypes`: Optional pre-computed phased haplotypes from `computehaplotypecounts()`. If NULL, phasing is performed automatically.
`minbins`: Minimum number of bins per cell to include. Default 100.
`minbinschr`: Minimum number of bins per chromosome per cell. Default 10.
`phasing_method`: Method for phasing haplotypes. Either "distribution" (default), which phases based on the distribution across all cells, or a method that uses the top N imbalanced cells.
`...`: Additional arguments passed to `format_haplotypes()`.
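To make the `filtern` and `minbins` thresholds concrete, here is a minimal base R sketch on toy data. It is not the package's implementation: the toy data.frame, the threshold values, the inclusive comparison for `filtern`, and the order of the two filters are all assumptions made for illustration. Only the rule that cells with fewer than `minbins` bins are removed is taken from the function's own log message.

```r
# Hypothetical toy data: three cells with differing numbers of bins
toy <- data.frame(
  cell_id = rep(c("cellA", "cellB", "cellC"), times = c(5, 2, 4)),
  totalcounts = c(10, 0, 7, 3, 9, 4, 6, 0, 0, 5, 8),
  stringsAsFactors = FALSE
)

# Illustrative thresholds, not the function defaults
filtern <- 1
minbins <- 3

# Keep bins with at least `filtern` reads (inclusive threshold is an assumption)
kept <- toy[toy$totalcounts >= filtern, ]

# Count the remaining bins per cell, then keep cells with at least `minbins` bins
bins_per_cell <- table(kept$cell_id)
good_cells <- names(bins_per_cell)[bins_per_cell >= minbins]
kept <- kept[kept$cell_id %in% good_cells, ]

unique(kept$cell_id)
```

With these toy values only `cellA` survives, because it is the only cell that retains at least three bins after the read count filter.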
A data.frame with columns from CNbins plus:
* `alleleA`: Read counts for the A allele
* `alleleB`: Read counts for the B allele
* `totalcounts`: Total read counts
* `BAF`: B-allele frequency (`alleleB / totalcounts`)
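The BAF definition above can be checked by hand on a small example. The sketch below uses a hypothetical toy data.frame rather than package data, and it assumes that `totalcounts` is the sum of `alleleA` and `alleleB`.

```r
# Hypothetical per-bin allele counts (not taken from the package data)
toy <- data.frame(
  chr = c("1", "1", "2"),
  start = c(1, 500001, 1),
  end = c(500000, 1000000, 500000),
  alleleA = c(30, 12, 0),
  alleleB = c(10, 12, 8),
  stringsAsFactors = FALSE
)

# Assumption: total counts are the sum of the two allele counts
toy$totalcounts <- toy$alleleA + toy$alleleB

# BAF as defined in the return value: alleleB / totalcounts
toy$BAF <- toy$alleleB / toy$totalcounts

toy$BAF
```

A balanced bin gives a BAF of 0.5, while values near 0 or 1 indicate that reads come mostly from one allele.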
data(CNbins)
data(haplotypes)
# Format haplotypes first
haps_formatted <- format_haplotypes_dlp(haplotypes, CNbins)
#> Number of distinct bins in copy number data: 4375
#> Number of distinct bins in haplotype data: 5728
#> Number of distinct bins in formatted haplotype data: 4360
# Combine with CNbins
cnbaf <- combineBAFCN(haps_formatted, CNbins)
#> Finding overlapping cell IDs between CN data and haplotype data...
#> Total number of cells in both CN and haplotypes: 250
#> Number of cells in CN data: 250
#> Number of cells in haplotype data: 250
#> Joining bins and haplotypes...
#> Phase haplotypes...
#> Phasing based on distribution across all cells
#> Join phased haplotypes...
#> Reorder haplotypes based on phase...
#> Total number of cells after removing cells with < 100 bins: 250