When UMIs are used, bc_cure_umi filters the UMI-barcode tags by count.

bc_cure_umi(barcodeObj, depth = 2, doFish = FALSE, isUniqueUMI = FALSE)

# S4 method for class 'BarcodeObj'
bc_cure_umi(barcodeObj, depth = 1, doFish = FALSE, isUniqueUMI = FALSE)

Arguments

barcodeObj

A BarcodeObj object.

depth

A numeric value or a numeric vector specifying the UMI-barcode tag count threshold. Only barcodes with a UMI-barcode tag count equal to or larger than the threshold are kept.

doFish

A logical value. If TRUE, for barcodes with UMI read depth above the threshold, the function “fishes” for identical barcodes in UMI-barcode tags with read depth below the threshold. Fishing does not increase the number of identified barcodes, but it does increase their UMI counts, because the low-depth UMI-barcode tags are included.

isUniqueUMI

A logical value. When a UMI is associated with several barcodes and you are confident that each UMI is truly unique, set this to TRUE so that only the UMI-barcode tag with the highest count is kept for each UMI.

Value

A BarcodeObj object with the cleanBc slot updated (or created).

Details

When invoked, this function processes the data in the following steps:

  1. (if isUniqueUMI is TRUE) Find the dominant UMI-barcode tag, the one with the highest read count, for each UMI.

  2. Filter the UMI-barcode tags by depth.

  3. (if doFish is TRUE) Fish for the UMI-barcode tags with low read counts.
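
The three steps can be sketched in base R on a plain data frame. This is only an illustration under stated assumptions, not the package's implementation: `cure_umi_sketch` is a hypothetical helper, the column names `umi`, `barcode` and `count` are made up for the example, and ties in step 1 are simply all kept.

```r
# Hypothetical helper (not part of the package): mimics the three documented
# steps on a data.frame with the assumed columns umi, barcode and count.
cure_umi_sketch <- function(tags, depth = 2, doFish = FALSE,
                            isUniqueUMI = FALSE) {
  # Step 1: for each UMI keep only the tag(s) with the highest read count.
  if (isUniqueUMI) {
    top <- ave(tags$count, tags$umi, FUN = max)
    tags <- tags[tags$count == top, , drop = FALSE]
  }
  # Step 2: keep the UMI-barcode tags whose read count reaches the threshold.
  passed <- tags[tags$count >= depth, , drop = FALSE]
  # Step 3: fish back low-depth tags that carry an already accepted barcode.
  if (doFish) {
    passed <- tags[tags$barcode %in% passed$barcode, , drop = FALSE]
  }
  # Report the number of UMI-barcode tags supporting each barcode.
  as.data.frame(table(barcode = passed$barcode),
                responseName = "count", stringsAsFactors = FALSE)
}

# Toy input: one row per UMI-barcode tag.
tags <- data.frame(
  umi     = c("AAA", "AAA", "CCC", "GGG", "TTT"),
  barcode = c("BC1", "BC2", "BC1", "BC2", "BC3"),
  count   = c(10, 2, 1, 5, 1),
  stringsAsFactors = FALSE
)

no_fish   <- cure_umi_sketch(tags, depth = 3)                 # BC1: 1, BC2: 1
with_fish <- cure_umi_sketch(tags, depth = 3, doFish = TRUE)  # BC1: 2, BC2: 2
unique_umi <- cure_umi_sketch(tags, depth = 3, doFish = TRUE,
                              isUniqueUMI = TRUE)             # BC1: 2, BC2: 1
```

In this toy input, fishing leaves the set of barcodes unchanged (BC1 and BC2) but raises their UMI counts, which matches the description of doFish. With isUniqueUMI, the tag AAA-BC2 is dropped first because AAA-BC1 has more reads for the same UMI.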

Examples

data(bc_obj)

d1 <- data.frame(
   seq = c(
       "ACTTCGATCGATCGAAAAGATCGATCGATC",
       "AATTCGATCGATCGAAGAGATCGATCGATC",
       "CCTTCGATCGATCGAAGAAGATCGATCGATC",
       "TTTTCGATCGATCGAAAAGATCGATCGATC",
       "AAATCGATCGATCGAAGAGATCGATCGATC",
       "CCCTCGATCGATCGAAGAAGATCGATCGATC",
       "GGGTCGATCGATCGAAAAGATCGATCGATC",
       "GGATCGATCGATCGAAGAGATCGATCGATC",
       "ACTTCGATCGATCGAACAAGATCGATCGATC",
       "GGTTCGATCGATCGACGAGATCGATCGATC",
       "GCGTCCATCGATCGAAGAAGATCGATCGATC"
       ),
   freq = c(
       30, 60, 9, 10, 14, 5, 10, 30, 6, 4 , 6
       )
   )

pattern <- "([ACTG]{3})TCGATCGATCGA([ACTG]+)ATCGATCGATC"
bc_obj <- bc_extract(list(test = d1), pattern, sample_name = c("test"),
    pattern_type = c(UMI = 1, barcode = 2))

# Use UMI information to cure the barcodes, then remove the barcodes
# supported by too few UMI-barcode tags (depth = 5)
bc_umi_cured <- bc_cure_umi(bc_obj, depth = 0, doFish = TRUE, isUniqueUMI = TRUE)
bc_cure_depth(bc_umi_cured, depth = 5)
#> ------------
#> bc_cure_depth: isUpdate is TRUE, update the cleanBc.
#> ------------
#> Bonjour le monde, This is a BarcodeObj.
#> ----------
#> It contains: 
#> ----------
#> @metadata: 3 field(s) available:
#> raw_read_count  barcode_read_count  depth_cutoff
#> ----------
#> @messyBc: 1 sample(s) for raw barcodes:
#>     In sample $test there are: 10 Tags
#> ----------
#> @cleanBc: 1 samples for cleaned barcodes
#>     In sample $test there are: 0 barcodes