Combine records helper functions
Source: R/combine_records_class.R
combine_records_helper_functions.Rd
This page documents helper functions for use with combine_records().
Usage
.mode(ties = FALSE, na.rm = TRUE)
.mean()
.median()
.collapse(separator, na_string = "NA")
.select_max(max_col, use_abs = FALSE, keep_NA = FALSE)
.select_min(min_col, use_abs = FALSE, keep_NA = FALSE)
.select_match(match_col, search_col, separator, na_string = "NA")
.select_exact(match_col, match, separator, na_string = "NA")
.unique(separator, na_string = "NA", digits = 6)
.prioritise(match_col, priority, separator, no_match = NA, na_string = "NA")
.nothing()
.count()
.select_grade(grade_col, keep_NA = FALSE, upper_case = TRUE)
Arguments
- ties
 (logical) If TRUE then all records matching the tied groups are returned. Otherwise the first record is returned.
- na.rm
 (logical) If TRUE then NA values are ignored.
- separator
 (character, NULL) if !NULL this string is used to collapse matches with the same priority
- na_string
 (character) NA values are replaced with this string
- max_col
 (character) the column name to search for the maximum value.
- use_abs
 (logical) If TRUE then the sign of the values is ignored.
- keep_NA
 (logical) If TRUE keeps records with NA values
- min_col
 (character) the column name to search for the minimum value.
- match_col
 (character) the column with labels to prioritise
- search_col
 (character) the name of a column to use as a reference for locating values in the matching column.
- match
 (character) a value to search for in the matching column.
- digits
 (numeric) the number of digits to use when converting numerical values to characters when determining if values are unique.
- priority
 (character) a list of labels in priority order
- no_match
 (character, NULL) if !NULL then annotations not matching any of the priority labels are replaced with this value
- grade_col
 (character) the name of a column containing grades
- upper_case
 (logical) If TRUE then grades are compared to upper case letters to determine their ordering, otherwise lower case.
Value
A function for use with combine_records()
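Each helper is called once with its settings and returns a function that is applied later. The following is a minimal base R sketch of that pattern, loosely modelled on the description of .collapse(). The name make_collapse is hypothetical and this is not the package's implementation.

```r
# Hypothetical factory illustrating the "helper returns a function" pattern.
# Assumption: the returned function receives one group of values as a vector.
make_collapse <- function(separator, na_string = "NA") {
  function(x) {
    x <- as.character(x)
    x[is.na(x)] <- na_string          # replace missing values with na_string
    paste(x, collapse = separator)    # join the group into a single string
  }
}

collapse_fcn <- make_collapse(separator = " || ")
collapse_fcn(c("a", NA, "b"))
#> [1] "a || NA || b"
```

The settings (separator, na_string) are captured when the factory is called, so the returned function only needs the values to be combined.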
Functions
- .mode(): returns the most common value, excluding NA. If ties = TRUE then all tied values are returned, otherwise the first value in a sorted unique list is returned (equal to the minimum for numeric values). If na.rm = FALSE then NA values are included when searching for the modal value and are placed last if ties = FALSE (values are returned preferentially over NA).
- .mean(): calculates the mean value, excluding NA if na.rm = TRUE.
- .median(): calculates the median value, excluding NA if na.rm = TRUE.
- .collapse(): collapses multiple matching records into a single string using the provided separator.
- .select_max(): selects a record based on the index of the maximum value in another column.
- .select_min(): selects a record based on the index of the minimum value in another column.
- .select_match(): returns all records based on the indices of identical matches in a second column and collapses them using the provided separator.
- .select_exact(): returns records based on the index of identical values matching the match parameter within the current column, and collapses them using the provided separator if necessary.
- .unique(): collapses a set of records to a set of unique values using the provided separator. digits can be provided for numeric columns to control the precision used when determining unique values.
- .prioritise(): reduces a set of annotations by prioritising values according to the input. If there are multiple matches with the same priority then they are collapsed using a separator.
- .nothing(): a pass-through function to allow some annotation table columns to remain unchanged.
- .count(): adds a new column indicating the number of annotations that match the given grouping variable.
- .select_grade(): returns records based on the index of the best grade in a second list. The best grade is defined as "A" for upper_case = TRUE or "a" for upper_case = FALSE, and the worst grade is "Z" or "z". Any non-exact matches to a character in LETTERS or letters are replaced with NA.
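The tie and NA rules described for .mode() can be illustrated with a small base R sketch. The function mode_sketch is hypothetical and only mirrors the behaviour described above. It is not the package's code.

```r
# Hypothetical re-implementation of the documented .mode() rules.
mode_sketch <- function(x, ties = FALSE, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  if (length(x) == 0) return(NA)
  # Sorted unique values, with NA (if kept) placed last
  u <- sort(unique(x), na.last = TRUE)
  # Count occurrences of each unique value, treating NA as its own value
  n <- vapply(seq_along(u), function(i) {
    if (is.na(u[i])) sum(is.na(x)) else sum(x == u[i], na.rm = TRUE)
  }, integer(1))
  best <- u[n == max(n)]
  if (ties) best else best[1]
}

mode_sketch(c(3, 1, 3, 1, 2))                  # 1 and 3 are tied, first is returned
#> [1] 1
mode_sketch(c(3, 1, 3, 1, 2), ties = TRUE)     # all tied values
#> [1] 1 3
mode_sketch(c(NA, NA, 1, 1), na.rm = FALSE)    # a value is preferred over NA
#> [1] 1
```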
Examples
# Select matching records
M = combine_records(
        group_by = 'example',
        default_fcn = .select_exact(
            match_col = 'match_column',
            match = 'find_me',
            separator = ', ',
            na_string = 'NA')
        )
# Collapse unique values
M = combine_records(
        group_by = 'example',
        default_fcn = .unique(
            digits = 6,
            separator = ', ',
            na_string = 'NA')
        )
# Prioritise by source
M = combine_records(
        group_by = 'InChiKey',
        default_fcn = .prioritise(
             match_col = 'source',
             priority = c('CD','LS'),
             separator = '  || ')
    )
# Do nothing to all columns
M = combine_records(
        group_by = 'InChiKey',
        default_fcn = .nothing()
    )
# Add a column with the number of records with a matching inchikey
M = combine_records(
        group_by = 'InChiKey',
        fcns = list(
            count = .count()
        ))
# Select annotation with highest (best) grade
M = combine_records(
        group_by = 'InChiKey',
        default_fcn = .select_grade(
            grade_col = 'grade',
            keep_NA = FALSE,
            upper_case = TRUE
        ))
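The grade ordering described for .select_grade() can be sketched in base R as follows. The function select_grade_sketch and its arguments are hypothetical. It ignores keep_NA and is only an illustration of the ranking rule, not the package's implementation.

```r
# Hypothetical illustration of ranking grades against LETTERS or letters.
select_grade_sketch <- function(values, grades, upper_case = TRUE) {
  reference <- if (upper_case) LETTERS else letters
  # Position in the alphabet; anything that is not an exact match becomes NA
  grade_rank <- match(grades, reference)
  if (all(is.na(grade_rank))) return(NA)
  # Lowest rank is the best grade ("A" or "a")
  values[which(grade_rank == min(grade_rank, na.rm = TRUE))]
}

values <- c("x", "y", "z")
grades <- c("B", "a", "A")
select_grade_sketch(values, grades, upper_case = TRUE)    # "a" is not in LETTERS
#> [1] "z"
select_grade_sketch(values, grades, upper_case = FALSE)   # only "a" is in letters
#> [1] "y"
```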