Compute Metrics for Base Dataset Selection
base_dataset_metrics.Rd
Computes a metrics table for a named list of dataframes by comparing the unique values of a specified identifier column across them. The identifier column is coerced to a specified type before comparison, so that values stored under different types in different dataframes can still be matched. For each dataframe, the resulting table reports the count of unique identifiers, the total number of identifiers it shares with the other dataframes, and a logical flag marking the main dataset (the one with the highest total common identifier count).
Value
A data frame with the following columns:
- DataFrame
Name of the dataframe.
- Unique_Count
Number of unique identifier values in the dataframe.
- Total_Common
Sum, across all other dataframes, of the number of unique identifier values shared with this dataframe.
- Is_Main
Logical; TRUE if the dataframe has the maximum Total_Common value and is therefore considered the main dataset.
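
The columns above can be reproduced with a few lines of base R. The following is a minimal sketch, not the package's implementation: the helper name sketch_base_metrics() and its arguments (dfs, id_col, coerce) are hypothetical, and it assumes that the overlap between two dataframes is the number of unique identifier values they have in common and that ties for the maximum are all flagged TRUE.

```r
# Hypothetical re-implementation for illustration only; the name
# sketch_base_metrics() and its arguments are assumptions, not the package API.
sketch_base_metrics <- function(dfs, id_col, coerce = as.character) {
  stopifnot(is.list(dfs), !is.null(names(dfs)))

  # Unique identifier values per dataframe, coerced to a common type first.
  ids <- lapply(dfs, function(df) unique(coerce(df[[id_col]])))

  # For each dataframe, add up the size of its overlap with every other one.
  total_common <- vapply(seq_along(ids), function(i) {
    sum(vapply(ids[-i], function(other) length(intersect(ids[[i]], other)),
               integer(1)))
  }, numeric(1))

  data.frame(
    DataFrame    = names(dfs),
    Unique_Count = unname(lengths(ids)),
    Total_Common = total_common,
    # Assumption: every dataframe tied for the maximum is flagged TRUE.
    Is_Main      = total_common == max(total_common),
    stringsAsFactors = FALSE
  )
}

# Toy input: the same identifiers stored as double, character and integer.
dfs <- list(
  patients = data.frame(id = c(1, 2, 3, 4)),
  visits   = data.frame(id = c("2", "3", "3", "5")),
  labs     = data.frame(id = c(3L, 4L, 6L))
)
metrics <- sketch_base_metrics(dfs, id_col = "id")
print(metrics)
```

In this toy input, patients shares two identifiers with visits and two with labs, so its Total_Common is 4 and it is the only row with Is_Main set to TRUE. Coercing to character is what lets the double, character and integer columns match at all.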