| Task | Package | Function | Description |
|---|---|---|---|
| Identify NA cells | base | `is.na()` | Returns a logical value (`TRUE`/`FALSE`) for each cell; `TRUE` marks an NA |
| Count NAs in each column | base & purrr | `purrr::map_df(.x = df, .f = is.na) \|> colSums()` | Returns the total number of NAs per column |
| Rule-based deductions | deducorrect | `deduImpute()` | Based on observed values and edit rules, imputes as many variables deductively as possible |
| Replace NAs | tidyr | `replace_na(list(x = 0, y = "unknown"))` | Replaces NAs with values chosen in advance for each column (e.g., the column mean or median, or the most frequent category from `VIM::maxCat()`) |
| Across-column imputation | VIM | `kNN()` | k-nearest-neighbours imputation |
| Across-column imputation | VIM | `rangerImpute()` | Random-forest imputation |
| Multiple imputation | mice | `mice()` | Multivariate imputation by chained equations |
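
The entries above can be tried together in a short script. The following is a minimal sketch rather than a recommended workflow. It assumes the purrr, tidyr, VIM and mice packages are installed, and it uses the built-in `airquality` data set and a small toy data frame, neither of which comes from the table. The settings `k = 5`, `m = 5`, `method = "pmm"` and `seed = 123` are illustrative choices only. `deduImpute()` and `rangerImpute()` are left out because they need edit rules and a model formula, respectively.

```r
# Sketch only: data sets and parameter values are assumptions for illustration.
library(purrr)
library(tidyr)

# --- Identify NA cells (base) ---
na_cells <- is.na(airquality)          # logical matrix, TRUE where a cell is NA

# --- Count NAs per column (base & purrr), as in the table ---
na_per_col <- purrr::map_df(.x = airquality, .f = is.na) |> colSums()
# Base-only equivalent, shown for comparison
na_per_col_base <- colSums(is.na(airquality))

# --- Replace NAs with fixed, per-column values (tidyr) ---
toy <- data.frame(x = c(1, NA, 3), y = c("a", NA, "c"))   # hypothetical data
filled <- tidyr::replace_na(toy, list(x = 0, y = "unknown"))

# --- Across-column imputation with k-nearest neighbours (VIM) ---
# imp_var = FALSE drops the extra "<column>_imp" indicator columns
knn_imputed <- VIM::kNN(airquality, k = 5, imp_var = FALSE)

# --- Multiple imputation by chained equations (mice) ---
imp <- mice::mice(airquality, m = 5, method = "pmm",
                  seed = 123, printFlag = FALSE)
# Extract the first of the m completed data sets
mice_completed <- mice::complete(imp, 1)
```

`mice::complete()` is called with its namespace because tidyr also exports a function named `complete()`. In a real analysis, the model would normally be fitted on all `m` imputed data sets and the results pooled, rather than relying on a single completed data set.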