outliersMAD identifies outliers in a numeric vector using the median absolute deviation (MAD) approach of Leys et al. (2013).

outliersMAD(x, MADCutOff = 3.0, replaceOutliersWith = NA,
showMADValues = FALSE, outlierIndices = FALSE, bConstant = 1.4826, digits = 2)

Arguments

x

a vector of numbers

MADCutOff

value to use as cutoff (Leys et al. recommend 2.5 or 3.0 as default)

replaceOutliersWith

value that replaces each outlier (NA by default)

showMADValues

if TRUE, shows the deviation score of each value (its distance from the median, in MAD units)

outlierIndices

if TRUE, returns the indices (positions) of the outliers

bConstant

a constant linked to the assumption of normality of the data, disregarding the abnormality induced by outliers

digits

how many digits/decimals to round output to
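
The default bConstant of 1.4826 is the usual scaling factor that makes the MAD comparable to the standard deviation when the data are normally distributed. The short base R sketch below shows where the number comes from; it is only an illustration, and the simulated data and variable names are assumptions made for this example, not part of outliersMAD.

```r
# 1.4826 is (approximately) 1 divided by the 75th percentile of the
# standard normal distribution.
b <- 1 / qnorm(0.75)
round(b, 4)
#> [1] 1.4826

# For (simulated) normal data, the scaled MAD should land close to the SD.
set.seed(1)
z <- rnorm(1e5, mean = 0, sd = 2)
scaledMAD <- b * median(abs(z - median(z)))
c(sd = sd(z), scaledMAD = scaledMAD)
```

If the data are assumed to follow another distribution, a different constant would be appropriate, which is presumably why bConstant is exposed as an argument.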

Value

A vector with outliers identified (default converts outliers to NA)

Details

We can identify and remove outliers by flagging data points that are too extreme: either too many standard deviations (SD) from the mean or too many median absolute deviations (MAD) from the median. The SD approach can fail when outliers are extreme, because the outliers themselves inflate both the mean and the SD. The MAD approach is much more robust, because the median is barely affected by extreme values (for a comparison of both approaches, see Leys et al., 2013, Journal of Experimental Social Psychology).
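
As a rough illustration of the two approaches, the sketch below computes SD-based and MAD-based deviation scores by hand in base R. It is not the source code of outliersMAD, only an assumption about the underlying arithmetic (median, scaled MAD, cutoff of 3); the variable names are made up for this example.

```r
x <- c(1, 3, 3, 6, 8, 10, 10, 1000, -1000)

# SD approach: distance from the mean in SD units.
# The two extreme values inflate the SD, so neither exceeds a cutoff of 3.
sdScores <- (x - mean(x)) / sd(x)
which(abs(sdScores) > 3)
#> integer(0)

# MAD approach: distance from the median in scaled-MAD units.
bConstant <- 1.4826
madValue <- bConstant * median(abs(x - median(x)))
madScores <- (x - median(x)) / madValue
round(madScores, 2)
#> [1] -0.84 -0.51 -0.51 0.00 0.34 0.67 0.67 167.61 -169.63

# Positions whose absolute score exceeds the cutoff.
which(abs(madScores) > 3)
#> [1] 8 9
```

Base R's mad() uses the same default constant (1.4826), so mad(x) should return the same scaled MAD as madValue here.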

References

See also

Examples

example <- c(1, 3, 3, 6, 8, 10, 10, 1000, -1000) # 1000 and -1000 are outliers
outliersMAD(example)
#> 2 outliers detected.
#> Outliers replaced with NA
#> [1] 1 3 3 6 8 10 10 NA NA
outliersMAD(example, MADCutOff = 3.0)
#> 2 outliers detected.
#> Outliers replaced with NA
#> [1] 1 3 3 6 8 10 10 NA NA
outliersMAD(example, MADCutOff = 2.5, replaceOutliersWith = -999)
#> 2 outliers detected.
#> Outliers replaced with -999
#> [1] 1 3 3 6 8 10 10 -999 -999
outliersMAD(example, MADCutOff = 1.5, outlierIndices = TRUE)
#> Showing indices of outliers.
#> [1] 8 9
outliersMAD(example, MADCutOff = 1.5, showMADValues = TRUE)
#> Showing MAD from median for each value.
#> 2 outliers detected.
#> [1] -0.84 -0.51 -0.51 0.00 0.34 0.67 0.67 167.61 -169.63
outliersMAD(example, MADCutOff = 1.5, showMADValues = TRUE, replaceOutliersWith = -88)
#> Showing MAD from median for each value.
#> 2 outliers detected.
#> [1] -0.84 -0.51 -0.51 0.00 0.34 0.67 0.67 167.61 -169.63