Functions that make it easier to analyse and summarise data and results in R. For documentation, see https://hauselin.github.io/hausekeep/. Also check out my R tutorials here.

Installation

To install the package, type the following commands into the R console:

# install.packages("devtools")
devtools::install_github("hauselin/hausekeep") # you might have to install devtools first (see above)

Examples

summaryh() generates formatted results and effect sizes for manuscripts

Generate model summaries that can be copied and pasted straight into your manuscript (no more copy-paste frustrations and errors!). Summaries are formatted according to American Psychological Association (APA) guidelines (get in touch if you require other formats). Example APA summaries generated by summaryh():

  • regression output: b = 1.41, SE = 0.56, t(30) = 2.53, p = .017, r = 0.42
  • ANOVA output: F(1, 30) = 9.00, p = .005, r = 0.48
  • t-test output: t(23) = −4.67, p < .001, r = 0.70

See documentation for optional parameters.

model_lm <- lm(mpg ~ cyl, mtcars) 
summary(model_lm) # base R summary()
summaryh(model_lm) # returns APA-formatted output in a data.table

# linear mixed effects regression
library(lme4); library(lmerTest) # load packages to fit mixed effects models
model <- lmer(weight ~ Time * Diet  + (1 + Time | Chick), data = ChickWeight)
summary(model) # standard summary
summaryh(model)

# ANOVA
summaryh(aov(mpg ~ gear, mtcars))

# correlation
cor.test(mtcars$mpg, mtcars$cyl)
summaryh(cor.test(mtcars$mpg, mtcars$cyl))
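The r values in the example summaries above are consistent with the standard conversion r = sqrt(t^2 / (t^2 + df)), with F(1, df) treated as t^2. The sketch below illustrates that conversion in base R. It is an assumption about how such an effect size can be computed, not a copy of summaryh()'s internals, and `t_to_r` is a hypothetical helper that is not part of the package.

```r
# Hypothetical helper (not part of hausekeep): convert a t statistic and its
# degrees of freedom to the effect size r.
t_to_r <- function(t, df) {
  sqrt(t^2 / (t^2 + df))
}

round(t_to_r(2.53, 30), 2)        # regression example: 0.42
round(t_to_r(sqrt(9.00), 30), 2)  # ANOVA example, F(1, 30) = t^2: 0.48
round(t_to_r(-4.67, 23), 2)       # t-test example: 0.7
```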

es() converts between effect size measures

The es function converts one effect size into other effect sizes (e.g., d, r, R2, f, odds ratio, log odds ratio, area-under-curve AUC). Also available at https://www.escal.site.

es(d = 0.2)
#> d: 0.2
#>     d   r   R2   f oddsratio logoddsratio   auc fishersz
#> 1 0.2 0.1 0.01 0.1     1.437        0.363 0.556      0.1

es(r = c(0.1, 0.4, 0.7))
#> r: 0.1 r: 0.4 r: 0.7
#>       d   r   R2     f oddsratio logoddsratio   auc fishersz
#> 1 0.201 0.1 0.01 0.101     1.440        0.365 0.557    0.100
#> 2 0.873 0.4 0.16 0.436     4.871        1.583 0.731    0.424
#> 3 1.960 0.7 0.49 0.980    35.014        3.556 0.917    0.867
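For readers who want to see what such conversions look like, here is a minimal base R sketch starting from r. The formulas are standard textbook conversions (d from r, the logistic approximation for the log odds ratio, the normal-model AUC, and Fisher's z). That es() uses exactly these formulas is an assumption, although they agree with the printed output above to three decimals. `r_to_others` is a hypothetical name.

```r
# Hypothetical helper (not hausekeep's es()): standard conversions from r.
r_to_others <- function(r) {
  d <- 2 * r / sqrt(1 - r^2)        # Cohen's d
  logoddsratio <- d * pi / sqrt(3)  # logistic approximation
  data.frame(
    d = d,
    r = r,
    R2 = r^2,
    f = d / 2,                      # Cohen's f
    oddsratio = exp(logoddsratio),
    logoddsratio = logoddsratio,
    auc = pnorm(d / sqrt(2)),       # area under the ROC curve
    fishersz = atanh(r)             # Fisher's z transformation
  )
}

round(r_to_others(c(0.1, 0.4, 0.7)), 3)
```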

outliersMAD() identifies outliers using robust median absolute deviation approach

example <- c(1, 3, 3, 6, 8, 10, 10, 1000) # 1000 is an outlier
outliersMAD(example) # MAD approach
#> 1 outliers detected.
#> Outliers replaced with NA
#> [1]  1  3  3  6  8 10 10 NA
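The idea behind the MAD approach can be sketched in a few lines of base R: measure each value's distance from the median in units of the scaled median absolute deviation, and replace values beyond a cut-off with NA. The function name `mad_outliers_sketch` and the cut-off of 3 are assumptions made for illustration; see the package documentation for the actual defaults of outliersMAD().

```r
# Hypothetical sketch (not hausekeep's outliersMAD()): flag values whose
# distance from the median exceeds `cutoff` scaled MADs, then set them to NA.
mad_outliers_sketch <- function(x, cutoff = 3) {  # cutoff = 3 is an assumption
  # mad() scales by 1.4826 by default so it is comparable to a standard deviation
  deviation <- abs(x - median(x, na.rm = TRUE)) / mad(x, na.rm = TRUE)
  x[!is.na(deviation) & deviation > cutoff] <- NA
  x
}

example <- c(1, 3, 3, 6, 8, 10, 10, 1000)
mad_outliers_sketch(example)
#> [1]  1  3  3  6  8 10 10 NA
```

Because the median and MAD barely move when a single extreme value is added, the value 1000 lies roughly 190 scaled MADs from the median and is flagged without ambiguity.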

outliersZ() identifies outliers using Z-score cut-off

example <- c(1, 3, 3, 6, 8, 10, 10, 1000) # 1000 is an outlier
outliersZ(example) # SD approach
#> 1 outliers detected.
#> Outliers replaced with NA
#> [1]  1  3  3  6  8 10 10 NA

# compare with MAD approach from above
outliersMAD(example) # MAD approach
#> 1 outliers detected.
#> Outliers replaced with NA
#> [1]  1  3  3  6  8 10 10 NA
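A minimal base R sketch of the Z-score approach follows. The name `z_outliers_sketch` and the cut-off of 1.96 are assumptions for illustration, not the documented defaults of outliersZ(). The sketch also shows a known limitation: in a sample of size n, no absolute Z-score can exceed (n - 1) / sqrt(n), so a strict cut-off can never flag anything in a small sample.

```r
# Hypothetical sketch (not hausekeep's outliersZ()): flag values whose absolute
# Z-score exceeds `cutoff`, then set them to NA.
z_outliers_sketch <- function(x, cutoff = 1.96) {  # cutoff = 1.96 is an assumption
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  x[!is.na(z) & abs(z) > cutoff] <- NA
  x
}

example <- c(1, 3, 3, 6, 8, 10, 10, 1000)
z_outliers_sketch(example)              # 1000 is replaced with NA
z_outliers_sketch(example, cutoff = 3)  # nothing is flagged

# Largest possible |z| for n = 8 observations; the outlier sits just under it
(length(example) - 1) / sqrt(length(example))
max(abs((example - mean(example)) / sd(example)))
```

The extreme value inflates both the mean and the standard deviation, which is why the MAD approach is usually the more robust choice.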

fit_ezddm() fits EZ-diffusion model for two-choice response time tasks

library(rtdists) # load package to help us simulate some data
data1 <- rdiffusion(n = 100, a = 2, v = 0.3, t0 = 0.5, z = 0.5 * 2) # simulate data
data2 <- rdiffusion(n = 100, a = 2, v = -0.3, t0 = 0.5, z = 0.5 * 2) # simulate data
dataAll <- rbind(data1, data2) # join data
dataAll$response <- ifelse(dataAll$response == "upper", 1, 0) # convert responses to 1 and 0
dataAll$subject <- rep(c(1, 2), each = 100) # assign subject id
dataAll$cond1 <- sample(c("a", "b"), 200, replace = TRUE) # randomly assign conditions a/b
dataAll$cond2 <- sample(c("y", "z"), 200, replace = TRUE) # randomly assign conditions y/z

# fit model to the entire data set (assumes all data came from 1 subject)
fit_ezddm(data = dataAll, rts = "rt", responses = "response")
# fit model to each subject (no conditions)
fit_ezddm(data = dataAll, rts = "rt", responses = "response", id = "subject") 
# fit model to each subject by cond1
fit_ezddm(data = dataAll, rts = "rt", responses = "response", id = "subject", group = "cond1") 
# fit model to each subject by cond1,cond2
fit_ezddm(data = dataAll, rts = "rt", responses = "response", id = "subject", group = c("cond1", "cond2"))
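For background, the closed-form EZ-diffusion equations of Wagenmakers, van der Maas, and Grasman (2007) can be sketched in base R as below. This is a sketch of the published equations, not hausekeep's fit_ezddm(). The function name `ez_sketch`, its argument names, and the scaling parameter s = 0.1 (the convention in that paper) are assumptions, and fit_ezddm() may use a different scaling or apply edge corrections.

```r
# Hypothetical sketch of the EZ-diffusion equations (Wagenmakers et al., 2007).
# pc:  proportion of correct responses
# vrt: variance of response times (seconds^2)
# mrt: mean response time (seconds)
# s:   scaling parameter (0.1 follows the paper; an assumption here)
ez_sketch <- function(pc, vrt, mrt, s = 0.1) {
  if (pc <= 0 || pc >= 1 || pc == 0.5) {
    stop("pc must lie strictly between 0 and 1 and differ from 0.5")
  }
  L <- qlogis(pc)                                # logit of accuracy
  x <- L * (L * pc^2 - L * pc + pc - 0.5) / vrt
  v <- sign(pc - 0.5) * s * x^(1 / 4)            # drift rate
  a <- s^2 * L / v                               # boundary separation
  y <- -v * a / s^2
  mdt <- (a / (2 * v)) * (1 - exp(y)) / (1 + exp(y))  # mean decision time
  c(v = v, a = a, t0 = mrt - mdt)                # t0: nondecision time
}

ez_sketch(pc = 0.802, vrt = 0.112, mrt = 0.723)
```

With these inputs the sketch returns a drift rate near 0.10, a boundary separation near 0.14, and a nondecision time near 0.30. Accuracies of exactly 0, 0.5, or 1 have no solution under these equations, which is why the sketch stops on them.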

sca_lm() fits every possible linear regression model given a set of predictors and covariates

sca_lm() is a basic implementation of specification curve analysis for linear regression.

# models to fit: mpg ~ cyl; mpg ~ carb; mpg ~ cyl + carb
sca_lm(data = mtcars, dv = "mpg", ivs = c("cyl", "carb")) # default no covariates 

# models to fit, each with and without the covariate vs: mpg ~ cyl; mpg ~ carb; mpg ~ cyl + carb
sca_lm(data = mtcars, dv = "mpg", ivs = c("cyl", "carb"), covariates = "vs")
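To make the enumeration concrete, the sketch below builds every non-empty combination of predictors with combn(), optionally adds the covariates to a second copy of each model, and fits the results with lm(). It is a minimal illustration under stated assumptions, not the implementation of sca_lm(): the helper name `all_lm_formulas_sketch` is hypothetical, and adding all covariates together is one possible reading of "with and without covariates".

```r
# Hypothetical helper (not hausekeep's sca_lm()): build one formula string per
# non-empty subset of `ivs`, plus a copy of each with the covariates added.
all_lm_formulas_sketch <- function(dv, ivs, covariates = NULL) {
  iv_sets <- unlist(
    lapply(seq_along(ivs), function(k) combn(ivs, k, simplify = FALSE)),
    recursive = FALSE
  )
  rhs <- vapply(iv_sets, paste, character(1), collapse = " + ")
  if (!is.null(covariates)) {
    # assumption: all covariates enter together
    rhs <- c(rhs, paste(rhs, paste(covariates, collapse = " + "), sep = " + "))
  }
  paste(dv, "~", rhs)
}

formulas <- all_lm_formulas_sketch("mpg", c("cyl", "carb"), covariates = "vs")
fits <- lapply(formulas, function(f) lm(as.formula(f), data = mtcars))
names(fits) <- formulas
sapply(fits, function(m) summary(m)$r.squared)  # one R-squared per specification
```

The number of specifications grows as 2^k - 1 for k predictors (doubling again when covariates are supplied), so the predictor set is best kept small.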