Functions that make it easier to analyse and summarise data and results in R. For documentation, see https://hauselin.github.io/hausekeep/. Also check out my R tutorials.

## Installation

To install the package, type the following commands into the R console:

# install.packages("devtools")
devtools::install_github("hauselin/hausekeep") # you might have to install devtools first (see above)

## Examples

### summaryh() generates formatted results and effect sizes for manuscripts

Generate model summaries that can be copied and pasted straight into your manuscript (no more copy-paste frustrations and errors!). Summaries are formatted according to American Psychological Association (APA) guidelines (get in touch if you require other formats). Example APA summaries generated by summaryh():

• regression output: b = 1.41, SE = 0.56, t(30) = 2.53, p = .017, r = 0.42
• ANOVA output: F(1, 30) = 9.00, p = .005, r = 0.48
• t-test output: t(23) = −4.67, p < .001, r = 0.70

See documentation for optional parameters.

model_lm <- lm(mpg ~ cyl, mtcars)
summary(model_lm) # base R summary()
summaryh(model_lm) # returns APA-formatted output in a data.table

# linear mixed effects regression
library(lme4); library(lmerTest) # load packages to fit mixed effects models
model <- lmer(weight ~ Time * Diet + (1 + Time | Chick), data = ChickWeight)
summary(model) # standard summary
summaryh(model)

# ANOVA
summaryh(aov(mpg ~ gear, mtcars))

# correlation
cor.test(mtcars$mpg, mtcars$cyl)
summaryh(cor.test(mtcars$mpg, mtcars$cyl))
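
The effect size r in the three example summaries above is consistent with the standard conversion r = sqrt(t^2 / (t^2 + df)), where an F statistic with one numerator degree of freedom is treated as t^2. The base R sketch below checks this against those examples. The helper name `t_to_r` is hypothetical and not part of hausekeep, and whether summaryh() computes r exactly this way is an assumption.

```r
# Hypothetical helper (not part of hausekeep): convert a t statistic and its
# degrees of freedom to the effect size r.
t_to_r <- function(t, df) sqrt(t^2 / (t^2 + df))

r_regression <- t_to_r(2.53, 30)        # regression example: t(30) = 2.53
r_anova      <- t_to_r(sqrt(9.00), 30)  # ANOVA example: F(1, 30) = 9.00, so t = sqrt(F)
r_ttest      <- t_to_r(-4.67, 23)       # t-test example: t(23) = -4.67

round(c(r_regression, r_anova, r_ttest), 2)
#> [1] 0.42 0.48 0.70
```

The sign of t is lost in the conversion, which is why the t-test example reports a positive r.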

### es() converts between effect size measures

The es() function converts one effect size into other effect sizes (e.g., d, r, R2, f, odds ratio, log odds ratio, area under the curve [AUC]). The same converter is also available online at https://www.escal.site.

es(d = 0.2)
#> d: 0.2
#>     d   r   R2   f oddsratio logoddsratio   auc fishersz
#> 1 0.2 0.1 0.01 0.1     1.437        0.363 0.556      0.1

es(r = c(0.1, 0.4, 0.7))
#> r: 0.1 r: 0.4 r: 0.7
#>       d   r   R2     f oddsratio logoddsratio   auc fishersz
#> 1 0.201 0.1 0.01 0.101     1.440        0.365 0.557    0.100
#> 2 0.873 0.4 0.16 0.436     4.871        1.583 0.731    0.424
#> 3 1.960 0.7 0.49 0.980    35.014        3.556 0.917    0.867
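
For readers who want to see what such a conversion involves, here is a minimal base R sketch using standard textbook formulas, starting from Cohen's d. It assumes equal group sizes, and the formulas es() uses internally may differ. The variable names simply mirror the column names in the output above.

```r
# Standard conversions from Cohen's d (assumes equal group sizes).
# This is an illustration, not the es() source code.
d <- 0.2

r            <- d / sqrt(d^2 + 4)   # correlation
R2           <- r^2                 # proportion of variance explained
f            <- d / 2               # Cohen's f
logoddsratio <- d * pi / sqrt(3)    # log odds ratio
oddsratio    <- exp(logoddsratio)   # odds ratio
auc          <- pnorm(d / sqrt(2))  # area under the curve
fishersz     <- atanh(r)            # Fisher's z

round(c(d = d, r = r, R2 = R2, f = f, oddsratio = oddsratio,
        logoddsratio = logoddsratio, auc = auc, fishersz = fishersz), 3)
```

Rounded to three decimals, these values agree with the es(d = 0.2) row shown above.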

### outliersMAD() identifies outliers using robust median absolute deviation approach

example <- c(1, 3, 3, 6, 8, 10, 10, 1000) # 1000 is an outlier
outliersMAD(example) # MAD approach
#> 1 outliers detected.
#> Outliers replaced with NA
#> [1]  1  3  3  6  8 10 10 NA
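
To make the idea concrete, the following base R sketch flags values whose distance from the median, measured in units of the median absolute deviation, exceeds a cut-off. The cut-off of 3 is an assumption chosen for illustration, and outliersMAD() may use different defaults or a different implementation.

```r
example <- c(1, 3, 3, 6, 8, 10, 10, 1000)

# stats::mad() scales the median absolute deviation by 1.4826 by default,
# which makes it comparable to a standard deviation for normal data.
deviations <- abs(example - median(example)) / mad(example)

mad_cutoff <- 3  # assumed cut-off, for illustration only
cleaned <- ifelse(deviations > mad_cutoff, NA, example)
cleaned
#> [1]  1  3  3  6  8 10 10 NA
```

Because the median and the MAD are barely affected by the extreme value, 1000 sits far beyond the cut-off while the other values stay well within it.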

### outliersZ() identifies outliers using Z-score cut-off

example <- c(1, 3, 3, 6, 8, 10, 10, 1000) # 1000 is an outlier
outliersZ(example) # SD approach
#> 1 outliers detected.
#> Outliers replaced with NA
#> [1]  1  3  3  6  8 10 10 NA

# compare with MAD approach from above
outliersMAD(example) # MAD approach
#> 1 outliers detected.
#> Outliers replaced with NA
#> [1]  1  3  3  6  8 10 10 NA
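
A short base R sketch shows why the Z-score approach is less robust: the outlier inflates both the mean and the standard deviation it is measured against. The helper `flag_z` and the two cut-offs are hypothetical choices for illustration, not necessarily what outliersZ() does internally.

```r
example <- c(1, 3, 3, 6, 8, 10, 10, 1000)

# Hypothetical helper: TRUE where the absolute Z-score exceeds the cut-off.
flag_z <- function(x, cutoff) abs((x - mean(x)) / sd(x)) > cutoff

z <- (example - mean(example)) / sd(example)
round(z[8], 2)                # Z-score of the value 1000
#> [1] 2.47

which(flag_z(example, 1.96))  # flagged with a lenient cut-off
#> [1] 8
which(flag_z(example, 3))     # missed with a stricter cut-off
#> integer(0)
```

With only eight observations, a single extreme value cannot reach a Z-score of 3, so a strict cut-off would leave it in the data.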

### fit_ezddm() fits EZ-diffusion model for two-choice response time tasks

library(rtdists) # load package to help us simulate some data
data1 <- rdiffusion(n = 100, a = 2, v = 0.3, t0 = 0.5, z = 0.5 * 2) # simulate data
data2 <- rdiffusion(n = 100, a = 2, v = -0.3, t0 = 0.5, z = 0.5 * 2) # simulate data
dataAll <- rbind(data1, data2) # join data
dataAll$response <- ifelse(dataAll$response == "upper", 1, 0) # convert responses to 1 and 0
dataAll$subject <- rep(c(1, 2), each = 100) # assign subject id
dataAll$cond1 <- sample(c("a", "b"), 200, replace = TRUE) # randomly assign conditions a/b
dataAll$cond2 <- sample(c("y", "z"), 200, replace = TRUE) # randomly assign conditions y/z

# fit model to the entire data set (assumes all data came from 1 subject)
fit_ezddm(data = dataAll, rts = "rt", responses = "response")
# fit model to each subject (no conditions)
fit_ezddm(data = dataAll, rts = "rt", responses = "response", id = "subject")
# fit model to each subject by cond1
fit_ezddm(data = dataAll, rts = "rt", responses = "response", id = "subject", group = "cond1")
# fit model to each subject by cond1,cond2
fit_ezddm(data = dataAll, rts = "rt", responses = "response", id = "subject", group = c("cond1", "cond2"))
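
For context, the EZ-diffusion model estimates its parameters in closed form from three summary statistics: the proportion of correct responses and the mean and variance of correct response times. The sketch below implements the published EZ equations in base R. The function name `ez_sketch`, its argument names, and the scaling parameter `s = 0.1` are assumptions made for illustration; fit_ezddm() may use a different scaling, apply edge corrections, or differ in other details.

```r
# Hypothetical helper (not part of hausekeep): closed-form EZ-diffusion estimates.
# prop_correct: proportion of correct responses (must not be exactly 0, 0.5, or 1)
# rt_var, rt_mean: variance and mean of correct response times, in seconds
# s: scaling parameter (0.1 is assumed here)
ez_sketch <- function(prop_correct, rt_var, rt_mean, s = 0.1) {
  L <- qlogis(prop_correct)  # logit of accuracy
  x <- L * (L * prop_correct^2 - L * prop_correct + prop_correct - 0.5) / rt_var
  v <- sign(prop_correct - 0.5) * s * x^(1 / 4)  # drift rate
  a <- s^2 * L / v                               # boundary separation
  y <- -v * a / s^2
  mean_decision_time <- (a / (2 * v)) * (1 - exp(y)) / (1 + exp(y))
  t0 <- rt_mean - mean_decision_time             # non-decision time
  c(v = v, a = a, t0 = t0)
}

# Illustrative summary statistics (not taken from the simulated data above)
estimates <- ez_sketch(prop_correct = 0.802, rt_var = 0.112, rt_mean = 0.723)
round(estimates, 3)
```

Because the estimates depend only on summary statistics, fitting by subject or by condition amounts to computing those statistics within each group first.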

### sca_lm() fits every possible linear regression model given a set of predictors and covariates

sca_lm() is a basic implementation of specification curve analysis for linear regression.

# models to fit: mpg ~ cyl; mpg ~ carb; mpg ~ cyl + carb
sca_lm(data = mtcars, dv = "mpg", ivs = c("cyl", "carb")) # default no covariates

# models to fit (with and without covariate vs): mpg ~ cyl; mpg ~ carb; mpg ~ cyl + carb
sca_lm(data = mtcars, dv = "mpg", ivs = c("cyl", "carb"), covariates = "vs")
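
The enumeration step behind such an analysis can be sketched in base R: build every non-empty subset of the predictors and fit one lm() per subset. This is a minimal illustration without covariates, and the object names (`iv_sets`, `models`) are hypothetical; sca_lm() itself may organise and return its results differently.

```r
ivs <- c("cyl", "carb")

# Every non-empty subset of the predictors.
iv_sets <- unlist(
  lapply(seq_along(ivs), function(k) combn(ivs, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit one model per subset; reformulate() builds formulas such as mpg ~ cyl + carb.
models <- lapply(iv_sets, function(set) {
  lm(reformulate(set, response = "mpg"), data = mtcars)
})
names(models) <- vapply(iv_sets, paste, character(1), collapse = " + ")

names(models)           # three specifications: cyl, carb, cyl + carb
lapply(models, coef)    # coefficients for each specification
```

The number of specifications grows as 2^k - 1 for k predictors, so the set of candidate predictors is best kept small.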