My code frequently uses tapply() and summary() as shown below:
    library(tidyverse)  # provides tibble(), %>%, map(), add_column()

    data <- tibble(
      year = rep(2018:2021, 3),
      x = runif(length(year))
    )
    tapply(data$x, data$year, summary)
The output looks like:
    $`2018`
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.3914  0.5696  0.7477  0.6668  0.8045  0.8614

    $`2019`
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.1910  0.2863  0.3816  0.4179  0.5313  0.6809

    (etc.)
Is there a way to get such summary()-like output in a tibble?
Desired output, using ugly code:
    tapply(data$x, data$year, summary) %>%
      map(~ as.numeric(round(.x, 2))) %>%
      map_dfr(set_names, names(summary(1))) %>%
      add_column(year = 2018:2021, .before = 1)
    # A tibble: 4 x 7
       year  Min. `1st Qu.` Median  Mean `3rd Qu.`  Max.
      <int> <dbl>     <dbl>  <dbl> <dbl>     <dbl> <dbl>
    1  2018  0.39     0.570   0.75  0.67      0.8   0.86
    2  2019  0.19     0.290   0.38  0.42      0.53  0.68
    3  2020  0.01     0.35    0.7   0.55      0.82  0.93
    4  2021  0.06     0.15    0.24  0.32      0.45  0.66
I'm hoping that there is a nice combination of dplyr functions to do that better; my code to get the desired output is hacky.
Of course, I'm hoping not to have to rewrite base R's summary() function by hand, as below:
    summarise(`Min` = min(x), `1st Qu.` = quantile(x, 0.25), ...)