Channel: Tidy way to get `summary` output per group? - Stack Overflow

Tidy way to get `summary` output per group?

My code frequently uses tapply and summary as shown below:

    library(tidyverse)  # provides tibble(), %>%, map(), and add_column()

    data <- tibble(
      year = rep(2018:2021, 3),
      x = runif(length(year))
    )

    tapply(data$x, data$year, summary)

The output looks like:

    $`2018`
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.3914  0.5696  0.7477  0.6668  0.8045  0.8614

    $`2019`
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.1910  0.2863  0.3816  0.4179  0.5313  0.6809

    (etc.)

Is there a way to get such summary-like output in a tibble?

Desired output, using ugly code:

    tapply(data$x, data$year, summary) %>%
      map(~ as.numeric(round(.x, 2))) %>%
      map_dfr(set_names, names(summary(1))) %>%
      add_column(year = 2018:2021, .before = 1)

    # A tibble: 4 x 7
       year  Min. `1st Qu.` Median  Mean `3rd Qu.`  Max.
      <int> <dbl>     <dbl>  <dbl> <dbl>     <dbl> <dbl>
    1  2018  0.39     0.570   0.75  0.67      0.8   0.86
    2  2019  0.19     0.290   0.38  0.42      0.53  0.68
    3  2020  0.01     0.35    0.7   0.55      0.82  0.93
    4  2021  0.06     0.15    0.24  0.32      0.45  0.66

I'm hoping that there is a nice combination of dplyr functions to do that better -- my code to get the desired output is hacky.

Of course, I'm hoping not to have to rewrite base R's summary function, as below:

    summarise(`Min` = min(x), `1st Qu.` = quantile(x, 0.25), ...)
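
One possible approach, offered as a sketch rather than a definitive answer: `summarise()` in dplyr 1.0 or later unpacks an unnamed one-row data frame into columns, so the named vector returned by `summary()` can be turned into a one-row tibble per group. This assumes dplyr >= 1.0 and tibble are installed. The `set.seed()` call and the `stats` name are only there for illustration.

```r
library(dplyr)
library(tibble)

set.seed(42)  # illustrative only, so the random data is reproducible

data <- tibble(
  year = rep(2018:2021, 3),
  x = runif(length(year))
)

stats <- data %>%
  group_by(year) %>%
  # summary(x) is a named numeric vector with a class attribute;
  # unclass() + as.list() + as_tibble() turns it into a one-row tibble,
  # which summarise() spreads into one column per statistic.
  summarise(as_tibble(as.list(unclass(summary(x)))), .groups = "drop") %>%
  # Round every statistic column, leaving the grouping column untouched.
  mutate(across(-year, ~ round(.x, 2)))

stats
```

The column names (`Min.`, `1st Qu.`, and so on) come straight from `summary()`, so none of the statistics has to be rewritten by hand.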
