R stat_summary issue by group (r, ggplot2)


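My end goal here is to use existing group membership to have stat_summary add summary lines to a plot. I'm having trouble getting the lines drawn, and although I understand the problem, I can't figure out how to avoid causing it.

For example: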

library(ggplot2)
df <- data.frame(low=c(20,24,18,16), 
                 mid=c(60,61,48,45), 
                 high=c(80,75,81,83), 
                 category=factor(seq(1:4)), 
                 membership=factor(c(1,1,2,2)))

p <- ggplot(df, aes(x=category, y=mid)) +
  geom_linerange(aes(ymin=low, ymax=high)) +
  geom_point(shape=95, size=8)
p

But when I try to use the membership column in df to generate means within groups, I run into problems with the linerange (although the point plots fine).

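I know from ggplot_build(p) that ymin = y = ymax for the summary layer, which is why nothing shows up on the plot. However, if I use fun.data instead of fun.ymin and fun.ymax, I get errors because the required ymin and ymax aesthetics are missing.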

$data[[3]]
      x group ymin    y ymax 
1 5.125     2 46.5 46.5 46.5 
2 4.875     1 60.5 60.5 60.5

Any help would be greatly appreciated.

You may find it easier to calculate the grouped means before passing the data frame to ggplot() for plotting. One possible approach is as follows:

library(dplyr)

df %>%
  rbind(df %>%
          mutate(category = "Aggregate") %>%
          group_by(category, membership) %>%
          summarise_all(mean) %>% # calculate mean for low / mid / high by group
          ungroup() %>%
          select(colnames(df))) %>% #reorder columns to match the original df
  ggplot(aes(x = category, y = mid, ymin = low, ymax = high,
             colour = membership)) +
  geom_linerange(position = position_dodge(width = 0.5)) +
  geom_point(shape = 95, size = 8,
             position = position_dodge(width = 0.5))

(I added colour = membership to make the groups visually clearer.)
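To see which numbers end up in the "Aggregate" rows, the per-group means can also be computed on their own. The following is only a minimal sketch in base R that uses aggregate() in place of the dplyr pipeline above; the name group_means is introduced here for illustration and is not part of the original answer.

```r
# Same example data as in the question
df <- data.frame(low = c(20, 24, 18, 16),
                 mid = c(60, 61, 48, 45),
                 high = c(80, 75, 81, 83),
                 category = factor(1:4),
                 membership = factor(c(1, 1, 2, 2)))

# Mean of low / mid / high within each membership group
# (group_means is a hypothetical name used only in this sketch)
group_means <- aggregate(cbind(low, mid, high) ~ membership,
                         data = df, FUN = mean)
print(group_means)
```

For membership 1 this gives low = 22, mid = 60.5, high = 77.5, and for membership 2 it gives low = 17, mid = 46.5, high = 82. These are the values the dodged lineranges at "Aggregate" are drawn from.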

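Yes, that's a simpler, cleaner approach. I'm still curious why stat_summary doesn't work in this situation, but I'll accept this answer since it is an acceptable workaround for my application. Thanks!

I think it stems from the aesthetics that stat_summary understands. According to the help file, it understands x, y, and group, but not ymin or ymax. If you use fun.data, the function must output ymin and ymax.

Aha... that makes sense. Thanks very much. Somehow I missed that in the help file.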
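To illustrate the point about fun.data, here is a rough sketch of a summary function that returns y, ymin, and ymax itself. The helper mean_range is hypothetical and not from the thread. Because the summary function only receives the y values (mid here), this sketch uses the minimum and maximum of mid within each group as the range, which is an assumption for illustration and not the same as averaging the low and high columns as the accepted answer does.

```r
library(ggplot2)

df <- data.frame(low = c(20, 24, 18, 16),
                 mid = c(60, 61, 48, 45),
                 high = c(80, 75, 81, 83),
                 category = factor(1:4),
                 membership = factor(c(1, 1, 2, 2)))

# Hypothetical helper: takes the y values of one group and returns the
# three columns a linerange needs. The min/max range is an assumption.
mean_range <- function(y) {
  data.frame(y = mean(y), ymin = min(y), ymax = max(y))
}

p <- ggplot(df, aes(x = category, y = mid)) +
  geom_linerange(aes(ymin = low, ymax = high)) +
  geom_point(shape = 95, size = 8) +
  # Summary layer: one linerange per membership group at x = "Aggregate"
  stat_summary(aes(x = "Aggregate", group = membership, colour = membership),
               fun.data = mean_range, geom = "linerange",
               position = position_dodge(width = 0.5))

# Inspect the computed summary layer (third layer of the plot)
summary_layer <- ggplot_build(p)$data[[3]]
print(summary_layer[, c("group", "ymin", "y", "ymax")])
```

Since mean_range supplies distinct ymin and ymax values, the summary lineranges have a visible length, unlike the case where ymin = y = ymax.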
