R ggplot2: How to reorder a stacked bar chart by the proportion of the fill variable


I am using the "NYC Property Sales" dataset available on Kaggle:

After cleaning the dataset, I used this code to produce the following bar chart:

nyc_clean %>%
  filter(year == 2017,
         borough == "Manhatten") %>% 
  add_count(neighborhood) %>% 
  mutate(neighborhood = fct_reorder(neighborhood, n) %>% fct_rev()) %>% 
  filter(as.numeric(neighborhood) <= 13) %>% 
  distinct(borough, block, lot, .keep_all = TRUE) %>%
  pivot_longer(c("residential_units", "commercial_units"),
               names_to = "type",
               values_to = "count") %>%
  mutate(neighborhood = fct_reorder(neighborhood, as.numeric(as.factor(type)), 
                                    mean, na.rm = TRUE)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = percent) +
  coord_flip() +
  theme_light()
When I try to reorder the plot, the bars come out in the same jumbled order as in my output above:

df %>% 
  mutate(neighborhood = fct_reorder(neighborhood, as.numeric(as.factor(type)), 
                                    mean, na.rm = TRUE)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()


Any ideas about what I am missing here?

One approach is to arrange the data by the 'residential_unit' proportion and assign the factor levels in the order in which they appear:

library(dplyr)
library(ggplot2)


df %>%
 group_by(neighborhood, type) %>%
 summarise(prop = sum(count)) %>%
 mutate(prop = prop.table(prop)) %>%
 arrange(type != 'residential_unit', prop) %>%
 pull(neighborhood) %>% unique -> levels
  
df %>%
  mutate(neighborhood = factor(neighborhood, levels)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()

Hopefully this makes up in clarity for what it lacks in brevity:

df %>% 
  left_join(   # Add res_share for each neighborhood 
    df %>% 
      group_by(neighborhood) %>% 
      mutate(share = count / sum(count)) %>%
      ungroup() %>%
      filter(type == "residential_unit") %>%
      select(neighborhood, res_share = share)
    ) %>%
  mutate(neighborhood = fct_reorder(neighborhood, res_share)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()

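Thanks for the practical solution! Do you have an explanation for why fct_reorder doesn't work?

In fct_reorder, you don't specify anything related to 'residential_unit'. How would it know which factor level you want?

I see. The problem with applying your solution is that it arranges by the raw counts rather than by the grouped proportions. I'd like to get the proportion of residential versus commercial units per neighborhood, and then order the neighborhood factor by the residential proportion (in descending order).

@N1loon Can you check whether the updated answer is closer to what you want?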
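To illustrate the point raised in the comments, here is a minimal sketch on made-up data. The data frame df_toy, its values, and the helper names type_code_mean and res_share are assumptions for illustration only and are not taken from the Kaggle dataset. The sketch shows that the mean of the numeric type codes is identical for every neighborhood whenever each neighborhood has the same number of rows for each type (as happens after pivot_longer), so fct_reorder has nothing to sort on. Ordering by an explicitly computed residential share does change the levels.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: one row per neighborhood and unit type
df_toy <- data.frame(
  neighborhood = rep(c("A", "B", "C"), each = 2),
  type = rep(c("residential_unit", "commercial_unit"), times = 3),
  count = c(90, 10, 20, 80, 50, 50)
)

# What the original fct_reorder call summarises: the mean of the type codes.
# It ignores count, so every neighborhood gets the same value here.
type_code_mean <- df_toy %>%
  group_by(neighborhood) %>%
  summarise(m = mean(as.numeric(as.factor(type))))
print(type_code_mean)

# Compute the residential share per neighborhood from the counts
res_share <- df_toy %>%
  group_by(neighborhood) %>%
  summarise(res_share = sum(count[type == "residential_unit"]) / sum(count))

# Reorder the factor by that share (ascending by default)
df_ordered <- df_toy %>%
  left_join(res_share, by = "neighborhood") %>%
  mutate(neighborhood = fct_reorder(neighborhood, res_share))

print(levels(df_ordered$neighborhood))
```

With coord_flip(), the first level is drawn at the bottom, so ascending levels put the neighborhood with the largest residential share at the top of the chart. Appending fct_rev() would reverse that.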