R ggplot2: How to reorder a stacked bar chart by the proportion of the fill variable


I'm using the "NYC Property Sales" dataset available on Kaggle.

After cleaning the dataset, I produced the following bar chart with this code:

nyc_clean %>%
  filter(year == 2017,
         borough == "Manhatten") %>% 
  add_count(neighborhood) %>% 
  mutate(neighborhood = fct_reorder(neighborhood, n) %>% fct_rev()) %>% 
  filter(as.numeric(neighborhood) <= 13) %>% 
  distinct(borough, block, lot, .keep_all = TRUE) %>%
  pivot_longer(c("residential_units", "commercial_units"),
               names_to = "type",
               values_to = "count") %>%
  mutate(neighborhood = fct_reorder(neighborhood, as.numeric(as.factor(type)), 
                                    mean, na.rm = TRUE)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()

When I try to reorder the plot, the order of the bars is just as jumbled as in my output above:

df %>% 
  mutate(neighborhood = fct_reorder(neighborhood, as.numeric(as.factor(type)), 
                                    mean, na.rm = TRUE)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()

Any ideas about what I'm missing here?

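One approach is to arrange the data by the 'residential_unit' rows and their count (converted to a per-neighborhood proportion in the code below), then assign the factor levels in the order in which they appear:
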
library(dplyr)
library(ggplot2)


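# Share of each type within a neighborhood; the residential rows are
# sorted first, so unique() returns neighborhoods by residential share.
levels <- df %>%
  group_by(neighborhood, type) %>%
  summarise(prop = sum(count), .groups = "drop_last") %>%
  mutate(prop = prop.table(prop)) %>%
  arrange(type != 'residential_unit', prop) %>%
  pull(neighborhood) %>%
  unique()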
  
df %>%
  mutate(neighborhood = factor(neighborhood, levels)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()
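
The `arrange(type != 'residential_unit', prop)` step is easy to misread, so here is a minimal sketch of what it does. The `toy` data frame and its values are made up for illustration; only the column names follow the question. Sorting on the logical expression puts the rows where it is `FALSE` (the residential rows) first, so `unique()` ends up returning the neighborhoods in order of increasing residential share.

```r
library(dplyr)

# Hypothetical toy data in the same long shape as `df`
toy <- data.frame(
  neighborhood = rep(c("A", "B", "C"), each = 2),
  type         = rep(c("residential_unit", "commercial_unit"), times = 3),
  count        = c(90, 10, 40, 60, 70, 30)
)

ordered <- toy %>%
  group_by(neighborhood) %>%
  mutate(prop = count / sum(count)) %>%   # share within each neighborhood
  ungroup() %>%
  arrange(type != "residential_unit", prop)

# Residential rows come first (B, C, A), followed by the commercial rows
ordered$neighborhood

# First occurrences give the level order by residential share
lvls <- unique(ordered$neighborhood)
lvls
```

Passing `lvls` to `factor(neighborhood, levels = lvls)` then fixes the bar order, as in the answer above.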

Hopefully this makes up in clarity for what it lacks in brevity:
df %>% 
  left_join(   # Add res_share for each neighborhood 
    df %>% 
      group_by(neighborhood) %>% 
      mutate(share = count / sum(count)) %>%
      ungroup() %>%
      filter(type == "residential_unit") %>%
      select(neighborhood, res_share = share),
    by = "neighborhood"   # explicit key avoids the join message
    ) %>%
  mutate(neighborhood = fct_reorder(neighborhood, res_share)) %>% 
  ggplot(aes(neighborhood, count, fill = type)) + 
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()
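
If `df` may hold several rows per neighborhood and type, the join above could duplicate rows. A possible variation, not part of the original answer, computes the residential share with a grouped `mutate()` and skips the join. The `toy` data and the name `toy_ordered` are hypothetical; `.desc = TRUE` is the `fct_reorder()` argument for descending order.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data in the same long shape as `df`
toy <- data.frame(
  neighborhood = factor(rep(c("A", "B", "C"), each = 2)),
  type         = rep(c("residential_unit", "commercial_unit"), times = 3),
  count        = c(90, 10, 40, 60, 70, 30)
)

toy_ordered <- toy %>%
  group_by(neighborhood) %>%
  # Residential share of the neighborhood total, repeated on every row
  mutate(res_share = sum(count[type == "residential_unit"]) / sum(count)) %>%
  ungroup() %>%
  mutate(neighborhood = fct_reorder(neighborhood, res_share, .desc = TRUE))

levels(toy_ordered$neighborhood)
```

The result can be piped into the same `ggplot()` call as above. Note that `coord_flip()` draws the first factor level at the bottom, so drop `.desc = TRUE` if the largest share should appear at the top.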

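Thanks for the practical solution! Do you have an explanation for why fct_reorder doesn't work?

In fct_reorder, you did not specify anything related to 'residential_unit'. How would it know which factor levels you want?

I see. The problem with applying your solution is that you arrange by the raw count rather than by the grouped proportion. I'd like to compute the proportion of residential and commercial units per neighborhood, and then order the neighborhood factor by the residential proportion (in descending order).

@N1loon Can you check whether the updated answer is closer to what you want?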
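
A likely reason the original `fct_reorder()` call has no visible effect can be sketched as follows. This assumes each neighborhood has the same number of rows for both types, which is what `pivot_longer()` produces. The `toy` data is hypothetical. The sort key in the question is the mean of the integer type code, which never looks at `count`, so every neighborhood gets the same value and the levels keep their existing order.

```r
library(dplyr)

# Hypothetical toy data in the same long shape as `df`
toy <- data.frame(
  neighborhood = rep(c("A", "B", "C"), each = 2),
  type         = rep(c("residential_unit", "commercial_unit"), times = 3),
  count        = c(90, 10, 40, 60, 70, 30)
)

toy_keys <- toy %>%
  mutate(type_code = as.numeric(as.factor(type))) %>%   # commercial = 1, residential = 2
  group_by(neighborhood) %>%
  summarise(
    # Key used in the question: identical for every neighborhood
    type_key  = mean(type_code),
    # Key that depends on the data: residential share of the total count
    share_key = sum(count[type == "residential_unit"]) / sum(count)
  )

toy_keys
```

Here `type_key` is 1.5 in every row, while `share_key` differs by neighborhood and can therefore drive the ordering.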