R ggplot2: How to reorder a stacked bar chart by the proportion of the fill variable
I am working with the "NYC Property Sales" dataset available on Kaggle. After cleaning the dataset, I produced the following bar chart with this code:
nyc_clean %>%
  filter(year == 2017,
         borough == "Manhatten") %>%
  add_count(neighborhood) %>%
  mutate(neighborhood = fct_reorder(neighborhood, n) %>% fct_rev()) %>%
  filter(as.numeric(neighborhood) <= 13) %>%
  distinct(borough, block, lot, .keep_all = TRUE) %>%
  pivot_longer(c("residential_units", "commercial_units"),
               names_to = "type",
               values_to = "count") %>%
  mutate(neighborhood = fct_reorder(neighborhood, as.numeric(as.factor(type)),
                                    mean, na.rm = TRUE)) %>%
  ggplot(aes(neighborhood, count, fill = type)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()
When I try to reorder the plot, the bars come out in the same jumbled order as in my output above:
df %>%
  mutate(neighborhood = fct_reorder(neighborhood, as.numeric(as.factor(type)),
                                    mean, na.rm = TRUE)) %>%
  ggplot(aes(neighborhood, count, fill = type)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()
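Any ideas about what I am missing here?

One way is to arrange the data by 'residential_unit' and the count, and then assign the factor levels in the order in which they appear.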
library(dplyr)
library(ggplot2)
df %>%
  group_by(neighborhood, type) %>%
  summarise(prop = sum(count)) %>%
  # summarise() drops the last grouping level, so this is the share per neighborhood
  mutate(prop = prop.table(prop)) %>%
  # residential rows first, sorted by ascending share
  arrange(type != 'residential_unit', prop) %>%
  pull(neighborhood) %>% unique -> levels

df %>%
  mutate(neighborhood = factor(neighborhood, levels)) %>%
  ggplot(aes(neighborhood, count, fill = type)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()
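To see what level order this approach produces, here is a minimal sketch on made-up data. The data frame, its three neighborhoods and the name `lvls` are assumptions for illustration only. The label "residential_unit" follows the answer above; the `pivot_longer()` call in the question would produce "residential_units" instead, so adjust the string to match your data.

```r
library(dplyr)

# Hypothetical toy data: one row per neighborhood and unit type.
df <- data.frame(
  neighborhood = rep(c("A", "B", "C"), each = 2),
  type = rep(c("residential_unit", "commercial_unit"), times = 3),
  count = c(10, 10, 30, 10, 5, 15),
  stringsAsFactors = FALSE
)

lvls <- df %>%
  group_by(neighborhood, type) %>%
  summarise(prop = sum(count), .groups = "drop_last") %>%
  mutate(prop = prop.table(prop)) %>%             # share within each neighborhood
  ungroup() %>%
  arrange(type != "residential_unit", prop) %>%   # residential rows first, ascending share
  pull(neighborhood) %>%
  unique()

lvls
# "C" "A" "B": residential shares of 0.25, 0.50 and 0.75
```

The levels run from the lowest to the highest residential share. After `coord_flip()` the first level is drawn at the bottom, so the neighborhood with the largest residential share ends up at the top of the chart.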
Hopefully this makes up in clarity for what it lacks in brevity:
df %>%
  left_join( # Add res_share for each neighborhood
    df %>%
      group_by(neighborhood) %>%
      mutate(share = count / sum(count)) %>%
      ungroup() %>%
      filter(type == "residential_unit") %>%
      select(neighborhood, res_share = share)
  ) %>%
  mutate(neighborhood = fct_reorder(neighborhood, res_share)) %>%
  ggplot(aes(neighborhood, count, fill = type)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  theme_light()
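The join above appears to assume one row per neighborhood and type. If the data has several rows per neighborhood and type, as the property-level data in the question would after `pivot_longer()`, one option is to aggregate first so that each neighborhood gets a single share. The following is a sketch under that assumption; the toy data and the names `res_shares` and `df_ordered` are made up for illustration.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: two properties per neighborhood, two rows per property.
df <- data.frame(
  neighborhood = c("A", "A", "A", "A", "B", "B", "B", "B"),
  type = rep(c("residential_unit", "commercial_unit"), times = 4),
  count = c(9, 1, 9, 1, 8, 2, 2, 8),
  stringsAsFactors = FALSE
)

# One residential share per neighborhood, computed from the summed counts.
res_shares <- df %>%
  group_by(neighborhood) %>%
  summarise(res_share = sum(count[type == "residential_unit"]) / sum(count))

# Joining a one-row-per-neighborhood table keeps the row count of df unchanged.
df_ordered <- df %>%
  left_join(res_shares, by = "neighborhood") %>%
  mutate(neighborhood = fct_reorder(neighborhood, res_share))

levels(df_ordered$neighborhood)
# "B" "A": residential shares of 0.5 and 0.9
```

`df_ordered` can then be passed to the same `ggplot()` call as above. Naming the join key with `by = "neighborhood"` also avoids the message that `left_join()` prints when it has to guess the key.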
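Thanks for the working solution! Do you have an explanation for why fct_reorder does not work?

In fct_reorder, you did not specify anything related to "residential unit". How would it know which factor level you want?

I see. The problem with applying your solution is that you arrange by raw counts rather than by grouped proportions. I want the proportion of residential or commercial units per neighborhood, and then the neighborhood factor ordered by the residential proportion (in descending order).

@N1loon Can you check whether the updated answer is closer to what you want?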
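As a possible illustration of the point made in the comments, the sketch below reproduces the sort key used in the question on made-up data. The toy data frame and the names `type_code`, `keys` and `res_share` are assumptions for illustration only. After `pivot_longer()`, every neighborhood has the same number of rows for each type, so the mean of the integer code of `type` is identical everywhere and gives `fct_reorder()` nothing to sort on.

```r
# Hypothetical toy data in the long format: one row per neighborhood and type.
df <- data.frame(
  neighborhood = rep(c("A", "B", "C"), each = 2),
  type = rep(c("residential_unit", "commercial_unit"), times = 3),
  count = c(10, 10, 30, 10, 5, 15),
  stringsAsFactors = FALSE
)

# The key from the question: the integer code of `type` (1 or 2 on every row).
# It never looks at `count`.
type_code <- as.numeric(as.factor(df$type))

# Mean of that key per neighborhood: 1.5 for A, B and C alike.
keys <- tapply(type_code, df$neighborhood, mean)
keys

# A key that does depend on `count`: the residential share per neighborhood.
res_share <- tapply(df$count * (df$type == "residential_unit"), df$neighborhood, sum) /
  tapply(df$count, df$neighborhood, sum)
res_share
# A = 0.50, B = 0.75, C = 0.25
```

Because `keys` is tied across all neighborhoods while `res_share` differs, only the second is usable as the ordering variable for `fct_reorder()`.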