R: order bars by the difference between variables
Tags: r, ggplot2, tidyverse
My goal is to plot a bar chart that shows the variables "HH_FIN_EX" and "ACT_IND_CON_EXP", with the bars ordered in ascending order by the variable diff. diff itself should not appear in the chart.
library(eurostat)
library(tidyverse)
#getting the data
data1 <- get_eurostat("nama_10_gdp", time_format = "num")
#filtering
data_1_4 <- data1 %>%
  filter(time == "2016",
         na_item %in% c("B1GQ", "P31_S14_S15", "P41"),
         geo %in% c("BE","BG","CZ","DK","DE","EE","IE","EL","ES","FR","HR","IT","CY","LV","LT","LU","HU","MT","NL","AT","PL","PT","RO","SI","SK","FI","SE","UK"),
         unit == "CP_MEUR") %>%
  select(-unit, -time)
#transformations and calculations
data_1_4 <- data_1_4 %>%
  spread(na_item, values) %>%
  na.omit() %>%
  mutate(HH_FIN_EX = P31_S14_S15/B1GQ, ACT_IND_CON_EXP = P41/B1GQ, diff = ACT_IND_CON_EXP - HH_FIN_EX) %>%
  gather(na_item, values, 2:7) %>%
  filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff"))
#plotting
ggplot(data = data_1_4, aes(x = reorder(geo, values), y = values, fill = na_item)) +
  geom_bar(stat = "identity", position = position_dodge(), colour = "black") +
  labs(title = "", x = "Countries", y = "As percentage of GDP")
I would really appreciate your suggestions, because aes(x=reorder(geo, values[values=="diff"])) results in an error.
Is this what you want?
data_1_4 %>%
  mutate(Val = fct_reorder(geo, values, .desc = TRUE)) %>%
  filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP")) %>%
  ggplot(aes(x = Val, y = values, fill = na_item)) +
  geom_bar(stat = "identity", position = position_dodge(), colour = "black") +
  labs(title = "", x = "Countries", y = "As percentage of GDP")
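You can compute the desired order explicitly (stored in country_order below) and force the factor geo to have its levels in that order. Then run ggplot after filtering out the diff variable. So replace the call to ggplot with the following: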
country_order = (data_1_4 %>% filter(na_item == 'diff') %>% arrange(values))$geo
data_1_4$geo = factor(data_1_4$geo, country_order)
ggplot(data = filter(data_1_4, na_item != 'diff'), aes(x = geo, y = values, fill = na_item)) +
  geom_bar(stat = "identity", position = position_dodge(), colour = "black") +
  labs(title = "", x = "Countries", y = "As percentage of GDP")
Doing this, I get the following plot:
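The explicit-order approach can be tried without downloading anything. The following is a minimal sketch on made-up numbers (the data frame toy and its values are assumptions for illustration, not Eurostat figures); it sorts the countries by ascending diff, fixes the factor levels, and only then drops the diff rows.

```r
library(dplyr)
library(ggplot2)

# Toy long-format data; the numbers are invented for illustration only
toy <- data.frame(
  geo     = rep(c("AA", "BB", "CC"), each = 3),
  na_item = rep(c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff"), times = 3),
  values  = c(0.50, 0.60, 0.10,
              0.55, 0.58, 0.03,
              0.40, 0.47, 0.07),
  stringsAsFactors = FALSE
)

# Countries sorted by ascending diff
country_order <- toy %>%
  filter(na_item == "diff") %>%
  arrange(values) %>%
  pull(geo)

# Fix the factor levels first, then drop the diff rows before plotting
toy_plot <- toy %>%
  mutate(geo = factor(geo, levels = country_order)) %>%
  filter(na_item != "diff")

p <- ggplot(toy_plot, aes(x = geo, y = values, fill = na_item)) +
  geom_col(position = position_dodge(), colour = "black") +
  labs(x = "Countries", y = "As percentage of GDP")

levels(toy_plot$geo)
# "BB" "CC" "AA"
```

Because the order lives in the factor levels, it survives the filter step, so the diff rows are no longer needed by the time ggplot is called.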
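First, you should not include diff (the result column) when using gather; that complicates things. Change the line gather(na_item, values, 2:7) to gather(na_item, values, 2:6).
You can use this code to compute the difference and order the rows in descending order (using dplyr::arrange):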
# data_1_4 here is the filtered data, before the spread/mutate/gather step from the question
plotData <- data_1_4 %>%
  spread(na_item, values) %>%
  na.omit() %>%
  mutate(HH_FIN_EX = P31_S14_S15 / B1GQ,
         ACT_IND_CON_EXP = P41 / B1GQ,
         diff = ACT_IND_CON_EXP - HH_FIN_EX) %>%
  gather(na_item, values, 2:6) %>%
  filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP")) %>%
  arrange(desc(diff))
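Justai, what is data1? Also, diff is a function in base, so it is better not to name a variable that way.
Added the missing line of code. Thanks for the suggestion not to use diff.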
ggplot(plotData, aes(geo, values, fill = na_item)) +
  geom_bar(stat = "identity", position = "dodge", color = "black") +
  labs(x = "Countries",
       y = "As percentage of GDP") +
  # unique() keeps one entry per country, in the arranged order
  scale_x_discrete(limits = unique(as.character(plotData$geo)))
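As a variation on the same idea, the difference can stay as its own column and be passed straight to reorder() inside aes(), which gives ascending order. The sketch below uses made-up numbers; toy_wide, toy_long and the column name gap are hypothetical names chosen for illustration (gap also avoids shadowing base::diff). It uses pivot_longer(), which supersedes gather() in current tidyr.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy wide-format data; the numbers are invented for illustration only
toy_wide <- data.frame(
  geo             = c("AA", "BB", "CC"),
  HH_FIN_EX       = c(0.50, 0.55, 0.40),
  ACT_IND_CON_EXP = c(0.60, 0.58, 0.47),
  stringsAsFactors = FALSE
)

# Keep the difference as a separate column; reshape only the two plotted variables
toy_long <- toy_wide %>%
  mutate(gap = ACT_IND_CON_EXP - HH_FIN_EX) %>%
  pivot_longer(c(HH_FIN_EX, ACT_IND_CON_EXP),
               names_to = "na_item", values_to = "values")

# reorder() sorts the geo levels by the mean gap per country, ascending
p2 <- ggplot(toy_long, aes(x = reorder(geo, gap), y = values, fill = na_item)) +
  geom_col(position = "dodge", colour = "black") +
  labs(x = "Countries", y = "As percentage of GDP")

levels(reorder(toy_long$geo, toy_long$gap))
# "BB" "CC" "AA"
```

Since gap never enters the na_item column, there is nothing to filter out before plotting.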