R: How to melt/gather multiple variables (error bars) into one to map to geom_bar?
I'll start with my goal: to produce a plot for each of my variables (magnitude [mag], duration [dura], and distance [dist]), but with separate error bars for training and test.

(Image: nearly finished plot)

I have a data frame like the one below (screenshot + dput). It shows the responses (magnitude, distance, duration) of various strains of an organism during training and test, together with their standard errors (SEM). For example, the duration response during training is in the column train_avg_dura, and the duration response during test is in test_avg_dura. The standard errors for these are in the columns train_duraSEM and test_duraSEM.

(Image: df_group_sum data frame)

The dput output of the long-format data comes first, followed by the code I use to plot it (after removing the SEM rows):
structure(list(strain = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("N2", "acy-1(LOF)",
"acy-1(GOF)", "pde-4", "unc-43", "crh-1", "glr-1", "avr-14"), class = "factor"),
variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L,
9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 14L,
14L, 14L, 14L, 14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L, 15L,
15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L,
17L, 17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L,
18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 20L,
20L, 20L, 20L, 20L, 20L, 20L, 20L), .Label = c("test_avg_dist",
"test_avg_dura", "test_avg_mag", "test_avg_prob", "test_avg_spd",
"test_distSEM", "test_duraSEM", "test_magSEM", "test_probSEM",
"test_spdSEM", "train_avg_dist", "train_avg_dura", "train_avg_mag",
"train_avg_prob", "train_avg_spd", "train_distSEM", "train_duraSEM",
"train_magSEM", "train_probSEM", "train_spdSEM"), class = "factor"),
value = c(0.23102447163515, 0.198503787878788, 0.23892936802974,
0.247270588235294, 0.148316666666667, 0.195762711864407,
0.204740740740741, 0.238755154639175, 1.04759733036707, 1.15537878787879,
0.914684014869888, 1.12286274509804, 0.828916666666667, 0.785491525423729,
0.788407407407407, 1.02309278350515, 0.112163461525871, 0.113447031611172,
0.15930172539742, 0.105397926645665, 0.0370000063024116,
0.0823626968797451, 0.0441620688813484, 0.135786546158742,
0.457040018571118, 0.563727434411572, 0.624264612406578,
0.392625726149316, 0.219488346025285, 0.355836464305103,
0.158243463050796, 0.549997886634136, 0.218104671667048,
0.175578055416405, 0.256197987699313, 0.218534931269605,
0.181253278716812, 0.235434749265196, 0.236043513165036,
0.229165553562148, 0.00460504533342531, 0.0050568065734325,
0.00945562739572128, 0.00524044558789062, 0.00882224860763199,
0.00983820301449839, 0.0162322856355826, 0.00738407922404085,
0.0187491841242793, 0.0287113186085301, 0.0283764910080623,
0.0215386973519077, 0.0471018319675206, 0.0341593217329755,
0.0564553992545153, 0.0271939362203803, 0.00335619679815181,
0.00443251320170775, 0.00919066553588191, 0.00432150262248429,
0.00400887448034098, 0.00664866437888279, 0.00575860867691942,
0.00524462205156711, 0.00460504533342531, 0.0050568065734325,
0.00945562739572128, 0.00524044558789062, 0.00882224860763199,
0.00983820301449839, 0.0162322856355826, 0.00738407922404085,
0.00148090077905166, 0.00224725406956702, 0.00293788372166611,
0.00142518092482957, 0.00475313026432338, 0.00259537819051875,
0.00439432015310276, 0.00179190641262238, 0.337652222222222,
0.294218518518519, 0.338651851851852, 0.311313725490196,
0.254675, 0.2737, 0.390688888888889, 0.314817948717949, 1.3543,
1.429, 1.19151851851852, 1.37256862745098, 1.236, 1.06376666666667,
1.41396296296296, 1.31512820512821, 0.1930557426236, 0.19297076970836,
0.212916856705011, 0.127417008935649, 0.0841239843171108,
0.117210954090848, 0.115413610503398, 0.179227387006556,
0.525206741295172, 0.606796097537911, 0.592920766963248,
0.383218177729097, 0.294853306191478, 0.37983654970313, 0.244065736387288,
0.529995494304863, 0.245519078777542, 0.204069564920836,
0.279438682643543, 0.223741850875084, 0.203505986396722,
0.244494243449087, 0.263225928969608, 0.235094347033923,
0.00509151719343593, 0.00741331297357774, 0.0110354960774679,
0.0058641318136066, 0.0114389388703232, 0.0108143010933781,
0.0182904578688527, 0.00913426247712326, 0.0167858570502119,
0.0279705569908445, 0.030133138276768, 0.0219057666071679,
0.0479637760140276, 0.0332974908188985, 0.0605392786801207,
0.0323033076008837, 0.00498395111761598, 0.0081988397756359,
0.0107052683837969, 0.00442352355941589, 0.00723029142814287,
0.00764631328347674, 0.00980735575566329, 0.00789476278044047,
0.00509151719343593, 0.00741331297357774, 0.0110354960774679,
0.0058641318136066, 0.0114389388703232, 0.0108143010933781,
0.0182904578688527, 0.00913426247712326, 0.00139403793044242,
0.00220415921330836, 0.00299625483623813, 0.00144528089431754,
0.00441088530148196, 0.00248394605240026, 0.00319027562414684,
0.00174638373495128)), row.names = c(NA, -160L), .Names = c("strain",
"variable", "value"), class = "data.frame")
library(dplyr)
library(ggplot2)

(abs_bar_mag <-
  df_group_sum.long %>%
  filter(grepl("mag", variable)) %>%
  ggplot(aes(x = strain,
             y = value,
             fill = variable)) +
  scale_fill_manual(values = c("lightseagreen", "indianred1")) +
  geom_bar(stat = "identity", position = "dodge") +
  # geom_errorbar(aes(ymin = value - 1, ymax = value + 1), width = .1, position = position_dodge(width = 0.9)) +
  theme(panel.background = element_blank()) +
  theme(text = element_text(size = 20),
        axis.line = element_line(colour = "black")) +
  ggtitle("") +
  theme(plot.title = element_text(size = 30, hjust = 0.5, face = "bold"),
        axis.text = element_text(size = 70),
        strip.text = element_text(size = 40),
        axis.text.x = element_text(angle = 65, hjust = 1, size = 40),
        axis.title.y = element_text(size = 65)) +
  labs(colour = "",
       y = "Magnitude",
       x = "")
  # scale_colour_manual(values = rev()) is left out: rev() was called without
  # an argument, and no colour aesthetic is mapped in this plot
)
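I would greatly appreciate any suggestions or solutions.
Thanks,
Aram

The issue here is that the avg columns and the SEM (standard error) columns need to stay together. This requires reshaping two value columns simultaneously. So, we start with the data in wide format (df_group_sum.wide). To be consistent with the code provided by the OP, only magnitude is plotted.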
library(data.table)
library(ggplot2)
molten <- melt(
data.table(df_group_sum.wide), id.vars = "strain",
measure.vars = patterns("avg_mag$", "magSEM$"),
value.name = c("avg", "SEM"))[
, variable := forcats::lvls_revalue(variable, c("test_mag", "train_mag"))][]
molten
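With avg and SEM side by side in each row of the melted table, the bars and their error bars can be drawn from the same data. The following is a minimal sketch, not necessarily the exact plot the answer produced: because df_group_sum.wide is not reproduced in this post, it uses a small toy table (`toy_wide`, a hypothetical name with made-up values) that is assumed to follow the same column naming.

```r
library(data.table)
library(ggplot2)

# Toy wide-format data; the numbers are invented for illustration only
toy_wide <- data.table(
  strain        = c("N2", "pde-4"),
  test_avg_mag  = c(0.11, 0.10),
  train_avg_mag = c(0.19, 0.13),
  test_magSEM   = c(0.003, 0.004),
  train_magSEM  = c(0.005, 0.004)
)

# Melt the avg columns and the SEM columns at the same time
molten <- melt(
  toy_wide, id.vars = "strain",
  measure.vars = patterns("avg_mag$", "magSEM$"),
  value.name = c("avg", "SEM"))

# variable is a factor with levels "1", "2" (column order: test, then train)
molten[, variable := factor(variable, labels = c("test_mag", "train_mag"))]

# Dodged bars with matching dodged error bars (avg +/- SEM)
p <- ggplot(molten,
            aes(strain, avg, ymin = avg - SEM, ymax = avg + SEM, fill = variable)) +
  geom_col(position = "dodge") +
  geom_errorbar(width = 0.1, position = position_dodge(width = 0.9))
p
```

The error bars use position_dodge(width = 0.9) so that they line up with the dodged bars, whose default width is 0.9.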
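The OP also provided a data.frame in long format, df_group_sum.long, which does contain more data than df_group_sum.wide. These data should be plotted as well.

Looking at the variable names with unique(df_group_sum.long$variable), the data.frame appears to contain aggregated data (avg and SEM) for five different variables (dist, dura, mag, prob, spd) from two datasets (train and test). Again, avg and SEM need to be kept in the same row so that a bar chart with error bars can be drawn.

Unfortunately, the naming scheme is inconsistent. It would be better if the variables containing the standard error were named analogously to train_avg_mag, e.g., train_SEM_mag instead of train_magSEM.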
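To illustrate what a consistent scheme would buy, here is a small sketch that renames the SEM variables before splitting. This is an alternative shown for illustration only, not the approach taken in this answer; the regular expression and the names `old_names`, `new_names`, and `parts` are assumptions made for the example.

```r
# Example names following the OP's scheme
old_names <- c("train_avg_mag", "train_magSEM", "test_distSEM")

# Move the SEM tag to the middle: <dataset>_<var>SEM -> <dataset>_SEM_<var>
# Names that do not end in "SEM" are left unchanged
new_names <- sub("^(train|test)_([a-z]+)SEM$", "\\1_SEM_\\2", old_names)
new_names

# With a uniform scheme, a plain split on "_" always yields three parts:
# dataset, measure, variable
parts <- strsplit(new_names, "_", fixed = TRUE)
parts
```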
So, the first step is to split the variable names in order to obtain the distinct groups separately:
library(data.table)
DT <- data.table(df_group_sum.long)
DT[, c("dataset", "measure", "variable") :=
DT[, tstrsplit(variable, "_|SEM$")][is.na(V3), `:=`(V3 = V2, V2 = "SEM")]]
DT
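The split relies on the regular expression "_|SEM$", which breaks a name at every underscore and also strips a trailing "SEM". The sketch below walks through that step on two representative names so the NA repair is easier to follow; `nm` and `parts` are hypothetical names used only here.

```r
library(data.table)

# Two representative names: one average, one standard error
nm <- c("test_avg_dist", "test_distSEM")

# Split on "_" or on a trailing "SEM"; the SEM name yields only two pieces,
# so tstrsplit() pads its third piece with NA
parts <- as.data.table(tstrsplit(nm, "_|SEM$"))
parts

# Repair the SEM rows: move the variable name into V3 and label V2 as "SEM"
parts[is.na(V3), `:=`(V3 = V2, V2 = "SEM")]
parts
```

After the repair, V1 holds the dataset, V2 the measure (avg or SEM), and V3 the variable for every row.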
unique(DT[, variable])
# "dist" "dura" "mag" "prob" "spd"
unique(DT[, dataset])
# "test" "train"
unique(DT[, measure])
# "avg" "SEM"
Now, use an update join to replace the abbreviated variable names with full names:
abbr2full <- data.table(
variable = c("dist", "dura", "mag"),
full = c("Distance", "Duration", "Magnitude")
)
DT[abbr2full, on = "variable", variable := full][]
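Note that the lookup table above lists only three of the five variables. In an update join, only rows with a match in the lookup table are modified, so prob and spd keep their abbreviations. A minimal sketch of that behaviour on toy data (`toy` and `lookup` are hypothetical names):

```r
library(data.table)

# Toy table with one abbreviation that has no entry in the lookup table
toy <- data.table(variable = c("dist", "mag", "prob"))

lookup <- data.table(
  variable = c("dist", "mag"),
  full     = c("Distance", "Magnitude")
)

# Update join: overwrite variable by reference where a match exists
toy[lookup, on = "variable", variable := full]
toy
```

To relabel the remaining variables as well, additional rows would have to be added to the lookup table.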
library(ggplot2)
ggplot(dcast(DT, ... ~ measure),
aes(strain, avg, ymin = avg - SEM, ymax = avg + SEM, fill = dataset)) +
geom_col(position = "dodge") +
geom_errorbar(width=.1, position = position_dodge(width=0.9)) +
scale_fill_manual(values=c("lightseagreen", "indianred1")) +
theme_bw() +
labs(fill = "", y = "Average", x = "") +
facet_wrap(~ variable, scales = "free_y") +
theme(axis.text.x = element_text(angle = 65, hjust = 1))