Reformatting data to plot a calibration curve with geom_errorbar() in ggplot2
Tags: r, ggplot2, reshape, reshape2, tidyr
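I have a data frame of summary statistics for three different air quality instruments. The instruments are named aa34, aa35, and 48c, and each of them measures carbon monoxide in ppm. The data are in wide format: each vector holds the mean, standard deviation, standard error, or 95% confidence interval for one of the three instruments.

I would like to plot these summary statistics with ggplot() and geom_errorbar(), but I am having trouble converting the data to long format and supplying an ID variable to ggplot() for the colour mapping. I am following a tutorial and want to replicate its plot (replacing the guinea pig tooth data with toxic fumes, of course). I have been trying to add a second y variable and have the two colour-coded by an ID variable.

In my desired output, the supp vector from the tutorial example is replaced by two of the three id vectors, namely the ones containing aa34 and aa35. My equivalent of the dose vector is ref.co.mean, which is our x variable. The equivalent of the len vector would be aa34.co.mean and aa35.co.mean stacked in long format.

Data: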
## Here's what my data frame looks like.
## I know it's ugly, but if you copy and paste it into your console it should work!
df_cal <- structure(list(ref.co.mean = c(1.23638284617457, 1.46466241535712,
2.16020882959014, 2.55054760052641, 3.13141175081258, 3.86968879644661,
6.5914211520901), ref.co.sd = c(0.0196205483139859, 0.0229279198586359,
0.0172965018302434, 0.0164690175286326, 0.00583116470707786,
0.00975072766851073, 0.0388826652553337), ref.co.se = c(0.00346845569085442,
0.00193776290206006, 0.00166435666462165, 0.00127061228762621,
0.000583116470707786, 0.00229826855196908, 0.00614788918523735
), ref.co.ci = c(0.00707396201972773, 0.00383130164529687,
0.00329939297398704,
0.0025085329371034, 0.00115702958592763, 0.00484892279298878,
0.0124352796323718), id = c("48c", "48c", "48c", "48c", "48c",
"48c", "48c"), aa34.co.mean = c(0, 0.248857142857143, 0.823777777777778,
1.256, 1.886, 2.446, 4.54), aa34.co.sd = c(0, 0.0716567783084826,
0.0660714166547489, 0.0777970497665622, 0.0518459255872629, 0,
0.0690217357069497), aa34.co.se = c(0, 0.00605610310675521,
0.0063577250318932, 0.00600217269807407, 0.00518459255872628, 0,
0.0109132946446067), aa34.co.ci = c(0, 0.0119739921598931,
0.0126034483753748, 0.0118499152368743, 0.0102873564420935, 0,
0.0220742219853317), id = c("aa34", "aa34", "aa34", "aa34", "aa34", "aa34",
"aa34"), aa35.co.mean = c(0.2915625, 0.801035714285714, 1.39911111111111,
1.80436904761905, 2.45672, 3.02355555555556, 5.134975), aa35.co.sd =
c(0.0691998633940125, 0.0474980316455754, 0.0846624379229758,
0.0822798331713915, 0.0595577165445419,
0.0178768075145867, 0.0243007072942329), aa35.co.se = c(0.0122329231657723,
0.00401431635364878, 0.00814664688751334, 0.00634802694633388,
0.00595577165445419, 0.00421360393984362, 0.00384227919014218), aa35.co.ci =
c(0.0249492112853266, 0.00793701687349159, 0.0161497773125,
0.0125327252345785, 0.0118175430765459, 0.00888992723110191,
0.00777174323014678), id = c("aa35", "aa35", "aa35", "aa35",
"aa35", "aa35", "aa35")), .Names = c("ref.co.mean", "ref.co.sd",
"ref.co.se", "ref.co.ci", "id", "aa34.co.mean", "aa34.co.sd",
"aa34.co.se", "aa34.co.ci", "id", "aa35.co.mean", "aa35.co.sd",
"aa35.co.se", "aa35.co.ci", "id"), row.names = c(1L, 33L, 173L,
281L, 449L, 549L, 567L), class = "data.frame")
## This code only gets half of the job done...
## 95% Confidence Intervals for Error Bars:
library(ggplot2)
p <- ggplot(df_cal, aes(x = ref.co.mean, y = aa34.co.mean)) +
  theme_bw() +
  geom_errorbar(aes(ymin = aa34.co.mean - aa34.co.ci,
                    ymax = aa34.co.mean + aa34.co.ci), width = .05) +
  xlab("Reference CO (ppm)") +
  ylab("AA34 CO (ppm)") +
  geom_smooth(method = 'lm', formula = y ~ x, se = FALSE) +
  geom_point(size = 2, shape = 21, fill = "White") +
  ## 1:1 reference line (the stray positional `color` argument is removed)
  geom_abline(intercept = 0, slope = 1, linetype = 2, color = "firebrick") +
  ggtitle("CO Calibration @ 0% RH") +
  theme(plot.title = element_text(hjust = 0.5)) +
  annotate("rect", xmin = 4.80, xmax = 5.70, ymin = 0.70, ymax = 1.70,
           fill = "white", colour = "red") +
  annotate("text", x = 5.25, y = 1.50, label = "R^2 == 0.994", parse = TRUE) +
  annotate("text", x = 5.25, y = 1.20, label = "alpha == -0.9490", parse = TRUE) +
  annotate("text", x = 5.25, y = 0.90, label = "beta == 0.849", parse = TRUE)
p
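The problem with switching to long format here is that the x axis has one variable of length 7, while the y axis has two variables with a combined length of 14. This solution therefore binds rows so that the reference (x-axis) data are included twice, then uses colour and group in the ggplot aesthetics.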
library(ggplot2)
# First 'aa' group (aa34) plus the reference data (48c): columns 1-4 and 6-10
df_aa34_2 <- df_cal[, c(1:4, 6:10)]
# Second 'aa' group (aa35) plus the reference data (48c): columns 1-4 and 11-15
df_aa35_2 <- df_cal[, c(1:4, 11:15)]
# rbind() needs identical column names, so the aa34 columns take the aa35.* names;
# from here on the id column, not the column name, identifies the instrument
names(df_aa34_2) <- names(df_aa35_2)
DF <- rbind(df_aa34_2, df_aa35_2)  # bind rows: 14 rows, reference data included twice
p <- ggplot(DF, aes(x = ref.co.mean, y = aa35.co.mean, colour = id, group = id)) +
  theme_bw() +  # complete theme goes first so the theme() tweak below is not overwritten
  geom_errorbar(aes(ymin = aa35.co.mean - aa35.co.ci,
                    ymax = aa35.co.mean + aa35.co.ci), width = .5) +
  xlab("Reference CO (ppm)") +
  ylab("AA34 & 35 CO (ppm)") +
  geom_smooth(method = 'lm', formula = y ~ x, se = FALSE) +
  geom_point(size = 2, shape = 21, fill = "White") +
  # 1:1 reference line (the stray positional `color` argument is removed)
  geom_abline(intercept = 0, slope = 1, linetype = 2, color = "firebrick") +
  ggtitle("CO Calibration @ 0% RH") +
  theme(plot.title = element_text(hjust = 0.5)) +
  annotate("rect", xmin = 4.80, xmax = 5.70, ymin = 0.70, ymax = 1.70,
           fill = "white", colour = "red") +
  annotate("text", x = 5.25, y = 1.50, label = "R^2 == 0.994", parse = TRUE) +
  annotate("text", x = 5.25, y = 1.20, label = "alpha == -0.9490", parse = TRUE) +
  annotate("text", x = 5.25, y = 0.90, label = "beta == 0.849", parse = TRUE)
p
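For comparison, the same long layout can also be built without selecting columns by position. The following is only a minimal sketch using base R's reshape(). It assumes a cut-down copy of df_cal with values rounded to three decimals and only the mean and ci columns kept. The names df_wide, df_long, co.mean, and co.ci are illustrative and do not appear in the original code.

```r
# Cut-down copy of df_cal (values rounded from the question's data;
# the sd/se columns and the duplicated `id` columns are left out).
df_wide <- data.frame(
  ref.co.mean  = c(1.236, 1.465, 2.160, 2.551, 3.131, 3.870, 6.591),
  aa34.co.mean = c(0.000, 0.249, 0.824, 1.256, 1.886, 2.446, 4.540),
  aa34.co.ci   = c(0.000, 0.012, 0.013, 0.012, 0.010, 0.000, 0.022),
  aa35.co.mean = c(0.292, 0.801, 1.399, 1.804, 2.457, 3.024, 5.135),
  aa35.co.ci   = c(0.025, 0.008, 0.016, 0.013, 0.012, 0.009, 0.008)
)

# Stack the two instruments: each element of `varying` lists the wide
# columns that collapse into the long column named in `v.names`.
df_long <- reshape(
  df_wide,
  direction = "long",
  varying   = list(c("aa34.co.mean", "aa35.co.mean"),
                   c("aa34.co.ci",   "aa35.co.ci")),
  v.names   = c("co.mean", "co.ci"),
  timevar   = "id",                 # new column holding the instrument name
  times     = c("aa34", "aa35"),
  idvar     = "ref.co.mean"         # unique per calibration step
)
rownames(df_long) <- NULL
```

The resulting df_long repeats each reference value once per instrument, so it can be passed to ggplot() with aes(x = ref.co.mean, y = co.mean, colour = id, group = id), with ymin and ymax built from co.mean and co.ci.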
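Comments:

Could you edit your question to include the output of dput(df_cal) so that it is easy to reproduce? Also, where did you calculate the summary statistics? Was it Excel? The summarySE function from the Rmisc package would make this easier, as shown in your example link.

@J.Con Thanks for the dput() tip. It isn't pretty, but it seems to work if you copy and paste it into the console. I am still using R to calculate my summary statistics. I use dplyr to manually filter the "plateaus" out of the 7-step time series. I then use a few base functions to generate vectors of the standard deviation, standard error, and 95% confidence interval for each of the three instruments at each of the 7 steps. Finally, I do a row_bind() across the 7 calibration steps followed by a unique(), which leaves a single observation per calibration step. Here is an example of what I used to generate these summary statistics.

This looks great, thanks! I like that you used rbind() rather than trying to use reshape2 or tidyr. I had paired the melt() function with left_join() to get the data into a similar format, but your version takes fewer lines of code. I am now adding some extra annotate() lines to include the second lm() output.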
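Once both instruments share a plot, the hard-coded annotation values need one set per instrument. A rough sketch of deriving them with lm() follows. It reuses values rounded from df_cal, and the helper fit_labels() is a hypothetical name introduced here for illustration, not something from the thread.

```r
# Values rounded to three decimals from the question's df_cal.
ref  <- c(1.236, 1.465, 2.160, 2.551, 3.131, 3.870, 6.591)
aa34 <- c(0.000, 0.249, 0.824, 1.256, 1.886, 2.446, 4.540)
aa35 <- c(0.292, 0.801, 1.399, 1.804, 2.457, 3.024, 5.135)

# Hypothetical helper: fit y ~ x and format plotmath strings that
# annotate("text", ..., parse = TRUE) can render.
fit_labels <- function(x, y) {
  fit <- lm(y ~ x)
  est <- unname(coef(fit))  # est[1] = intercept, est[2] = slope
  c(r2    = sprintf("R^2 == %.3f",   summary(fit)$r.squared),
    alpha = sprintf("alpha == %.3f", est[1]),
    beta  = sprintf("beta == %.3f",  est[2]))
}

labels_aa34 <- fit_labels(ref, aa34)
labels_aa35 <- fit_labels(ref, aa35)
```

Each element can then be supplied as the label of its own annotate("text", ..., parse = TRUE) call, for example with a colour matching the instrument's line.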