R 编辑:如何编写用于后期Tukey测试和多重比较的循环

R 编辑:如何编写用于后期Tukey测试和多重比较的循环,r,ggplot2,dplyr,posthoc,lsmeans,R,Ggplot2,Dplyr,Posthoc,Lsmeans,这是我的大数据矩阵示例&每列都用多个信息命名,并用下划线分隔 structure(list(Gene = c("AGI4120.1_UBQ", "AGI570.1_Acin"), WT_Tissue_0T_1 = c(0.886461437, 1.093164915), WT_Tissue_0T_2 = c(1.075140682, 1.229862834), WT_Tissue_0T_3 = c(0.632903012, 1.094003128), WT_Tissue_1T_1 = c(0.8

这是我的大数据矩阵示例&每列都用多个信息命名,并用下划线分隔

structure(list(Gene = c("AGI4120.1_UBQ", "AGI570.1_Acin"), WT_Tissue_0T_1 = c(0.886461437, 1.093164915), WT_Tissue_0T_2 = c(1.075140682, 1.229862834), WT_Tissue_0T_3 = c(0.632903012, 1.094003128), WT_Tissue_1T_1 = c(0.883151274, 1.26322126), WT_Tissue_1T_2 = c(1.005627276, 0.962729188), WT_Tissue_1T_3 = c(0.87123469, 0.968078993), WT_Tissue_3T_1 = c(0.723601456, 0.633890322), WT_Tissue_3T_2 = c(0.392585237, 0.534819363), WT_Tissue_3T_3 = c(0.640185369, 1.021934772), WT_Tissue_5T_1 = c(0.720291294, 0.589244505), WT_Tissue_5T_2 = c(0.362131744, 0.475251717), WT_Tissue_5T_3 = c(0.549486925, 0.618177919), mut1_Tissue_0T_1 = c(1.464415756, 1.130533457), mut1_Tissue_0T_2 = c(1.01489573, 1.114915728), mut1_Tissue_0T_3 = c(1.171797418, 1.399956009), mut1_Tissue_1T_1 = c(0.927507448, 1.231911575), mut1_Tissue_1T_2 = c(1.089705396, 1.256782289 ), mut1_Tissue_1T_3 = c(0.993048659, 0.999044465), mut1_Tissue_3T_1 = c(1.000993049, 1.103486794), mut1_Tissue_3T_2 = c(1.062562066, 0.883617224 ), mut1_Tissue_3T_3 = c(1.037404833, 0.851875438), mut1_Tissue_5T_1 = c(0.730883813, 0.437440083), mut1_Tissue_5T_2 = c(0.480635551, 0.298762126 ), mut1_Tissue_5T_3 = c(0.85468388, 0.614923997)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list( cols = list(Gene = structure(list(), class = c("collector_character", "collector")), WT_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))
我想进行Tukey测试,绘制每个基因反应与时间的条形图;用多重比较字母填写两种基因型

语法

df1 <- df %>% 
         gather(var, response, WT_Tissue_0T_1:mut1_Tissue_5T_3) %>% 
         separate(var, c("Genotype", "Tissue", "Time"), sep = "_") %>% 
         arrange(desc(Gene))
df2 <- df1 %>% 
         group_by(`Gene`,Genotype,Tissue,Time) %>% 
         mutate(Response=mean(response),n=n(),se=sd(response)/sqrt(n))
我的最终目标是绘制每个基因,如下图所示,并表示意义字母

df2$genotype <- factor(df2$genotype, levels = c("WT","mut1")) colours <- c('#336600','#ffcc00')library(ggplot2)ggplot(df2,aes( x=Time, y=Response, fill=Genotype))+geom_bar(stat='identity', position='dodge')+scale_fill_manual(values=colours)+geom_errorbar(aes(ymin=average_measure-se, ymax=average_measure+se)+facet_wrap(~`Gene`)+labs(x='Time', y='Response')

如果可能的话,我将非常感谢您的帮助。

您的代码存在许多问题。我甚至可以说,这篇文章并不适合StackOverflow,因为这里的问题是多种多样的,除了这些特定的bug和语法问题之外,还不能真正扩展。但我会发布一些建议作为开始的答案-希望它们能帮助你

一,。L意味着: 此函数需要类似于lm的拟合模型或ref.grid对象。但是,您正在向它传递一个数据帧,而没有计算最小二乘平均值所需的任何回归特性。思考:当你要求将基因作为自变量进行两两比较时,lsmeans如何知道因变量应该是什么?有关更多详细信息,请查看

根据你的数据,我想说你可能想要运行一个多层回归,使用类似lme4的软件包,将基因和基因型以及时间作为嵌套的分组水平

但为了演示,我将使用lm保持简单。将拟合的回归对象传递给lsmeans将按预期工作:

fit <- lm(Response ~ Gene + Genotype + Time, data=df2)

lsmeans(fit, pairwise ~ Gene)[[2]] 

# Output
contrast                        estimate        SE df t.ratio p.value
AGI4120.1_UBQ - AGI570.1_Acin -0.0515123 0.0299492 42   -1.72  0.0928

Results are averaged over the levels of: Genotype, Time 
最后,请注意,在df1中使用separate将抛出一个警告,尽管不是错误,因为sep=\上的拆分会创建一个额外的值。您可以通过添加一个级别来捕获最终值(似乎是时间索引),从而避免造成混淆:

separate(var, c("Genotype", "Tissue", "Time", "Index"), sep = "_")

谢谢你,安德鲁。非常感谢你的帮助。对不起,我忘了在问题中添加我的颜色。我分别写的。i、 e.df2$基因型你所说的重要字母是什么意思?你能举一个你想制作的图形的例子吗?我想为条形图添加字母,我看到这个软件包可以做到这一点,多个比较字母,另一个例子不知道如何使用它。@Oncidium FYI,lsmeans包现在emmeans可以提供字母来显示统计上不同的组,也就是使用cld功能的紧凑字母显示。请参阅中使用的。你好,我已经编辑了问题。很抱歉误导了你。你给定的lsmeansfit,两两~基因输出不是我期望的。我想看看你的时间。而且我真的需要multcomp和multcompView的帮助。我自己应付不了。再次感谢。!fit1
fit <- lm(Response ~ Gene + Genotype + Time, data=df2)

lsmeans(fit, pairwise ~ Gene)[[2]] 

# Output
contrast                        estimate        SE df t.ratio p.value
AGI4120.1_UBQ - AGI570.1_Acin -0.0515123 0.0299492 42   -1.72  0.0928

Results are averaged over the levels of: Genotype, Time 
ggplot(df1,aes( x=Time, y=response, fill=Genotype))+ 
  geom_bar(stat='summary', fun.y='mean', position=position_dodge(0.9))+
    stat_summary(fun.data = mean_se, geom = "errorbar", 
                 color="gray60", width=.1, position=position_dodge(0.9)) +
  scale_fill_manual(values=c("steelblue","orange"))+
  facet_wrap(~`Gene`)+ 
  labs(x='Time', y='Response')
separate(var, c("Genotype", "Tissue", "Time", "Index"), sep = "_")