R: Why does ggplot2 ignore the factor level order when plotting in this script?


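As far as I know, the order of items in a legend is best controlled by setting the order of the levels of the relevant factor. However, when I set the order of the factor levels, the resulting plot seems to ignore them (see the code below). Judging from other questions, subsetting the data frame may be part of the cause. I am plotting the positions of features on a schematic of protein sequences, starting from one large table that contains many different types of feature. This means I cannot avoid subsetting the data, because I need to draw different features in different ways.

So my questions are:

1) How can I control the order of the items in the legend in this situation?
2) Ideally, I would like a separate legend for each geom_point layer, so that I have one titled "Motifs" and another titled "PTMs". Is this possible?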

library(tidyverse)

df <- data.frame(
  type = as.factor(c("Chain", "PTM", "PTM", "Motif", "Motif", "PTM", "Motif", "Chain", "PTM", "PTM", "Motif", "Motif")),
  description = as.factor(c("seq", "methyl", "methyl", "RXL", "RXL", "amine", "CXXC", "seq", "amine", "methyl", "CXXC", "RXL")),
  begin = c(1, 20, 75, 150, 67, 289, 100, 1, 124, 89, 73, 6),
  order = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  length = c(300, 1, 1, 1, 1, 1, 1, 350, 1, 1, 1, 1)
)

plot_start <- -100
plot_end <- 500

dfplot <- ggplot() +
  xlim(plot_start, plot_end) +
  scale_y_continuous(expand = c(0,0), limits =c(0, 2.5))

# white background
dfplot <- dfplot + theme_bw() +  
  theme(panel.grid.minor=element_blank(),
        panel.grid.major=element_blank()) +
  theme(axis.ticks = element_blank(),
        axis.text.y = element_blank()) +
  theme(panel.border = element_blank())

#plot chains
dfplot <- dfplot + geom_rect(data= df[df$type == "Chain",],
                             mapping=aes(xmin=begin,
                                         xmax=length,
                                         ymin=order-0.2,
                                         ymax=order+0.2),
                             colour = "blue",
                             fill = "#C4D9E9")

#set desired order of factor levels
df$description<-factor(df$description, levels = c("amine", "methyl", "RXL", "seq", "CXXC"))

#plot motif positions

dfplot <- dfplot + geom_point(data = filter(df, type == "Motif"),
                              aes(begin, order, shape = description, color = description),
                              size = 3)

#plot modification positions

dfplot <- dfplot + geom_point(data = filter(df, type == "PTM"),
                              aes(begin, (order + 0.25), shape = description, color = description), 
                              size = 3) 

dfplot

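I'd suggest a slightly different take on your plot:

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))

# df is the data frame from the question
df %>%
  ggplot(., aes(x = begin, y = type, color = description)) +
  geom_point()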

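For reasons I don't fully understand, the factor order is ignored when you call geom_point twice. If you reshape the data so that geom_point only has to be called once, the problem goes away: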

library(tidyverse)

df <- data.frame(
  type = as.factor(c("Chain", "PTM", "PTM", "Motif", "Motif", "PTM", "Motif", "Chain", "PTM", "PTM", "Motif", "Motif")),
  description = as.factor(c("seq", "methyl", "methyl", "RXL", "RXL", "amine", "CXXC", "seq", "amine", "methyl", "CXXC", "RXL")),
  begin = c(1, 20, 75, 150, 67, 289, 100, 1, 124, 89, 73, 6),
  order = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  length = c(300, 1, 1, 1, 1, 1, 1, 350, 1, 1, 1, 1)
)

#offset the PTM rows vertically and set desired order of factor levels
df <- df %>% mutate(
  order = if_else(type == "PTM", true = order + 0.25, false = order),
  description = factor(description, levels = c("amine", "methyl", "RXL", "seq", "CXXC")))

plot_start <- -100
plot_end <- 500

dfplot <- ggplot() +
  xlim(plot_start, plot_end) +
  scale_y_continuous(expand = c(0,0), limits =c(0, 2.5))

# white background
dfplot <- dfplot + theme_bw() +  
  theme(panel.grid.minor=element_blank(),
        panel.grid.major=element_blank()) +
  theme(axis.ticks = element_blank(),
        axis.text.y = element_blank()) +
  theme(panel.border = element_blank())

#plot chains
dfplot <- dfplot + geom_rect(data= df[df$type == "Chain",],
                             mapping=aes(xmin=begin,
                                         xmax=length,
                                         ymin=order-0.2,
                                         ymax=order+0.2),
                             colour = "blue",
                             fill = "#C4D9E9")



#plot motif and modification positions in a single layer

dfplot <- dfplot + geom_point(data = filter(df, type %in% c("PTM", "Motif")),
                              aes(begin, order, shape = description, color = description), 
                              size = 3)
dfplot
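
If the two geom_point layers have to stay separate (for example because more layers will be added later), one option that may work is to set the order on the scales themselves rather than relying on the factor levels. The sketch below is an assumption about how that could look, not part of the answer above: it passes the same `limits` vector to `scale_colour_discrete()` and `scale_shape_discrete()`, which fixes the order of the legend entries and keeps the shape and colour legends merged. The name `legend_order` is made up for this example, and "seq" is left out on the assumption that chains are only drawn as bars.

```r
library(ggplot2)

# Same toy data as in the answer above.
df <- data.frame(
  type = c("Chain", "PTM", "PTM", "Motif", "Motif", "PTM",
           "Motif", "Chain", "PTM", "PTM", "Motif", "Motif"),
  description = c("seq", "methyl", "methyl", "RXL", "RXL", "amine",
                  "CXXC", "seq", "amine", "methyl", "CXXC", "RXL"),
  begin = c(1, 20, 75, 150, 67, 289, 100, 1, 124, 89, 73, 6),
  order = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  length = c(300, 1, 1, 1, 1, 1, 1, 350, 1, 1, 1, 1)
)

# Hypothetical helper vector: the legend order we want to see.
legend_order <- c("amine", "methyl", "RXL", "CXXC")

p <- ggplot() +
  # chains as bars
  geom_rect(data = df[df$type == "Chain", ],
            aes(xmin = begin, xmax = length,
                ymin = order - 0.2, ymax = order + 0.2),
            colour = "blue", fill = "#C4D9E9") +
  # motifs and PTMs stay in two separate layers
  geom_point(data = df[df$type == "Motif", ],
             aes(begin, order, shape = description, colour = description),
             size = 3) +
  geom_point(data = df[df$type == "PTM", ],
             aes(begin, order + 0.25, shape = description, colour = description),
             size = 3) +
  # identical limits on both scales fix the order and keep one merged legend
  scale_colour_discrete(limits = legend_order) +
  scale_shape_discrete(limits = legend_order)

p
```

Because the order lives on the scales, it does not matter where in the script the factor levels are set, or whether `description` is a factor at all.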

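Comments:

It looks like you are editing the factor levels after creating the initial plot. Try moving that line so it runs before the plot is built?

@JackBrookes Thanks for the idea. Unfortunately, moving that line to directly after the data frame is created makes no difference at all.

Thanks for your answer. You're right: with this code the legend order does adjust correctly when you change the order of the factor levels. Unfortunately, I can't find a way to carry this setup over to the final chart I need. It has to show the lengths of the proteins (the blue bars in my original plot; there are actually about 30 of them), and the final chart will also have more geom layers showing the positions of different protein domains.

Many thanks, this is exactly what I wanted.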
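
On the second part of the question (a separate legend per geom_point layer), here is a minimal sketch of one way this might be done with ggplot2 alone. It is an assumption, not something taken from the answers above: the motif layer maps `description` to `fill` (using shape 21, which has a fill), while the PTM layer maps it to `colour`, so each layer gets its own scale and its own legend title. The data frame `feat` and its values are made up for illustration. The ggnewscale package is another possible route if both layers need to use the same aesthetic.

```r
library(ggplot2)

# Toy feature table; values are illustrative only.
feat <- data.frame(
  type = c("PTM", "PTM", "Motif", "Motif", "PTM", "Motif"),
  description = c("methyl", "methyl", "RXL", "RXL", "amine", "CXXC"),
  begin = c(20, 75, 150, 67, 289, 100),
  order = 1
)

p2 <- ggplot() +
  # Motifs: shape 21 has a fill, so this legend is driven by `fill`.
  geom_point(data = feat[feat$type == "Motif", ],
             aes(begin, order, fill = description),
             shape = 21, size = 3) +
  # PTMs: driven by `colour`, which keeps this legend apart from the motifs.
  geom_point(data = feat[feat$type == "PTM", ],
             aes(begin, order + 0.25, colour = description),
             shape = 17, size = 3) +
  # `name` sets each legend title, `limits` sets the order of its entries.
  scale_fill_discrete(name = "Motifs", limits = c("RXL", "CXXC")) +
  scale_colour_discrete(name = "PTMs", limits = c("amine", "methyl"))

p2
```

The trade-off is that motifs and PTMs can no longer share the colour aesthetic, so this only fits when a fixed shape per layer is acceptable.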