Out-of-order dates in R ggplot2


I usually know how to order dates in ggplot, but this data is a bit different, and I'm hoping someone can clarify things for me.

Consider:

ggplot(tmp3)+
geom_boxplot(aes(x=simdte,y=r2))+
facet_wrap(~simyr, scales='free_x')+
theme(axis.text.x=element_text(angle=45,hjust=1))
The dates are ordered alphanumerically, but now I want to format the x-axis labels, so I tried:

ggplot(tmp3)+
geom_boxplot(aes(x=reorder(strftime(strptime(simdte,'%Y%m%d'),'%b-%d'),as.numeric(simdte)),y=r2))+
facet_wrap(~simyr, scales='free_x')+
theme(axis.text.x=element_text(angle=45,hjust=1))
But notice that all of the dates are in order except June 8, 2015.
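
A possible reason for the stray date, sketched on five values copied from simdte: reorder() orders the levels by a per-level summary of its second argument (the mean by default), and the '%b-%d' label drops the year, so a label shared by two years gets an averaged sort key. This is an illustration under that assumption, not a confirmed diagnosis of the plot above.

```r
# Five values copied from the simdte column; June 8 occurs in both 2013 and 2015
simdte <- c("20130601", "20130608", "20140323", "20150501", "20150608")

# Month-day labels, built the same way as in the plotting call (the year is dropped)
labs <- strftime(strptime(simdte, "%Y%m%d"), "%b-%d")

# reorder() sorts the levels by mean(as.numeric(simdte)) within each label
f <- reorder(labs, as.numeric(simdte))

# Sort key per label: the shared June 8 label averages to 20140608
tapply(as.numeric(simdte), labs, mean)

# Level positions of the May 1 and June 8 labels
pos_may01 <- which(levels(f) == labs[4])
pos_jun08 <- which(levels(f) == labs[5])
pos_jun08 < pos_may01  # TRUE: June 8 is placed before May 1
```

If this is what happens in the full data, the 2015 facet would show June 8 ahead of the later spring dates, because its sort key is pulled toward 2013.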

I also tried

tmp3 = tmp3 %>%
  mutate(plotsimdte = factor(
    strftime(strptime(simdte, '%Y%m%d'), '%b-%d'),
    levels = strftime(strptime(unique(simdte), '%Y%m%d'), '%b-%d')[order(unique(simdte))]))
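and plotted with x=plotsimdte, but it made no difference. When I created this factor I got a warning about duplicated levels, which is confusing because I only used unique values.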
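
The warning is consistent with the formatted labels, rather than the dates themselves, being duplicated. A minimal check on a few values copied from simdte (the variable name labs is just for illustration):

```r
# Unique full dates copied from the simdte column of the data below
simdte <- c("20130608", "20140527", "20150608", "20160527", "20160607")

# Formatting as month-day drops the year, so distinct dates can share a label
labs <- strftime(strptime(simdte, "%Y%m%d"), "%b-%d")

anyDuplicated(simdte)  # 0: the full dates really are unique
duplicated(labs)       # TRUE marks a label already seen in an earlier year
```

So calling unique() on simdte does not guarantee unique levels once the year has been formatted away.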

Finally, I tried

ggplot(tmp3)+
geom_boxplot(aes(x=as.Date(simdte,'%Y%m%d'),y=r2, group=simdte))+
scale_x_date(date_labels ='%b-%d')+
facet_wrap(~simyr, scales='free_x')+
theme(axis.text.x=element_text(angle=45,hjust=1))
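but I want to keep the dates discrete, because they matter as identifiers rather than as points spread along a time axis.

Any suggestions would be much appreciated. Thanks!

A small subset of my data:

Edit: updated the dput output using as.data.frame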

> dput(as.data.frame(tmp3))
structure(list(mdldte = c("20130525", "20140407", "20140413", 
"20150608", "20130525", "20150608", "20140420", "20130429", "20130608", 
"20130608", "20140323", "20140413", "20150325", "20150608", "20140511", 
"20130601", "20150608", "20130608", "20140420", "20150305", "20150415", 
"20130608", "20140531", "20150608", "20140531", "20150608", "20130403", 
"20130503", "20150415", "20140407", "20150608", "20140323", "20130525", 
"20140420", "20130403", "20130403", "20130608", "20150501", "20150608", 
"20130429", "20160607", "20140527", "20140420", "20140531", "20140502", 
"20150325", "20140428", "20160620", "20160620", "20130403", "20160527", 
"20150415", "20140413", "20160607", "20140413", "20150608", "20160613", 
"20150608", "20140407", "20150501", "20140323", "20160607", "20140531", 
"20150305", "20150409", "20140428", "20130503", "20130525", "20140428", 
"20140407", "20130503", "20130525", "20130403", "20150305", "20150217", 
"20150501", "20130608", "20150305", "20150217", "20130608", "20140511", 
"20160527", "20140502", "20150415"), simdte = c("20130403", "20130403", 
"20130403", "20130429", "20130429", "20130429", "20130503", "20130503", 
"20130503", "20130525", "20130525", "20130525", "20130601", "20130601", 
"20130601", "20130608", "20130608", "20130608", "20140323", "20140323", 
"20140323", "20140407", "20140407", "20140407", "20140413", "20140413", 
"20140413", "20140420", "20140420", "20140420", "20140428", "20140428", 
"20140428", "20140502", "20140502", "20140502", "20140511", "20140511", 
"20140511", "20140517", "20140517", "20140517", "20140527", "20140527", 
"20140527", "20140531", "20140531", "20140531", "20150217", "20150217", 
"20150217", "20150305", "20150305", "20150305", "20150325", "20150325", 
"20150325", "20150409", "20150409", "20150409", "20150415", "20150415", 
"20150415", "20150427", "20150427", "20150427", "20150501", "20150501", 
"20150501", "20150608", "20150608", "20150608", "20160527", "20160527", 
"20160527", "20160607", "20160607", "20160607", "20160613", "20160613", 
"20160613", "20160620", "20160620", "20160620"), r2 = c(0.862283742909527, 
0.813142444594872, 0.700946018367384, 0.474388980021752, 0.826648311592866, 
0.794283339648572, 0.79687922855493, 0.808984929407683, 0.781751354268809, 
0.535951689307516, 0.68524477567256, 0.716321630808227, 0.373141090466726, 
0.723850452026657, 0.408972539926536, 0.29346057127035, 0.319261073048776, 
0.319535158994707, 0.872351278607699, 0.871652058666136, 0.509872096326808, 
0.398605136979609, 0.420745998256184, 0.596082529689281, 0.793035779455997, 
0.661212720614186, 0.736581215438551, 0.89337362408349, 0.900773593767951, 
0.916946297262156, 0.700865150846107, 0.839501961957186, 0.863684601286204, 
0.819367869015135, 0.765192251153536, 0.590744027549224, 0.720092636591613, 
0.732237645665246, 0.701898569000057, 0.505310296599101, 0.756344530560126, 
0.522404606955389, 0.631453896947287, 0.732767696833121, 0.669168785479052, 
0.340080390313005, 0.397681954572616, 0.708286400101956, 0.551718623201008, 
0.62217661847446, 0.160935876745664, 0.79407487647674, 0.729924604817696, 
0.716024523586796, 0.526169199415047, 0.702098331814224, 0.748626603557805, 
0.432690018453805, 0.710646849035047, 0.526049259906931, 0.811336120223548, 
0.679819505156441, 0.591396577448379, 0.656686513355743, 0.698313842140892, 
0.718604690738853, 0.768070041705958, 0.453336001102217, 0.544446423520199, 
0.583336140040845, 0.172961846412558, 0.298155303932666, 0.731010397306203, 
0.582517045429492, 0.521708072638302, 0.610885761462162, 0.543494236386099, 
0.630580819311437, 0.642714888852003, 0.736302041771047, 0.736086951074143, 
0.444437396681972, 0.445336147280364, 0.43829690520584), simyr = c("2013", 
"2013", "2013", "2013", "2013", "2013", "2013", "2013", "2013", 
"2013", "2013", "2013", "2013", "2013", "2013", "2013", "2013", 
"2013", "2014", "2014", "2014", "2014", "2014", "2014", "2014", 
"2014", "2014", "2014", "2014", "2014", "2014", "2014", "2014", 
"2014", "2014", "2014", "2014", "2014", "2014", "2014", "2014", 
"2014", "2014", "2014", "2014", "2014", "2014", "2014", "2015", 
"2015", "2015", "2015", "2015", "2015", "2015", "2015", "2015", 
"2015", "2015", "2015", "2015", "2015", "2015", "2015", "2015", 
"2015", "2015", "2015", "2015", "2015", "2015", "2015", "2016", 
"2016", "2016", "2016", "2016", "2016", "2016", "2016", "2016", 
"2016", "2016", "2016"), mdlpreds = structure(c(4L, 2L, 3L, 1L, 
3L, 2L, 4L, 2L, 3L, 3L, 4L, 2L, 1L, 2L, 3L, 1L, 3L, 3L, 4L, 4L, 
1L, 1L, 1L, 3L, 2L, 3L, 3L, 4L, 4L, 4L, 2L, 3L, 4L, 2L, 4L, 1L, 
3L, 3L, 3L, 3L, 2L, 1L, 4L, 2L, 4L, 3L, 1L, 4L, 4L, 4L, 3L, 4L, 
2L, 2L, 1L, 3L, 3L, 1L, 3L, 2L, 2L, 3L, 3L, 4L, 4L, 3L, 2L, 1L, 
3L, 2L, 3L, 1L, 2L, 1L, 3L, 1L, 1L, 3L, 2L, 2L, 2L, 1L, 1L, 1L
), .Label = c("phv", "phvfsca", "phvaso", "phvasofsca"), class = "factor")), class = "data.frame", .Names = c("mdldte", 
"simdte", "r2", "simyr", "mdlpreds"), row.names = c(NA, -84L))

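The problem is that your dates are currently being interpreted as character data, and R is doing some shuffling with them. What you really want is to treat them as true date objects and then let ggplot's higher-level functions handle the ordering and labeling accordingly.

Convert your date data to the Date type: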

tmp3$newdate <- as.Date(strptime(tmp3$simdte, '%Y%m%d'))

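In the future, it's useful to know about the str function, which quickly tells you the type of each column in your data (the same information is available from the Environment pane in RStudio). Its output for this data is the str(tmp3) listing near the end of this post.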

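As you can see, the original simdte column is stored as character data. R (and ggplot) treat each distinct value of such data as a separate level or category. Date data, by contrast, is essentially numeric. R treats it as continuous, which makes it easier to place dates precisely on a timeline or axis. It also makes it easier to separate the underlying data from the formatting of any printed labels.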
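
As a small sketch of that difference, using three values copied from simdte (this uses base R's as.Date with an explicit format, which is assumed here to be equivalent to the strptime conversion above for these date-only strings):

```r
# Three values copied from the simdte column
simdte <- c("20130403", "20130429", "20150608")

# Character data: each distinct string is just a category
class(simdte)

# Date data: numeric underneath, so arithmetic and continuous axes work
newdate <- as.Date(simdte, format = "%Y%m%d")
class(newdate)
diff(newdate)  # gaps between consecutive dates, in days

# Label formatting is applied separately from the stored values
format(newdate, "%b-%d")
```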

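Update: using dates as categories and plotting the boxplots in date order

If instead we want each date to act as a category (rather than having the dates behave as numeric distances), the solution is actually simpler. Strange things happen when you try to alter the number of values being fed into a ggplot aesthetic, and I suspect that is the root cause of your mis-ordering problem.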

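The key is to rely on ggplot's built-in labeling features. Again, the main call to ggplot takes the raw data as input, and scale_x_discrete handles the creation of the pretty labels: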

plot.new <- ggplot(tmp3)+
    geom_boxplot(aes(x=simdte,y=r2))+
    facet_wrap(~simyr, scales='free_x')+
    scale_x_discrete(labels = function(x) strftime(strptime(x, '%Y%m%d'), '%b-%d'))+
    theme(axis.text.x=element_text(angle=45,hjust=1))
print(plot.new)
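
To see why this keeps the order intact, the labelling function can be tried on its own, outside ggplot. In the sketch below, fmt_label is a hypothetical name for the function passed to labels, and the three break values are copied from simdte:

```r
# Hypothetical standalone copy of the function passed to scale_x_discrete(labels = ...)
fmt_label <- function(x) strftime(strptime(x, "%Y%m%d"), "%b-%d")

# Raw YYYYMMDD strings sort alphanumerically in chronological order
breaks <- sort(c("20150608", "20150217", "20150501"))

# The labels are derived from the already-ordered raw values
data.frame(break_value = breaks, label = fmt_label(breaks))
```

The ordering is decided by the raw simdte strings, and the month-day text is only applied afterwards as display labels, so labels that repeat across years cannot merge or reorder categories.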

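Your dput output isn't reproducible. Running that code produces an error: object 'simdte' not found.

I can't reproduce the data frame either. Is there a formatting problem with the copy and paste?

I've added a download link to an rds file.

The link to the rds file produces a 404 error.

dput doesn't work well with dplyr's grouped df objects. Use dput(as.data.frame(your_data)).

@jdobres apologies for the failed attempts. The dput is updated in the post, and it works on my computer.

I had already tried this answer in my question. I've updated the question with a better example. I'm actually plotting boxplots, so I was using group=simdte. Also, time isn't that important in this case. The dates are basically just identifiers, so I want them as discrete x labels.

Updated to address your situation!

Thanks, that worked. I'm still wondering why reorder almost worked.

It's probably an odd interaction between reorder and the row order in the dataset. That's why fragile data manipulations of this kind are best avoided.
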
str(tmp3)

'data.frame':   28 obs. of  7 variables:
 $ mdldte  : chr  "20150305" "20140531" "20160620" "20150305" ...
 $ simdte  : chr  "20130403" "20130429" "20130503" "20130525" ...
 $ r2      : num  0.542 0.485 0.54 0.4 0.594 ...
 $ simyr   : chr  "2013" "2013" "2013" "2013" ...
 $ mdlyr   : chr  "2015" "2014" "2016" "2015" ...
 $ mdlpreds: Factor w/ 4 levels "phv","phvfsca",..: 1 1 1 1 4 1 4 2 3 4 ...
 $ newdate : Date, format: "2013-04-03" "2013-04-29" "2013-05-03" "2013-05-25" ...