How do I correctly change strings to month abbreviations in R?
I have a data frame whose months are labeled M1-M12 instead of January through December. I am trying to convert the M values to month abbreviations, but I can't work out how. Here is the dput of the original data frame:
mapoc_temp = structure(list(Longitude = c(-43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961), Latitude = c(59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291), Temp = c(-1.1657087802887,
-1.70908033847809, -1.70908033847809, -1.64846479892731, -1.50903105735779,
-1.50903105735779, -1.29481840133667, -0.819319725036621, -0.819319725036621,
0.937921285629272, -0.033661849796772, -0.033661849796772, 3.09912943840027,
3.3768904209137, 3.3768904209137, 5.44990491867065, 5.90848398208618,
5.90848398208618, 8.87255096435547, 7.57381582260132, 7.57381582260132,
9.52607250213623, 9.41888046264648, 9.41888046264648, 7.80030059814453,
7.23698377609253, 7.23698377609253, 3.53716945648193, 4.55290651321411,
4.55290651321411, 0.885161995887756, 1.48482501506805, 1.48482501506805,
-0.0936287492513657, 0.650709450244904, 0.650709450244904), month = c("M1",
"M1", "M1", "M2", "M2", "M2", "M3", "M3", "M3", "M4", "M4", "M4",
"M5", "M5", "M5", "M6", "M6", "M6", "M7", "M7", "M7", "M8", "M8",
"M8", "M9", "M9", "M9", "M10", "M10", "M10", "M11", "M11", "M11",
"M12", "M12", "M12"), year = c(2016, 2017, 2018, 2016, 2017,
2018, 2016, 2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018, 2016,
2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018,
2016, 2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018)), row.names = c(NA,
-36L), class = "data.frame")
I tried to change the M strings to months with the following code:
#Rename my months so they are abbreviated and not M1:M12
mapoc_temp$month = c(M1 = "Jan", M2 = "Feb", M3 = "Mar",
M4 = "Apr", M5 = "May", M6 = "Jun",
M7 = "Jul", M8 = "Aug", M9 = "Sep",
M10 = "Oct", M11 = "Nov", M12 = "Dec")
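However, when I run that code, each row ends up with a month that does not match its original value, as you can see in the dput of the new data frame: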
mapoc_temp = structure(list(Longitude = c(-43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961, -43.5411605834961, -43.5411605834961,
-43.5411605834961, -43.5411605834961), Latitude = c(59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291, 59.950626373291,
59.950626373291, 59.950626373291, 59.950626373291), Temp = c(-1.1657087802887,
-1.70908033847809, -1.70908033847809, -1.64846479892731, -1.50903105735779,
-1.50903105735779, -1.29481840133667, -0.819319725036621, -0.819319725036621,
0.937921285629272, -0.033661849796772, -0.033661849796772, 3.09912943840027,
3.3768904209137, 3.3768904209137, 5.44990491867065, 5.90848398208618,
5.90848398208618, 8.87255096435547, 7.57381582260132, 7.57381582260132,
9.52607250213623, 9.41888046264648, 9.41888046264648, 7.80030059814453,
7.23698377609253, 7.23698377609253, 3.53716945648193, 4.55290651321411,
4.55290651321411, 0.885161995887756, 1.48482501506805, 1.48482501506805,
-0.0936287492513657, 0.650709450244904, 0.650709450244904), month = c("Jan",
"Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct",
"Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul",
"Aug", "Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr",
"May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), year = c(2016,
2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018,
2016, 2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018, 2016, 2017,
2018, 2016, 2017, 2018, 2016, 2017, 2018, 2016, 2017, 2018, 2016,
2017, 2018)), row.names = c(NA, -36L), class = "data.frame")
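As you can see, instead of mapping M1 to Jan, M2 to Feb, M3 to Mar and so on, it filled in the month abbreviations in sequence, regardless of the original value. Does anyone know how to fix this?

Using dplyr, you can try the following: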
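install.packages("dplyr")  # only needed if dplyr is not installed yet
library(dplyr)
# Strip the leading "M", convert to a number, and use it to index month.abb
mapoc_temp <- mapoc_temp %>%
  mutate(month_new = month.abb[as.numeric(gsub("M", "", month))])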
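If you don't want the result in a new variable, just change month_new to month. I used a separate column here only for display purposes, to show that the M values are converted correctly.

Overwriting month with a vector is the right idea, and @Annet's suggestion is correct. But here is why your code did not have the intended effect. The code mapoc_temp$month = c(M1 = "Jan", M2 = "Feb", ...) literally tells R to replace the month column with the specific sequence Jan, Feb, ..., discarding whatever was there before. Because R uses vector recycling, and the column being replaced has more than 12 values, the 12-month sequence repeats as many times as needed to fill the entire column you are replacing.

Comments:

@bschneidr Changing month_new to month is something I can easily add to the text. However, the code should not create vectors in the environment, which is what Kristen Cyr says is happening.

Sorry if I was unclear, @Annet. When I mentioned your code in my earlier comment, I was addressing Kristen. There is no reason for your code to create new vectors in the environment. I believe something odd is going on in Kristen's workspace, which may need to be cleaned up with a fresh R session or rm(list = ls()).

I have dplyr. I did not get an error message. Instead of changing the column, it gives me values in my environment where M1 = Jan, and so on.

@KristenCyr, nothing in that code would put any other variables in the global environment. If you see M1 etc. there, could something else have put them there earlier?

@Annet, please be very careful when recommending that command... it is irreversible, and new users may not realize until it is too late that all of their work has to be redone. That may be trivial for some, but it could be catastrophic for many. format c: anyone?

@r2evan Fair enough. I'll remove it and keep that in mind in the future.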
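The recycling behaviour described above can be reproduced on a small example. The following is only a sketch, not code from the thread: the data frame demo and the names lookup, overwritten and month_abb are hypothetical stand-ins, and it assumes the month column contains nothing but the codes M1 to M12. It first repeats the mistake from the question, then shows a base R fix that looks each code up by name instead of assigning the 12 abbreviations in order.

```r
# Hypothetical stand-in for the month column of mapoc_temp (36 rows, M1..M12)
demo <- data.frame(month = rep(paste0("M", 1:12), each = 3),
                   stringsAsFactors = FALSE)

# Named lookup vector: names are the M codes, values are the abbreviations.
# month.abb is R's built-in constant c("Jan", "Feb", ..., "Dec").
lookup <- setNames(month.abb, paste0("M", 1:12))

# The mistake from the question: assigning the 12 values directly.
# R recycles them in order to fill all 36 rows and ignores the old contents.
overwritten <- demo
overwritten$month <- unname(lookup)
head(overwritten$month, 4)   # "Jan" "Feb" "Mar" "Apr"

# The fix: index the lookup vector by the existing codes, so each row
# gets the abbreviation that belongs to its own M value.
demo$month_abb <- unname(lookup[demo$month])
head(demo$month_abb, 4)      # "Jan" "Jan" "Jan" "Feb"
```

Indexing by name relies on month being a character column. If it were a factor, it would need as.character() first, because a factor indexes by its integer codes rather than by its labels.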
Output of the dplyr code above (first rows):
Longitude Latitude Temp month year month_new
1 -43.54116 59.95063 -1.1657088 M1 2016 Jan
2 -43.54116 59.95063 -1.7090803 M1 2017 Jan
3 -43.54116 59.95063 -1.7090803 M1 2018 Jan
4 -43.54116 59.95063 -1.6484648 M2 2016 Feb
5 -43.54116 59.95063 -1.5090311 M2 2017 Feb
6 -43.54116 59.95063 -1.5090311 M2 2018 Feb
7 -43.54116 59.95063 -1.2948184 M3 2016 Mar
8 -43.54116 59.95063 -0.8193197 M3 2017 Mar
9 -43.54116 59.95063 -0.8193197 M3 2018 Mar
10 -43.54116 59.95063 0.9379213 M4 2016 Apr